Re: Solution to Solr Master-Slave single point of failure.
Hi Alessandro. We are considering many options, including SolrCloud. Thanks. Arcadius.

On 17 June 2014 14:54, Alessandro Benedetti benedetti.ale...@gmail.com wrote:
> Hello Arcadius, why not simply move to SolrCloud, which already addresses fault tolerance and high availability? Simply imagine a configuration of 1 shard with a replication factor of 3. You then have an even better scenario than 2 masters and 1 slave. Cheers
>
> 2014-06-17 14:43 GMT+01:00 Arcadius Ahouansou arcad...@menelic.com:
>> Hello. SolrCloud has been out for a while now. However, there are still many installations running Solr 4 in the traditional master-slave setup. Currently, the Solr master is the single point of failure of most master-slave deployments. This could easily be addressed by having:
>> a) 2 independent Solr masters running side by side and being fed simultaneously;
>> b) all slaves configured with masterUrl=masterUrl01,masterUrl02 (needs to be implemented);
>> c) masterUrl01 used by all slaves by default;
>> d) when the slaves catch an exception (like NoRouteToHostException or ConnectionTimedOutException etc.), they retry a couple of times before switching to masterUrl02.
>> I suppose you have thought about this issue before, so I would like to know whether there are issues with such a simple solution. This could also help deploy Solr across 2 different data centers. Thank you very much. Arcadius.

--
Benedetti Alessandro
Visiting card: http://about.me/alessandro_benedetti
Tyger, tyger burning bright / In the forests of the night, / What immortal hand or eye / Could frame thy fearful symmetry? (William Blake, Songs of Experience, 1794, England)

--
Arcadius Ahouansou
Menelic Ltd | Information is Power
M: 07908761999
W: www.menelic.com
---
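For reference, the slave-side change proposed in points a-d would amount to something like the following solrconfig.xml sketch. The comma-separated masterUrl list is the *proposed* extension from this thread and is not supported by stock Solr 4, which accepts a single URL; hostnames and core names are made up for the example:

```xml
<!-- Slave replication section of solrconfig.xml (Solr 4 master/slave).
     NOTE: the comma-separated masterUrl below is the extension PROPOSED
     in this thread; stock Solr takes only a single URL here. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- proposed: primary first, failover master second -->
    <str name="masterUrl">http://master01:8983/solr/core1/replication,http://master02:8983/solr/core1/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```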
Re: Solution to Solr Master-Slave single point of failure.
Thanks Shawn and Erick for the valuable information. Just wondering: is there a way to use the REST API to set the masterUrl on the slave (or to change masterUrl without restarting)? I checked http://wiki.apache.org/solr/SolrReplication#HTTP_API without success. Thanks. Arcadius.

On 19 June 2014 02:39, Shawn Heisey s...@elyograg.org wrote:
>> I thought about this situation and I must admit that it's a tricky one. We should offer the option to configure the slaves to switch [...]
>> Yes, I do understand that SolrCloud is the future. However, removing this single point of failure from the traditional master-slave deployment model would not require a lot of effort IMHO, and it would give a huge benefit in terms of choice. The other question is: how many sites are on SolrCloud? How many are still on master-slave?
>
> For my redundant Solr install, I don't use replication and I don't use SolrCloud. At one time, I had a master/slave replication setup running Solr 1.4.1. When 3.1.0 came out, I wanted to upgrade. The problem was that it's not possible to replicate between 1.4.1 and any later release, because the javabin format changed in 3.1.0 and offers no backwards compatibility. What we were forced to do was rewrite the index-updating software so that it would update both copies of the distributed index in parallel. By the time I had this completed, 3.2.0 was out, and that was what we upgraded to.
>
> This offers us capabilities that cannot be matched by a distributed/replicated SolrCloud. We are able to try out a different configuration and/or schema on one copy of our index without affecting the index used by the production applications. We can upgrade or change any component on only one copy of the index. When such a change is made for testing purposes, we can independently rebuild either copy of the index from scratch, and through Solr's ping handler we can instantly switch the active copy of the index from the point of view of the load balancer. A dev version of the application can be pointed at the offline copy of the index.
>
> Because SolrCloud shares the configuration between all replicas, and master/slave replication copies the actual index, treating each copy of the index as an independent entity in these scenarios is not possible.
>
> Thanks, Shawn

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
---
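One partial answer to the question above, worth verifying against your Solr version: the replication handler's fetchindex command can reportedly take a one-off masterUrl parameter, allowing a single pull from a different master without touching solrconfig.xml. It does not persist the change, so subsequent polls still go to the configured master. A sketch with hypothetical hosts:

```
GET http://slave01:8983/solr/core1/replication?command=fetchindex&masterUrl=http://master02:8983/solr/core1/replication
```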
Re: Solution to Solr Master-Slave single point of failure.
Some further observations:

2014-06-19 1:40 GMT+01:00 Arcadius Ahouansou arcad...@menelic.com:
> Hello Erick. On 17 June 2014 16:52, Erick Erickson erickerick...@gmail.com wrote:
>> The sticky parts of that solution (off the top of my head) are assuring that the two masters have all the updates. How do you guarantee that the updates succeed and that the two masters do, indeed, contain the exact same information?
> Let's assume the simple case where the Solr DIH is used to pull data from a central RDBMS into each master every 15 minutes. In that case, both masters should be in sync even if the index versions are different. A monitoring system could be used to periodically check that the doc count is almost the same on both masters.

I don't understand the point of "almost the same". We need the same docs; there is no room here for "almost". To get this, I suggest that master02 actually be a slave of master01. That way we are sure the indexes are aligned.

>> There'd have to be logic to ensure that when the switch was made, the entire index was replicated. How would the slave know which segments to replicate from the master? Especially since the segments would NOT be identical, the slaves would have to replicate the entire index...
> In the event of a switch-over, I would expect the slaves to fetch the whole index from master02. In production, the monitoring system should also alert the support team.
>> What to do when the first master comes back up? Which one should be the one true source?
> We have 2 options here: either stay on master02 until human intervention (a REST API reset or a restart of master02), or switch back to master01 automatically.

Why not put the current master behind a virtual IP? The slaves will not know which master they are fetching from. In case of disaster, we switch master02 (already perfectly aligned as a repeater) in behind the virtual IP in place of master01.

>> The whole question of all the slaves knowing which master to ping is actually pretty ambiguous. What happens if slave1 pings master1 and there's a temporary network glitch, so it switches to master2? Meanwhile, due to timing, slave2 thinks master1 is still online. How to detect/track this?
> I thought about this situation and I must admit that it's a tricky one. We should offer the option to configure the slaves to switch, say, only after N failures (configurable) or after retrying for a configurable period of time.
>> When you start to spin these scenarios, you start needing some kind of cluster state accessible to all slaves, and then you start thinking about ZooKeeper, and you're swiftly back to SolrCloud. The thinking in traditional Solr M/S setups avoids having two masters: if a master dies, you promote one of the slaves to be the new master. The tricky bit here is re-indexing data from before the time the old master died onto the new master. So far that's been good enough for M/S setups, and then SolrCloud came along, so I suspect not much effort would be put into something like what you suggest; the effort should go towards hardening SolrCloud...
> Yes, I do understand that SolrCloud is the future. However, removing this single point of failure from the traditional master-slave deployment model would not require a lot of effort IMHO, and it would give a huge benefit in terms of choice. The other question is: how many sites are on SolrCloud? How many are still on master-slave? Thank you very much. Arcadius.
>> Best, Erick
[...]
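The failover behaviour discussed above (point d, plus the "switch only after N failures" refinement) can be sketched roughly as follows. This is illustrative Python only, not Solr code; the function names and URLs are invented for the example, and fetch_index stands in for whatever issues the actual replication request:

```python
# Illustrative sketch of slave-side master failover: try the primary
# master, give it max_retries attempts on connection errors, then fall
# back to the next master in the list. All names here are hypothetical.

def replicate_with_failover(master_urls, fetch_index, max_retries=3):
    """Try each master in order; a master gets max_retries attempts
    before the slave moves on. Returns the URL that succeeded."""
    last_error = None
    for url in master_urls:
        for _attempt in range(max_retries):
            try:
                fetch_index(url)          # e.g. issue ?command=fetchindex
                return url
            except ConnectionError as e:  # stand-in for NoRouteToHost etc.
                last_error = e
    raise RuntimeError("all masters unreachable") from last_error
```

A real implementation would also need the retry count and back-off period to be configurable per slave, as suggested in the thread.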
Re: Solution to Solr Master-Slave single point of failure.
Hello Erick.

On 17 June 2014 16:52, Erick Erickson erickerick...@gmail.com wrote:
> The sticky parts of that solution (off the top of my head) are assuring that the two masters have all the updates. How do you guarantee that the updates succeed and that the two masters do, indeed, contain the exact same information?

Let's assume the simple case where the Solr DIH is used to pull data from a central RDBMS into each master every 15 minutes. In that case, both masters should be in sync even if the index versions are different. A monitoring system could be used to periodically check that the doc count is almost the same on both masters.

> There'd have to be logic to ensure that when the switch was made, the entire index was replicated. How would the slave know which segments to replicate from the master? Especially since the segments would NOT be identical, the slaves would have to replicate the entire index...

In the event of a switch-over, I would expect the slaves to fetch the whole index from master02. In production, the monitoring system should also alert the support team.

> What to do when the first master comes back up? Which one should be the one true source?

We have 2 options here: either stay on master02 until human intervention (a REST API reset or a restart of master02), or switch back to master01 automatically.

> The whole question of all the slaves knowing which master to ping is actually pretty ambiguous. What happens if slave1 pings master1 and there's a temporary network glitch, so it switches to master2? Meanwhile, due to timing, slave2 thinks master1 is still online. How to detect/track this?

I thought about this situation and I must admit that it's a tricky one. We should offer the option to configure the slaves to switch, say, only after N failures (configurable) or after retrying for a configurable period of time.

> When you start to spin these scenarios, you start needing some kind of cluster state accessible to all slaves, and then you start thinking about ZooKeeper, and you're swiftly back to SolrCloud. The thinking in traditional Solr M/S setups avoids having two masters: if a master dies, you promote one of the slaves to be the new master. The tricky bit here is re-indexing data from before the time the old master died onto the new master. So far that's been good enough for M/S setups, and then SolrCloud came along, so I suspect not much effort would be put into something like what you suggest; the effort should go towards hardening SolrCloud...

Yes, I do understand that SolrCloud is the future. However, removing this single point of failure from the traditional master-slave deployment model would not require a lot of effort IMHO, and it would give a huge benefit in terms of choice. The other question is: how many sites are on SolrCloud? How many are still on master-slave?

Thank you very much. Arcadius.

> Best, Erick
[...]
---
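The periodic doc-count check mentioned above might look like this in outline (illustrative Python; get_doc_count is an invented stand-in for querying each master with something like /solr/core1/select?q=*:*&rows=0 and reading numFound, and the tolerance parameter is likewise an assumption of the sketch):

```python
# Sketch of the suggested monitoring check: compare doc counts across
# masters and flag any drift beyond a tolerance. With tolerance=0 it
# demands identical counts, matching the objection in this thread that
# "almost the same" is not good enough.

def masters_in_sync(master_urls, get_doc_count, tolerance=0):
    """Return (in_sync, counts). get_doc_count(url) should return the
    numFound reported by that master for a match-all query."""
    counts = {url: get_doc_count(url) for url in master_urls}
    values = list(counts.values())
    in_sync = max(values) - min(values) <= tolerance
    return in_sync, counts
```

In production this would run on a schedule and page the support team on a mismatch, as proposed above.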
Re: Solution to Solr Master-Slave single point of failure.
> I thought about this situation and I must admit that it's a tricky one. We should offer the option to configure the slaves to switch [...]
> Yes, I do understand that SolrCloud is the future. However, removing this single point of failure from the traditional master-slave deployment model would not require a lot of effort IMHO, and it would give a huge benefit in terms of choice. The other question is: how many sites are on SolrCloud? How many are still on master-slave?

For my redundant Solr install, I don't use replication and I don't use SolrCloud. At one time, I had a master/slave replication setup running Solr 1.4.1. When 3.1.0 came out, I wanted to upgrade. The problem was that it's not possible to replicate between 1.4.1 and any later release, because the javabin format changed in 3.1.0 and offers no backwards compatibility. What we were forced to do was rewrite the index-updating software so that it would update both copies of the distributed index in parallel. By the time I had this completed, 3.2.0 was out, and that was what we upgraded to.

This offers us capabilities that cannot be matched by a distributed/replicated SolrCloud. We are able to try out a different configuration and/or schema on one copy of our index without affecting the index used by the production applications. We can upgrade or change any component on only one copy of the index. When such a change is made for testing purposes, we can independently rebuild either copy of the index from scratch, and through Solr's ping handler we can instantly switch the active copy of the index from the point of view of the load balancer. A dev version of the application can be pointed at the offline copy of the index.

Because SolrCloud shares the configuration between all replicas, and master/slave replication copies the actual index, treating each copy of the index as an independent entity in these scenarios is not possible.

Thanks, Shawn
---
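The ping-handler trick described above relies on Solr's PingRequestHandler: the load balancer's health check targets /admin/ping, and a copy of the index is taken out of rotation by disabling the handler (e.g. via /admin/ping?action=disable, re-enabled with action=enable). A sketch of the config; the healthcheck file name follows the stock example solrconfig.xml, so verify against your version:

```xml
<!-- solrconfig.xml: health-check endpoint for the load balancer.
     When the healthcheck file is absent (action=disable removes it),
     /admin/ping returns an error and the LB drops this copy. -->
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="q">solrpingquery</str>
  </lst>
  <str name="healthcheckFile">server-enabled.txt</str>
</requestHandler>
```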
Re: Solution to Solr Master-Slave single point of failure.
Hello Arcadius, why not simply move to SolrCloud, which already addresses fault tolerance and high availability? Simply imagine a configuration of 1 shard with a replication factor of 3. You then have an even better scenario than 2 masters and 1 slave. Cheers

2014-06-17 14:43 GMT+01:00 Arcadius Ahouansou arcad...@menelic.com:
> Hello. SolrCloud has been out for a while now. However, there are still many installations running Solr 4 in the traditional master-slave setup. [...]
Re: Solution to Solr Master-Slave single point of failure.
The sticky parts of that solution (off the top of my head) are assuring that the two masters have all the updates. How do you guarantee that the updates succeed and that the two masters do, indeed, contain the exact same information?

There'd have to be logic to ensure that when the switch was made, the entire index was replicated. How would the slave know which segments to replicate from the master? Especially since the segments would NOT be identical, the slaves would have to replicate the entire index...

What to do when the first master comes back up? Which one should be the one true source?

The whole question of all the slaves knowing which master to ping is actually pretty ambiguous. What happens if slave1 pings master1 and there's a temporary network glitch, so it switches to master2? Meanwhile, due to timing, slave2 thinks master1 is still online. How to detect/track this?

When you start to spin these scenarios, you start needing some kind of cluster state accessible to all slaves, and then you start thinking about ZooKeeper, and you're swiftly back to SolrCloud.

The thinking in traditional Solr M/S setups avoids having two masters: if a master dies, you promote one of the slaves to be the new master. The tricky bit here is re-indexing data from before the time the old master died onto the new master. So far that's been good enough for M/S setups, and then SolrCloud came along, so I suspect not much effort would be put into something like what you suggest; the effort should go towards hardening SolrCloud...

Best, Erick

On Tue, Jun 17, 2014 at 6:54 AM, Alessandro Benedetti benedetti.ale...@gmail.com wrote:
> Hello Arcadius, why not simply move to SolrCloud, which already addresses fault tolerance and high availability? [...]