Re: Data Centre recovery/replication, does this seem plausible?
Yeah, reality gets in the way of simple solutions a lot. And making it even more fun, you'd really want to only bring up one node for each shard in the broken DC and let that one be fully synched. Then bring up the replicas in a controlled fashion so you didn't saturate the local network with replications. And then you'd...

But as Shawn says, this is certainly functionality that would be waaay cool, there's just been no time to make it all work; the main folks who've been working in this area all have a mountain of higher-priority stuff to get done first. There's been talk of making SolrCloud rack aware, which could extend into some kind of work in this area, but that's also on the future plate. As you're well aware, it's not a trivial problem!

Hmmm, what you really want here is the ability to say to a recovering cluster "do your initial synch using nodes that the ZK ensemble located at XXX knows about, then switch to your very own ensemble." Something like a remote recovery option. Which is _still_ kind of tricky; I sure hope you have identical sharding schemes.

FWIW,
Erick
Re: Data Centre recovery/replication, does this seem plausible?
On Aug 28, 2013, at 8:59 AM, Erick Erickson erickerick...@gmail.com wrote:

When a replica discovers that it's too far out of date, it does an old-style replication. IOW, the tlog doesn't contain the entire delta. Eventually, the old-style replications catch up to close enough and _then_ the remaining docs in the tlog are replayed. The target number of updates in the tlog is 100 so it's a pretty small window that's actually replayed in the normal case.

Daniel had it right I think - first a node starts buffering all incoming updates, then it replicates the index, buffering all updates during that replication, then it replays all those updates from the buffer. No 'target' number of updates applies here.

- Mark
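Restated as a sketch, that sequence looks like the outline below. This is only a paraphrase of what Mark describes, not Solr's actual internal API; the method names are invented placeholders so the steps are explicit.

    /** Outline of SolrCloud core recovery as described above (invented method names). */
    public class RecoverySketch {
        void recover() throws Exception {
            startBufferingIncomingUpdates();   // every new update goes to the tlog buffer, not the index
            replicateFullIndexFromLeader();    // old-style /replication fetchindex of the whole index
            replayBufferedUpdates();           // apply everything buffered while the copy was running
            markActive();                      // only now does the replica serve queries again
        }
        void startBufferingIncomingUpdates() { /* placeholder */ }
        void replicateFullIndexFromLeader()  { /* placeholder */ }
        void replayBufferedUpdates()         { /* placeholder */ }
        void markActive()                    { /* placeholder */ }
    }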
Re: Data Centre recovery/replication, does this seem plausible?
Here is a really different approach. Make the two data centers one Solr Cloud cluster and use a third data center (or EC2 region) for one additional Zookeeper node. When you lose a DC, Zookeeper still functions. There would be more traffic between datacenters.

wunder
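For concreteness, that layout is a single three-node ZooKeeper ensemble with one node per location; the hostnames below are placeholders, not anyone's real setup. Every Solr node in both DCs would then be started pointing at all three.

    # zoo.cfg, identical on all three ZooKeeper nodes
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/zookeeper
    clientPort=2181
    server.1=zk-dc1.example.com:2888:3888
    server.2=zk-dc2.example.com:2888:3888
    server.3=zk-dc3.example.com:2888:3888

    # Solr start parameter on every node in both DCs
    -DzkHost=zk-dc1.example.com:2181,zk-dc2.example.com:2181,zk-dc3.example.com:2181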
Re: Data Centre recovery/replication, does this seem plausible?
Someone really needs to test this with EC2 availability zones. I haven't had the time, but I know other clustered NoSQL solutions like HBase and Cassandra can deal with it.

Michael Della Bitta
Applications Developer
appinions inc.
Re: Data Centre recovery/replication, does this seem plausible?
Walter, yes we did consider this (and might be having a 3rd DC for other reasons anyway), but 3 DCs also offer the possibility of running with 2 down and 1 up, which ZK still can't handle :)

There is also a second advantage to keeping our clouds separate: they are independent, which means if we have a problem and the collection gets corrupted, we have a live backup we can flip to on the other DC. If they were all part of one cloud, Solr keeps them all consistent, which is good until we get a bug in the cloud and it breaks everything in one fell swoop!
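To put numbers on that: ZooKeeper needs a strict majority of its ensemble, i.e. floor(N/2) + 1 voters. With one ZK node in each of 3 DCs, quorum is 2 of 3, so losing any single DC is survivable, but with 2 DCs down only 1 voter remains and the combined cloud stops accepting updates. No majority-based ensemble can keep running once a majority of its voters are gone, however the nodes are placed.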
Re: Data Centre recovery/replication, does this seem plausible?
Hmm, ya learn something new every day, thanks for the correction.
Data Centre recovery/replication, does this seem plausible?
We have 2 separate data centers in our organisation, and in order to maintain the ZK quorum during any DC outage, we have 2 separate Solr clouds, one in each DC with separate ZK ensembles, but both are fed with the same indexing data. Now in the event of a DC outage, all our Solr instances go down, and when they come back up, we need some way to recover the lost data. Our thought was to replicate from the working DC, but is there a way to do that whilst still maintaining an online presence for indexing purposes?

In essence, we want to do what happens within SolrCloud's recovery, so (as I understand cloud recovery) a node starts up (I'm assuming the worst case, where peer sync has failed), then buffers all updates into the transaction log, replicates from the leader, and replays the transaction log to get everything in sync. Is it conceivable to do the same by extending Solr, so that on the activation of some handler (user triggered), we initiate a replication from the other DC, which puts all the leaders into buffering updates, replicates from some other set of servers and then replays?

Our goal is to try to minimize the downtime (beyond the initial outage), so we would ideally like to be able to start up indexing before this replicate/clone has finished; that's why I thought to enable buffering on the transaction log. Searches shouldn't be sent here, but if they are, we have a valid (albeit old) index to serve those until the new one swaps in.

Just curious how any other DC-aware setups handle this kind of scenario? Or other concerns, issues with this type of approach.
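For reference, the "replicates from some other set of servers" part can already be driven from outside Solr by hitting each core's /replication handler with command=fetchindex and an explicit masterUrl pointing at the other DC. A minimal SolrJ sketch follows; the hostnames, core names and the hard-coded shard list are made up for illustration (in practice you would look the leaders up from each cluster's clusterstate), and the buffer/replay part has no public API here, so this only covers the bulk copy.

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.request.QueryRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class CrossDcFetchIndex {
        // Hypothetical local-core -> remote-leader pairs, one per shard.
        static final String[][] SHARDS = {
            {"http://recovering-dc-host1:8983/solr/collection1_shard1_replica1",
             "http://good-dc-host1:8983/solr/collection1_shard1_replica1"},
            {"http://recovering-dc-host2:8983/solr/collection1_shard2_replica1",
             "http://good-dc-host2:8983/solr/collection1_shard2_replica1"},
        };

        public static void main(String[] args) throws Exception {
            for (String[] pair : SHARDS) {
                String localCore = pair[0];
                String remoteLeader = pair[1];
                SolrServer server = new HttpSolrServer(localCore);
                ModifiableSolrParams params = new ModifiableSolrParams();
                params.set("command", "fetchindex");                  // old-style replication pull
                params.set("masterUrl", remoteLeader + "/replication");
                QueryRequest req = new QueryRequest(params);
                req.setPath("/replication");                          // send to this core's ReplicationHandler
                System.out.println("fetchindex " + localCore + " <- " + remoteLeader);
                server.request(req);
                server.shutdown();
            }
        }
    }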
Re: Data Centre recovery/replication, does this seem plausible?
The separate DC problem has been lurking for a while. But your understanding is a little off. When a replica discovers that it's too far out of date, it does an old-style replication. IOW, the tlog doesn't contain the entire delta. Eventually, the old-style replications catch up to close enough and _then_ the remaining docs in the tlog are replayed. The target number of updates in the tlog is 100 so it's a pretty small window that's actually replayed in the normal case.

None of which helps your problem. The simplest way (and on the expectation that DC outages were pretty rare!) would be to have your indexing process fire the missed updates at the DC after it came back up.

Copying from one DC to another is tricky. You'd have to be very, very sure that you copied indexes to the right shard. Ditto for any process that tried to have, say, a single node from the recovering DC temporarily join the good DC, at least long enough to synch.

Not a pretty problem, we don't really have any best practices yet that I know of.

FWIW,
Erick
Re: Data Centre recovery/replication, does this seem plausible?
I've been thinking about this one too and was curious about using the Solr Entity support in the DIH to do the import from one DC to another (for the lost docs). In my mind, one configures the DIH to use the SolrEntityProcessor with a query to capture the docs in the DC that stayed online, most likely using a timestamp in the query (see: http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor). Would that work? If so, any downsides? I've only used DIH / SolrEntityProcessor to populate a staging / dev environment from prod but have had good success with it.

Thanks.
Tim
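For anyone who wants to try that, a data-config.xml along these lines is roughly what the wiki describes. The source URL, the indexed_at field name and the cutoff timestamp are placeholders to be replaced with your own values, and (as noted below) only stored fields come across.

    <dataConfig>
      <document>
        <!-- Pull docs indexed since the outage from the DC that stayed online. -->
        <entity name="fromGoodDc"
                processor="SolrEntityProcessor"
                url="http://good-dc-host:8983/solr/collection1"
                query="indexed_at:[2013-08-28T10:00:00Z TO *]"
                rows="1000"
                fl="*"/>
      </document>
    </dataConfig>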
Re: Data Centre recovery/replication, does this seem plausible?
If you can satisfy this statement then it seems possible. This is the same restriction as atomic updates: the SolrEntityProcessor can only copy fields that are stored in the source index.
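Concretely, that means the source collection's schema.xml needs stored="true" on every field you want to carry over (the field names here are only examples); anything indexed with stored="false" cannot be reconstructed this way.

    <field name="title" type="text_general" indexed="true" stored="true"/>
    <field name="body"  type="text_general" indexed="true" stored="true"/>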
RE: Data Centre recovery/replication, does this seem plausible?
Hi - You're going to miss unstored but indexed fields. We stop any indexing process, kill the servlets on the down DC and copy over the files using scp, then remove the lock file and start it up again. It always works, but it's a manual process at this point; it should be easy to automate using some simple bash scripting.
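A rough sketch of that manual procedure as a script, per core. The host, paths, service name and lock-file location are assumptions about a typical install, not the poster's actual setup, and it assumes the same shard lives at the same path in both DCs (rsync is shown, but scp works the same way).

    #!/bin/bash
    # Copy one shard's index from the good DC onto the recovered node.
    GOOD_HOST=good-dc-host1
    CORE_DIR=/var/solr/collection1_shard1_replica1/data/index

    # 1. Stop indexing upstream, then stop the servlet container on the broken node.
    sudo service tomcat7 stop          # or jetty, depending on the install

    # 2. Copy the index files across (rsync keeps re-runs cheap).
    rsync -av --delete ${GOOD_HOST}:${CORE_DIR}/ ${CORE_DIR}/

    # 3. Remove the Lucene lock file that may have come along with the copy.
    rm -f ${CORE_DIR}/write.lock

    # 4. Start Solr again and resume indexing.
    sudo service tomcat7 start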
Re: Data Centre recovery/replication, does this seem plausible?
On 8/28/2013 6:13 AM, Daniel Collins wrote:

We have 2 separate data centers in our organisation, and in order to maintain the ZK quorum during any DC outage, we have 2 separate Solr clouds, one in each DC with separate ZK ensembles but both are fed with the same indexing data. Now in the event of a DC outage, all our Solr instances go down, and when they come back up, we need some way to recover the lost data. Our thought was to replicate from the working DC, but is there a way to do that whilst still maintaining an online presence for indexing purposes?

One way which would work (if your core name structures were identical between the two clouds) would be to shut down your indexing process, shut down the cloud that went down and has now come back up, and rsync from the good cloud. Depending on the index size, that could take a long time, and the index updates would be turned off while it's happening. That makes this idea less than ideal.

I have a similar setup on a sharded index that's NOT using SolrCloud, and both copies are in one location instead of two separate data centers. My general indexing method would work for your setup, though. The way that I handle this is that my indexing program tracks its update position for each copy of the index independently. If one copy is down, the tracked position for that index won't get updated, so the next time it comes up, all missed updates will get done for that copy. In the meantime, the program (Java, using SolrJ) is happily using a separate thread to continue updating the index copy that's still up.

Thanks,
Shawn
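A minimal sketch of that per-copy tracking idea with SolrJ. The class layout, the use of a numeric id as the checkpoint, and where the pending documents come from are assumptions for illustration, not Shawn's actual code.

    import java.util.List;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class PerCopyIndexer {

        /** One copy of the index (one DC), with its own independently tracked position. */
        static class IndexCopy {
            final SolrServer server;
            long lastIndexedId;   // persisted somewhere durable per copy in real life

            IndexCopy(String url, long startId) {
                this.server = new HttpSolrServer(url);
                this.lastIndexedId = startId;
            }
        }

        /**
         * Push everything newer than this copy's checkpoint (built by the caller from the
         * source database / pipeline). If the copy was down for hours, its checkpoint simply
         * lags and the missed updates are sent the next time it is reachable; the other copy
         * is updated from its own thread and is unaffected.
         */
        void catchUp(IndexCopy copy, List<SolrInputDocument> pending, long newestId) {
            try {
                if (!pending.isEmpty()) {
                    copy.server.add(pending);
                    copy.server.commit();
                }
                copy.lastIndexedId = newestId;   // only advance the checkpoint on success
            } catch (Exception e) {
                // Leave the checkpoint where it was; this copy will be retried later.
                System.err.println("copy unreachable, will retry: " + e.getMessage());
            }
        }
    }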
Re: Data Centre recovery/replication, does this seem plausible?
Thanks Shawn/Erick for the suggestions. Unfortunately stopping indexing whilst we recover isn't a viable option; we are using Solr as an NRT search platform, so indexing must continue, at least on the DC that is fine. If we could stop indexing on the broken DC, then recovery is relatively straightforward: it's an rsync/copy of a snapshot from the other data center followed by restarting indexing.

The million dollar question is how to start up our existing Solr instances (once the data center has recovered from whatever broke it), realize that we have a gap in indexing (using a checkpointing mechanism similar to what Shawn describes), and recover from that (that's the tricky bit!), without having to interrupt indexing... I know that replication takes up to an hour (it's a rather large collection, but split into 8 shards currently, and we can replicate each shard in parallel).

What ideally I would like to do is, at the point that I kick off recovery, divert the indexing feed for the broken DC into a transaction log on those machines, run the replication and swap the index in, then replay the transaction log to bring it all up to date. That process (conceptually) is the same as the org.apache.solr.cloud.RecoveryStrategy code.

Yes, if I could divert that feed at the application level, then I can do what you suggest, but it feels like more work to do that (and build an external transaction log), whereas the code seems to already be in Solr itself, I just need to hook it all up (famous last words!). Our indexing pipeline does a lot of pre-processing work (it's not just pulling data from a database), and since we are only talking about the time taken to do the replication (should be an hour or less), it feels like we ought to be able to store that in a Solr transaction log (i.e. the last point in the indexing pipeline).

The plan would be to recover the leaders (1 of each shard) this way, and then use conventional replication/recovery to deal with the local replicas (blank their data area and then they will automatically sync from the local leader).
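For the replica side of that plan, one way to nudge a running replica into syncing from its local leader is the CoreAdmin REQUESTRECOVERY action. A hedged SolrJ sketch is below; the host and core names are made up, and this assumes the REQUESTRECOVERY action is available in the Solr version in use.

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.request.QueryRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class KickReplicaRecovery {
        public static void main(String[] args) throws Exception {
            // Root URL (no core name) so the request goes to the CoreAdmin handler.
            SolrServer server = new HttpSolrServer("http://recovering-dc-host3:8983/solr");

            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("action", "REQUESTRECOVERY");
            params.set("core", "collection1_shard1_replica2");   // example core name

            QueryRequest req = new QueryRequest(params);
            req.setPath("/admin/cores");
            server.request(req);
            server.shutdown();
        }
    }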
Re: Data Centre recovery/replication, does this seem plausible?
On 8/28/2013 10:48 AM, Daniel Collins wrote:

What ideally I would like to do is, at the point that I kick off recovery, divert the indexing feed for the broken DC into a transaction log on those machines, run the replication and swap the index in, then replay the transaction log to bring it all up to date. That process (conceptually) is the same as the org.apache.solr.cloud.RecoveryStrategy code.

I don't think any such mechanism exists currently. It would be extremely awesome if it did. If there's not an existing Jira issue, I recommend that you file one. Being able to set up a multi-datacenter cloud with automatic recovery would be awesome. Even if it took a long time, having it be fully automated would be exceptionally useful.

Yes, if I could divert that feed at the application level, then I can do what you suggest, but it feels like more work to do that (and build an external transaction log) whereas the code seems to already be in Solr itself, I just need to hook it all up (famous last words!) Our indexing pipeline does a lot of pre-processing work (it's not just pulling data from a database), and since we are only talking about the time taken to do the replication (should be an hour or less), it feels like we ought to be able to store that in a Solr transaction log (i.e. the last point in the indexing pipeline).

I think it would have to be a separate transaction log. One problem with really big regular tlogs is that when Solr gets restarted, the entire transaction log that's currently on the disk gets replayed. If it were big enough to recover the last several hours to a duplicate cloud, it would take forever to replay on Solr restart. If the regular tlog were kept small but a second log with the last 24 hours were available, it could replay updates when the second cloud came back up. I do import from a database, so the application-level tracking works really well for me.

Thanks,
Shawn
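Until something like that second log exists inside Solr, it can be approximated at the application level. A bare-bones sketch follows: it appends each update payload (e.g. the JSON or XML document sent to Solr) to a daily file at the last stage of the indexing pipeline, so the files covering an outage window can later be replayed against the cloud that was down. The directory layout and one-payload-per-line format are arbitrary choices, not an existing Solr feature.

    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintWriter;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class SecondaryUpdateLog {
        private final String dir;

        public SecondaryUpdateLog(String dir) {
            this.dir = dir;
        }

        /** Append one update payload to today's log file. */
        public synchronized void append(String updatePayload) throws IOException {
            String day = new SimpleDateFormat("yyyy-MM-dd").format(new Date());
            try (PrintWriter out = new PrintWriter(new FileWriter(dir + "/updates-" + day + ".log", true))) {
                out.println(updatePayload);   // one update per line; replay = re-post each line
            }
        }

        // Files older than the retention window (say 24-48 hours) can simply be deleted;
        // recovery reads the files covering the outage and re-sends each line to the
        // recovered cloud before normal indexing resumes there.
    }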