Re: Data Centre recovery/replication, does this seem plausible?

2013-08-29 Thread Erick Erickson
Yeah, reality gets in the way of simple solutions a lot.

And making it even more fun you'd really want to only
bring up one node for each shard in the broken DC and
let that one be fully synched. Then bring up the replicas
in a controlled fashion so you didn't saturate the local
network with replications. And then you'd...

But as Shawn says, this is certainly functionality that
would be waaay cool; there's just been no time to
make it all work. The main folks who've been working
in this area all have a mountain of higher-priority
stuff to get done first.

There's been talk of making SolrCloud rack aware which
could extend into some kind of work in this area, but
that's also on the future plate. As you're well aware
it's not a trivial problem!

Hmmm, what you really want here is the ability to say
to a recovering cluster "do your initial synch using nodes
that the ZK ensemble located at XXX knows about, then
switch to your very own ensemble". Something like a
"remote recovery" option. Which is _still_ kind of
tricky; I sure hope you have identical sharding schemes.
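
For concreteness, a rough sketch of the nearest thing that exists today, with
no ZK-driven discovery at all: point the existing ReplicationHandler
fetchindex command on each recovering leader at its counterpart core in the
healthy DC. The host and core names below are made up, and it assumes
identical sharding on both sides.

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class CrossDcFetchIndex {
    public static void main(String[] args) throws IOException {
        // Hypothetical core names -- one leader core per shard, identical on both sides.
        String[] cores = {"collection1_shard1_replica1", "collection1_shard2_replica1"};
        String localSolr  = "http://solr-dc2-01:8983/solr";   // recovering DC (made up)
        String remoteSolr = "http://solr-dc1-01:8983/solr";   // healthy DC (made up)

        for (String core : cores) {
            // Ask the local core to pull a full copy from its counterpart's ReplicationHandler.
            String masterUrl = remoteSolr + "/" + core + "/replication";
            String fetch = localSolr + "/" + core + "/replication?command=fetchindex&masterUrl="
                    + URLEncoder.encode(masterUrl, "UTF-8");
            HttpURLConnection conn = (HttpURLConnection) new URL(fetch).openConnection();
            InputStream in = conn.getInputStream();   // fire the request
            System.out.println(core + " -> HTTP " + conn.getResponseCode());
            in.close();
            conn.disconnect();
        }
    }
}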

FWIW,
Erick


On Wed, Aug 28, 2013 at 1:12 PM, Shawn Heisey s...@elyograg.org wrote:

 On 8/28/2013 10:48 AM, Daniel Collins wrote:

 What ideally I would like to do
 is at the point that I kick off recovery, divert the indexing feed for the
 broken DC into a transaction log on those machines, run the replication and
 swap the index in, then replay the transaction log to bring it all up to
 date.  That process (conceptually) is the same as the
 org.apache.solr.cloud.RecoveryStrategy code.


 I don't think any such mechanism exists currently.  It would be extremely
 awesome if it did.  If there's not an existing Jira issue, I recommend that
 you file one.  Being able to set up a multi-datacenter cloud with automatic
 recovery would be awesome.  Even if it took a long time, having it be fully
 automated would be exceptionally useful.


  Yes, if I could divert that feed at the application level, then I can do
 what you suggest, but it feels like more work to do that (and build an
 external transaction log), whereas the code seems to already be in Solr
 itself; I just need to hook it all up (famous last words!). Our indexing
 pipeline does a lot of pre-processing work (it's not just pulling data from
 a database), and since we are only talking about the time taken to do the
 replication (should be an hour or less), it feels like we ought to be able
 to store that in a Solr transaction log (i.e. the last point in the
 indexing pipeline).


 I think it would have to be a separate transaction log.  One problem with
 really big regular tlogs is that when Solr gets restarted, the entire
 transaction log that's currently on the disk gets replayed.  If it were big
 enough to recover the last several hours to a duplicate cloud, it would
 take forever to replay on Solr restart.  If the regular tlog were kept
 small but a second log with the last 24 hours were available, it could
 replay updates when the second cloud came back up.

 I do import from a database, so the application-level tracking works
 really well for me.

 Thanks,
 Shawn




Re: Data Centre recovery/replication, does this seem plausible?

2013-08-29 Thread Mark Miller

On Aug 28, 2013, at 8:59 AM, Erick Erickson erickerick...@gmail.com wrote:

 When a replica discovers that
 it's too far out of date, it does an old-style replication. IOW, the
 tlog doesn't contain the entire delta. Eventually, the old-style
 replications catch up to close enough and _then_ the remaining
 docs in the tlog are replayed. The target number of updates in the
 tlog is 100 so it's a pretty small window that's actually replayed in
 the normal case.

Daniel had it right I think - first a node starts buffering all incoming 
updates, then it replicates the index, buffering all updates during that 
replication, then it replays all those updates from the buffer. No 'target' 
number of updates applies here.
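
As a rough sketch of that sequence (the Core methods here are placeholders
for the steps RecoveryStrategy performs, not real Solr APIs):

public class RecoverySequenceSketch {

    interface Core {
        void startBufferingUpdates();               // 1. every incoming update goes into the tlog buffer
        void replicateIndexFrom(String leaderUrl);  // 2. old-style (full) replication of the index files
        void replayBufferedUpdates();               // 3. apply everything buffered while step 2 ran
    }

    static void recover(Core core, String leaderUrl) {
        core.startBufferingUpdates();       // nothing indexed during recovery is lost
        core.replicateIndexFrom(leaderUrl); // may take a while; updates keep buffering meanwhile
        core.replayBufferedUpdates();       // the whole buffer, not a fixed "target" count
    }
}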

- Mark



Re: Data Centre recovery/replication, does this seem plausible?

2013-08-29 Thread Walter Underwood
Here is a really different approach.

Make the two data centers one Solr Cloud cluster and use a third data center 
(or EC2 region) for one additional Zookeeper node. When you lose a DC, 
Zookeeper still functions.

There would be more traffic between datacenters.
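
A minimal SolrJ sketch of a client of such a stretched cluster, with made-up
hostnames; the point is just that with one ZK node per location, losing any
single DC still leaves 2 of the 3 ZK nodes, so the quorum survives.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class StretchedClusterClient {
    public static void main(String[] args) throws Exception {
        // One ZK node per location; any single DC can be lost and 2 of 3 remain.
        String zkHost = "zk-dc1.example.com:2181,"
                      + "zk-dc2.example.com:2181,"
                      + "zk-dc3.example.com:2181";   // third node in a small third DC / EC2 region
        CloudSolrServer solr = new CloudSolrServer(zkHost);
        solr.setDefaultCollection("collection1");    // hypothetical collection name
        System.out.println(solr.query(new SolrQuery("*:*")).getResults().getNumFound());
        solr.shutdown();
    }
}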

wunder


--
Walter Underwood
wun...@wunderwood.org





Re: Data Centre recovery/replication, does this seem plausible?

2013-08-29 Thread Michael Della Bitta
Someone really needs to test this with EC2 availability zones. I haven't
had the time, but I know other clustered NoSQL solutions like HBase and
Cassandra can deal with it.

Michael Della Bitta

Applications Developer

o: +1 646 532 3062  | c: +1 917 477 7906

appinions inc.

“The Science of Influence Marketing”

18 East 41st Street

New York, NY 10017

t: @appinions (https://twitter.com/Appinions) | g+: plus.google.com/appinions
(https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts)
w: appinions.com (http://www.appinions.com/)





Re: Data Centre recovery/replication, does this seem plausible?

2013-08-29 Thread Daniel Collins
Walter, yes, we did consider this (and may end up with a 3rd DC for other
reasons anyway), but 3 DCs also offer the possibility of running with 2
down and 1 up, which ZK still can't handle :)

There is also a second advantage to keeping our clouds separate: they are
independent, which means if we have a problem and the collection gets
corrupted, we have a live backup we can flip to on the other DC.  If they
were all part of 1 cloud, Solr keeps them all consistent, which is good
until we hit a bug in the cloud and it breaks everything in one fell swoop!
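
A rough sketch of that flip done at the client side, assuming the application
(rather than Solr) decides when to fail over; the URLs are made up.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TwoDcFailoverClient {
    // Made-up URLs; each points at the same collection in a different, independent cloud.
    private final HttpSolrServer primary = new HttpSolrServer("http://solr-dc1:8983/solr/collection1");
    private final HttpSolrServer backup  = new HttpSolrServer("http://solr-dc2:8983/solr/collection1");

    public QueryResponse query(String q) throws SolrServerException {
        try {
            return primary.query(new SolrQuery(q));
        } catch (SolrServerException e) {
            // primary DC unreachable (or its collection is broken): serve from the other cloud
            return backup.query(new SolrQuery(q));
        }
    }
}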



Re: Data Centre recovery/replication, does this seem plausible?

2013-08-29 Thread Erick Erickson
Hmm, ya learn something new every day, thanks for the correction.





Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Daniel Collins
We have 2 separate data centers in our organisation, and in order to
maintain the ZK quorum during any DC outage, we have 2 separate Solr
clouds, one in each DC with separate ZK ensembles but both are fed with the
same indexing data.

Now in the event of a DC outage, all our Solr instances go down, and when
they come back up, we need some way to recover the lost data.

Our thought was to replicate from the working DC, but is there a way to do
that whilst still maintaining an online presence for indexing purposes?

In essence, we want to do what happens within SolrCloud's recovery, so (as
I understand cloud recovery, assuming the worst case where peer sync has
failed) a node starts up, buffers all updates into the transaction log,
replicates from the leader, and replays the transaction log to get
everything in sync.

Is it conceivable to do the same by extending Solr, so that on the activation
of some handler (user triggered), we initiate a "replicate from other DC"
operation, which puts all the leaders into buffering updates, replicates from
some other set of servers, and then replays?

Our goal is to try to minimize the downtime (beyond the initial outage), so
we would ideally like to be able to start up indexing before this
replicate/clone has finished, that's why I thought to enable buffering on
the transaction log.  Searches shouldn't be sent here, but if they are, we
have a valid (albeit old) index to serve them until the new one swaps in.

Just curious how any other DC-aware setups handle this kind of scenario?
 Or any other concerns or issues with this type of approach?


Re: Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Erick Erickson
The separate DC problem has been lurking for a while. But your
understanding is a little off. When a replica discovers that
it's too far out of date, it does an old-style replication. IOW, the
tlog doesn't contain the entire delta. Eventually, the old-style
replications catch up to close enough and _then_ the remaining
docs in the tlog are replayed. The target number of updates in the
tlog is 100 so it's a pretty small window that's actually replayed in
the normal case.

None of which helps your problem. The simplest way (and on the
expectation that DC outages were pretty rare!) would be to have your
indexing process fire the missed updates at the DC after it came
back up.

Copying from one DC to another is tricky. You'd have to be very,
very sure that you copied indexes to the right shard. Ditto for any
process that tried to have, say, a single node from the recovering
DC temporarily join the good DC, at least long enough to synch.

Not a pretty problem, we don't really have any best practices yet
that I know of.

FWIW,
Erick





Re: Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Timothy Potter
I've been thinking about this one too and was curious about using the Solr
Entity support in the DIH to do the import from one DC to another (for the
lost docs). In my mind, one configures the DIH to use the
SolrEntityProcessor with a query to capture the docs in the DC that stayed
online, most likely using a timestamp in the query (see:
http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor).

Would that work? If so, any downsides? I've only used DIH /
SolrEntityProcessor to populate a staging / dev environment from prod but
have had good success with it.

Thanks.
Tim
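
For anyone who'd rather drive it from code, roughly the same idea expressed
with SolrJ instead of DIH; the timestamp field and hosts are hypothetical,
and like the DIH route it only works if every field you need is stored.

import java.util.List;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;

public class TimestampCatchUp {
    public static void main(String[] args) throws Exception {
        HttpSolrServer source = new HttpSolrServer("http://solr-dc1:8983/solr/collection1"); // surviving DC
        HttpSolrServer target = new HttpSolrServer("http://solr-dc2:8983/solr/collection1"); // recovered DC

        // "timestamp_dt" is a hypothetical stored field recording when each doc was indexed.
        SolrQuery q = new SolrQuery("timestamp_dt:[2013-08-28T06:00:00Z TO *]");
        q.setRows(1000);
        int start = 0;
        while (true) {
            q.setStart(start);
            List<SolrDocument> page = source.query(q).getResults();
            if (page.isEmpty()) break;
            for (SolrDocument doc : page) {
                SolrInputDocument in = new SolrInputDocument();
                for (String field : doc.getFieldNames()) {
                    if (!"_version_".equals(field)) in.addField(field, doc.getFieldValue(field));
                }
                target.add(in);
            }
            start += page.size();
        }
        target.commit();
    }
}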





Re: Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Erick Erickson
If you can satisfy this statement, then it seems possible (this is the same
restriction as for atomic updates): "The SolrEntityProcessor can only copy
fields that are stored in the source index."





RE: Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Markus Jelsma
Hi - You're going to miss unstored but indexed fields.  We stop any indexing
process, kill the servlets on the down DC and copy over the files using scp,
then remove the lock file and start it up again. It always works, but it's a
manual process at this point; it should be easy to automate using some simple
bash scripting.
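
A rough sketch of automating the copy step (hosts and paths are made up, and
it assumes indexing is already stopped and the target Solr is shut down):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CopyIndexFromGoodDc {
    public static void main(String[] args) throws IOException, InterruptedException {
        String remote = "solr-dc1:/var/solr/collection1/data/index/";   // healthy DC (made up)
        String local  = "/var/solr/collection1/data/index/";            // recovering DC (made up)

        // rsync the index files across; assumes the local Solr is down and indexing is paused.
        int rc = new ProcessBuilder("rsync", "-av", "--delete", remote, local)
                .inheritIO().start().waitFor();
        if (rc != 0) throw new IOException("rsync failed with exit code " + rc);

        // remove the stale lock file before starting Solr again
        Files.deleteIfExists(Paths.get(local, "write.lock"));
    }
}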



Re: Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Shawn Heisey
On 8/28/2013 6:13 AM, Daniel Collins wrote:
 We have 2 separate data centers in our organisation, and in order to
 maintain the ZK quorum during any DC outage, we have 2 separate Solr
 clouds, one in each DC with separate ZK ensembles but both are fed with the
 same indexing data.
 
 Now in the event of a DC outage, all our Solr instances go down, and when
 they come back up, we need some way to recover the lost data.
 
 Our thought was to replicate from the working DC, but is there a way to do
 that whilst still maintaining an online presence for indexing purposes?

One way which would work (if your core name structures were identical
between the two clouds) would be to shut down your indexing process,
shut down the cloud that went down and has now come back up, and rsync
from the good cloud.  Depending on the index size, that could take a
long time, and the index updates would be turned off while it's
happening.  That makes this idea less than ideal.

I have a similar setup on a sharded index that's NOT using SolrCloud,
and both copies are in one location instead of two separate data
centers.  My general indexing method would work for your setup, though.

The way that I handle this is that my indexing program tracks its update
position for each copy of the index independently.  If one copy is down,
the tracked position for that index won't get updated, so the next time
it comes up, all missed updates will get done for that copy.  In the
meantime, the program (Java, using SolrJ) is happily using a separate
thread to continue updating the index copy that's still up.
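
A sketch of that tracking scheme; the Feed and IndexCopy types are
placeholders rather than SolrJ, and the position arithmetic is only
illustrative.

import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PerCopyPositionTracker {

    // A placeholder for whatever produces updates (e.g. rows read from the database).
    interface Feed { List<Object> fetchSince(long position); }

    // A placeholder for one copy of the index (one of the two clouds).
    interface IndexCopy { String name(); void send(List<Object> batch) throws Exception; }

    // Last successfully indexed position, tracked per copy.
    private final ConcurrentMap<String, Long> positions = new ConcurrentHashMap<String, Long>();

    // Each copy gets its own thread, so a down copy never blocks the healthy one.
    void runCopy(final Feed feed, final IndexCopy copy) {
        new Thread(new Runnable() {
            public void run() {
                while (true) {
                    long pos = positions.containsKey(copy.name()) ? positions.get(copy.name()) : 0L;
                    List<Object> batch = feed.fetchSince(pos);
                    try {
                        copy.send(batch);                                // this copy is up
                        positions.put(copy.name(), pos + batch.size());  // advance only on success
                    } catch (Exception copyIsDown) {
                        // leave the position alone; the copy catches up next time it answers
                    }
                    try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
                }
            }
        }).start();
    }
}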

Thanks,
Shawn



Re: Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Daniel Collins
Thanks Shawn/Erick for the suggestions. Unfortunately stopping indexing
whilst we recover isn't a viable option; we are using Solr as an NRT search
platform, so indexing must continue at least on the DC that is fine.  If we
could stop indexing on the broken DC, then recovery would be relatively
straightforward: it's an rsync/copy of a snapshot from the other data center
followed by restarting indexing.

The million dollar question is how to start up our existing Solr instances
(once the data center has recovered from whatever broke it), realize that
we have a gap in indexing (using a checkpointing mechanism similar to what
Shawn describes), and recover from that (that's the tricky bit!), without
having to interrupt indexing...  I know that replication takes up to an
hour (it's a rather large collection, but split into 8 shards currently, and
we can replicate each shard in parallel).  What ideally I would like to do
is at the point that I kick off recovery, divert the indexing feed for the
broken DC into a transaction log on those machines, run the replication and
swap the index in, then replay the transaction log to bring it all up to
date.  That process (conceptually) is the same as the
org.apache.solr.cloud.RecoveryStrategy code.

Yes, if I could divert that feed at the application level, then I can do
what you suggest, but it feels like more work to do that (and build an
external transaction log), whereas the code seems to already be in Solr
itself; I just need to hook it all up (famous last words!). Our indexing
pipeline does a lot of pre-processing work (it's not just pulling data from
a database), and since we are only talking about the time taken to do the
replication (should be an hour or less), it feels like we ought to be able
to store that in a Solr transaction log (i.e. the last point in the
indexing pipeline).

The plan would be to recover the leaders (1 of each shard) this way, and
then use conventional replication/recovery to deal with the local replicas
(blank their data area and then they will automatically sync from the local
leader).





Re: Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Shawn Heisey

On 8/28/2013 10:48 AM, Daniel Collins wrote:

What ideally I would like to do
is at the point that I kick off recovery, divert the indexing feed for the
broken DC into a transaction log on those machines, run the replication and
swap the index in, then replay the transaction log to bring it all up to
date.  That process (conceptually)  is the same as the
org.apache.solr.cloud.RecoveryStrategy code.


I don't think any such mechanism exists currently.  It would be 
extremely awesome if it did.  If there's not an existing Jira issue, I 
recommend that you file one.  Being able to set up a multi-datacenter 
cloud with automatic recovery would be awesome.  Even if it took a long 
time, having it be fully automated would be exceptionally useful.



Yes, if I could divert that feed at the application level, then I can do
what you suggest, but it feels like more work to do that (and build an
external transaction log), whereas the code seems to already be in Solr
itself; I just need to hook it all up (famous last words!). Our indexing
pipeline does a lot of pre-processing work (it's not just pulling data from
a database), and since we are only talking about the time taken to do the
replication (should be an hour or less), it feels like we ought to be able
to store that in a Solr transaction log (i.e. the last point in the
indexing pipeline).


I think it would have to be a separate transaction log.  One problem 
with really big regular tlogs is that when Solr gets restarted, the 
entire transaction log that's currently on the disk gets replayed.  If 
it were big enough to recover the last several hours to a duplicate 
cloud, it would take forever to replay on Solr restart.  If the regular 
tlog were kept small but a second log with the last 24 hours were 
available, it could replay updates when the second cloud came back up.
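
A sketch of what such a second, application-side log could look like; the
format and path are made up, and ReplayTarget is a placeholder for whatever
actually pushes the documents into the recovered cloud.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class SideUpdateLog {
    private final PrintWriter out;

    public SideUpdateLog(String path) throws IOException {
        out = new PrintWriter(new FileWriter(path, true), true);   // append, auto-flush on println
    }

    /** Record one update as "<epoch-millis> TAB <doc>", so replay can filter by time. */
    public void record(String docAsJson) {
        out.println(System.currentTimeMillis() + "\t" + docAsJson);
    }

    /** Hand every entry newer than cutoffMillis to 'send' (a placeholder for e.g. a SolrJ add). */
    public static void replay(String path, long cutoffMillis, ReplayTarget send) throws Exception {
        BufferedReader in = new BufferedReader(new FileReader(path));
        String line;
        while ((line = in.readLine()) != null) {
            int tab = line.indexOf('\t');
            if (tab > 0 && Long.parseLong(line.substring(0, tab)) >= cutoffMillis) {
                send.send(line.substring(tab + 1));
            }
        }
        in.close();
    }

    interface ReplayTarget { void send(String docAsJson) throws Exception; }
}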


I do import from a database, so the application-level tracking works 
really well for me.


Thanks,
Shawn