Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-10 Thread Otis Gospodnetic
Hi,



- Original Message 
> From: Jake Luciani 
> To: solr-user@lucene.apache.org
> Sent: Wed, March 9, 2011 8:07:00 PM
> Subject: Re: True master-master fail-over without data gaps (choosing CA in 
>CAP)
> 
> Yeah sure.  Let me update this on the Solandra wiki. I'll send across the
> link

Excellent.  You could include ES there, too, if you feel extra adventurous. ;)

> I think you hit the main two shortcomings atm.

- Grandma, why are your eyes so big? 
- To see you better.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


> -Jake
> 
> > On Wed, Mar 9, 2011 at 6:17 PM, Otis Gospodnetic wrote:
> 
> > Jake,
> >
> > Maybe it's time to come up with the Solandra/Solr matrix so we can see
> > Solandra's strengths (e.g. RT, no replication) and weaknesses (e.g. I think I
> > saw a mention of some big indices?) or missing features (e.g. no delete by
> > query), etc.
> >
> > Thanks!
> > Otis
> > 
> > Sematext :: http://sematext.com/ :: Solr -  Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
> >
> >
> > - Original Message ----
> > > From: Jake Luciani 
> > > To: "solr-user@lucene.apache.org" 
> > > Sent: Wed, March 9, 2011 6:04:13 PM
> > > Subject: Re: True master-master fail-over without data gaps (choosing CA
> > > in CAP)
> > >
> > > Jason,
> > >
> > > Its predecessor, Lucandra, did. But Solandra is a new approach that manages
> > > shards of documents across the cluster for you and uses Solr's distributed
> > > search to query indexes.
> >  >
> > >
> > > Jake
> > >
> > > On Mar 9, 2011, at 5:15 PM, Jason Rutherglen <jason.rutherg...@gmail.com>
> > > wrote:
> > >
> > > > Doesn't Solandra partition by term instead of document?
> > > >
> > > > On Wed, Mar 9, 2011 at 2:13 PM, Smiley, David W. wrote:
> > > >> I was just about to jump in this conversation to mention Solandra and go
> > > >> fig, Solandra's committer comes in. :-)   It was nice to meet you at
> > > >> Strata, Jake.
> >  > >>
> > > >> I haven't dug into the code yet but Solandra strikes me as a killer way
> > > >> to scale Solr. I'm looking forward to playing with it; particularly
> > > >> looking at disk requirements and performance measurements.
> >  > >>
> > > >> ~ David Smiley
> > > >>
> > > >> On Mar 9, 2011, at 3:14 PM, Jake Luciani wrote:
> > > >>
> > > >>> Hi Otis,
> > > >>>
> > > >>> Have you considered using Solandra with Quorum writes
> > > >>> to achieve master/master with CA semantics?
> > > >>>
> > > >>> -Jake
> > > >>>
> > > >>>
> > > >>> On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic wrote:
> > > >>>
> > > >>>> Hi,
> > > >>>>
> > > >>>>  Original Message 
> > > >>>>
> > > >>>>> From: Robert Petersen 
> > > >>>>>
> > > >>>>> Can't you skip the SAN and keep the indexes locally?  Then you would
> > > >>>>> have two redundant copies of the index and no lock issues.
> > > >>>>
> > > >>>> I could, but then I'd have the issue of keeping them in sync, which
> > > >>>> seems more fragile.  I think SAN makes things simpler overall.
> > > >>>>
> > > >>>>> Also, Can't master02 just be a slave to master01 (in the master farm
> > > >>>>> and separate from the slave farm) until such time as master01 fails?
> > > >>>>> Then
> > > >>>>
> > > >>>> No, because it wouldn't be in sync.  It would always be N minutes
> > > >>>> behind, and when the primary master fails, the secondary would not
> > > >>>> have all the docs - data loss.
>

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Jake Luciani
Yeah sure.  Let me update this on the Solandra wiki. I'll send across the
link

I think you hit the main two shortcomings atm.

-Jake

On Wed, Mar 9, 2011 at 6:17 PM, Otis Gospodnetic  wrote:

> Jake,
>
> Maybe it's time to come up with the Solandra/Solr matrix so we can see
> Solandra's strengths (e.g. RT, no replication) and weaknesses (e.g. I think
> I
> saw a mention of some big indices?) or missing feature (e.g. no delete by
> query), etc.
>
> Thanks!
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message 
> > From: Jake Luciani 
> > To: "solr-user@lucene.apache.org" 
> > Sent: Wed, March 9, 2011 6:04:13 PM
> > Subject: Re: True master-master fail-over without data gaps (choosing CA
> in
> >CAP)
> >
> > Jason,
> >
> > Its predecessor, Lucandra, did. But Solandra is a new approach that manages
> > shards of documents across the cluster for you and uses Solr's distributed
> > search to query indexes.
> >
> >
> > Jake
> >
> > On Mar 9, 2011, at 5:15 PM, Jason Rutherglen <jason.rutherg...@gmail.com>
> > wrote:
> >
> > > Doesn't Solandra partition by term instead of document?
> > >
> > > On Wed, Mar 9, 2011 at 2:13 PM, Smiley, David W. wrote:
> > >> I was just about to jump in this conversation to mention Solandra and go
> > >> fig, Solandra's committer comes in. :-)   It was nice to meet you at
> > >> Strata, Jake.
> > >>
> > >> I haven't dug into the code yet but Solandra strikes me as a killer way
> > >> to scale Solr. I'm looking forward to playing with it; particularly
> > >> looking at disk requirements and performance measurements.
> > >>
> > >> ~ David Smiley
> > >>
> > >>  On Mar 9, 2011, at 3:14 PM, Jake Luciani wrote:
> > >>
> > >>> Hi  Otis,
> > >>>
> > >>> Have you considered using Solandra with  Quorum writes
> > >>> to achieve master/master with CA  semantics?
> > >>>
> > >>> -Jake
> > >>>
> > >>>
> > >>> On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>>  Original Message 
> > >>>>
> > >>>>> From: Robert Petersen 
> > >>>>>
> > >>>>> Can't you skip the SAN and keep the indexes locally?  Then you would
> > >>>>> have two redundant copies of the index and no lock issues.
> > >>>>
> > >>>> I could, but then I'd have the issue of keeping them in sync, which
> > >>>> seems more fragile.  I think SAN makes things simpler overall.
> > >>>>
> > >>>>> Also, Can't master02 just be a slave to master01 (in the master farm
> > >>>>> and separate from the slave farm) until such time as master01 fails?
> > >>>>> Then
> > >>>>
> > >>>> No, because it wouldn't be in sync.  It would always be N minutes
> > >>>> behind, and when the primary master fails, the secondary would not
> > >>>> have all the docs - data loss.
> > >>>>
> > >>>>> master02 would start receiving the new documents with an indexes
> > >>>>> complete up to the last replication at least and the other slaves
> > >>>>> would be directed by LB to poll master02 also...
> > >>>>
> > >>>> Yeah, "complete up to the last replication" is the problem.  It's a
> > >>>> data gap that now needs to be filled somehow.
> > >>>>
> > >>>> Otis
> > >>>> 
> > >>>> Sematext  :: http://sematext.com/ ::  Solr - Lucene - Nutch
> > >>>> Lucene ecosystem search :: http://search-lucene.com/
> > >>>>
> > >>>>
> > >>>>> -Original Message-
> > >>>>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> > >>>>> Sent: Wednesday, March 09, 2011 9:47 AM

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Otis Gospodnetic
Jake,

Maybe it's time to come up with the Solandra/Solr matrix so we can see 
Solandra's strengths (e.g. RT, no replication) and weaknesses (e.g. I think I 
saw a mention of some big indices?) or missing features (e.g. no delete by 
query), etc.

Thanks!
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 
> From: Jake Luciani 
> To: "solr-user@lucene.apache.org" 
> Sent: Wed, March 9, 2011 6:04:13 PM
> Subject: Re: True master-master fail-over without data gaps (choosing CA in 
>CAP)
> 
> Jason,
> 
> Its predecessor, Lucandra, did. But Solandra is a new approach that manages 
> shards of documents across the cluster for you and uses Solr's distributed 
> search to query indexes. 
>
> 
> Jake
> 
> On Mar 9, 2011, at 5:15 PM, Jason Rutherglen wrote:
> 
> > Doesn't Solandra partition by term instead of document?
> > 
> > On Wed, Mar 9, 2011 at 2:13 PM, Smiley, David W. wrote:
> >> I was just about to jump in this conversation to mention Solandra and go
> >> fig, Solandra's committer comes in. :-)   It was nice to meet you at Strata,
> >> Jake.
> >> 
> >> I haven't dug into the code yet but Solandra strikes me as a killer way to
> >> scale Solr. I'm looking forward to playing with it; particularly looking at
> >> disk requirements and performance measurements.
> >> 
> >> ~ David Smiley
> >> 
> >>  On Mar 9, 2011, at 3:14 PM, Jake Luciani wrote:
> >> 
> >>> Hi  Otis,
> >>> 
> >>> Have you considered using Solandra with  Quorum writes
> >>> to achieve master/master with CA  semantics?
> >>> 
> >>> -Jake
> >>> 
> >>> 
> >>> On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic wrote:
> >>> 
> >>>> Hi,
> >>>> 
> >>>>  Original Message 
> >>>> 
> >>>>> From: Robert Petersen 
> >>>>> 
> >>>>> Can't you skip the SAN and keep the indexes locally?  Then you would
> >>>>> have two redundant copies of the index and no lock issues.
> >>>> 
> >>>> I could, but then I'd have the issue of keeping them in sync, which seems
> >>>> more fragile.  I think SAN makes things simpler overall.
> >>>> 
> >>>>> Also, Can't master02 just be a slave to master01 (in the master farm
> >>>>> and separate from the slave farm) until such time as master01 fails?  Then
> >>>> 
> >>>> No, because it wouldn't be in sync.  It would always be N minutes behind,
> >>>> and when the primary master fails, the secondary would not have all the
> >>>> docs - data loss.
> >>>> 
> >>>>> master02 would start receiving the new documents with an indexes
> >>>>> complete up to the last replication at least and the other slaves would
> >>>>> be directed by LB to poll master02 also...
> >>>> 
> >>>> Yeah, "complete up to the last replication" is the problem.  It's a data
> >>>> gap that now needs to be filled somehow.
> >>>> 
> >>>> Otis
> >>>> 
> >>>> Sematext  :: http://sematext.com/ ::  Solr - Lucene - Nutch
> >>>> Lucene ecosystem search :: http://search-lucene.com/
> >>>> 
> >>>> 
> >>>>> -Original   Message-
> >>>>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> >>>>>  Sent: Wednesday, March 09, 2011 9:47 AM
> >>>>> To: solr-user@lucene.apache.org
> >>>>>  Subject:  Re: True master-master fail-over without data gaps (choosing 
> >>>>>  
>CA
> >>>>> in  CAP)
> >>>>> 
> >>>>> Hi,
> >>>>> 
> >>>>> 
> >>>>> - Original Message 
> >>>>>>  From: Walter  Underwood 
> >>>>> 
> >>>>>> On  Mar 9, 2011, at 9:02 AM, Otis Gospodnetic  wrote:
> >>>>>> 
> >>>>>>> You  mean  it's  not possible to have 2 masters that are in  nearly
> >>>>> real-time
> >

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Jake Luciani
Jason,

Its predecessor, Lucandra, did. But Solandra is a new approach that manages 
shards of documents across the cluster for you and uses Solr's distributed 
search to query indexes. 
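
To make the distributed search part concrete, a query just fans out over the
shards and the results get merged - roughly like this with SolrJ (the hostnames
are made up):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DistributedSearchSketch {
  public static void main(String[] args) throws Exception {
    // Any node can act as the aggregator for a distributed query.
    CommonsHttpSolrServer solr =
        new CommonsHttpSolrServer("http://search01:8983/solr");

    SolrQuery q = new SolrQuery("title:failover");
    // Fan the query out to every shard; Solr merges the per-shard results.
    q.set("shards", "search01:8983/solr,search02:8983/solr,search03:8983/solr");

    QueryResponse rsp = solr.query(q);
    System.out.println("total hits: " + rsp.getResults().getNumFound());
  }
}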

Jake

On Mar 9, 2011, at 5:15 PM, Jason Rutherglen  wrote:

> Doesn't Solandra partition by term instead of document?
> 
> On Wed, Mar 9, 2011 at 2:13 PM, Smiley, David W.  wrote:
>> I was just about to jump in this conversation to mention Solandra and go 
>> fig, Solandra's committer comes in. :-)   It was nice to meet you at Strata, 
>> Jake.
>> 
>> I haven't dug into the code yet but Solandra strikes me as a killer way to 
>> scale Solr. I'm looking forward to playing with it; particularly looking at 
>> disk requirements and performance measurements.
>> 
>> ~ David Smiley
>> 
>> On Mar 9, 2011, at 3:14 PM, Jake Luciani wrote:
>> 
>>> Hi Otis,
>>> 
>>> Have you considered using Solandra with Quorum writes
>>> to achieve master/master with CA semantics?
>>> 
>>> -Jake
>>> 
>>> 
>>> On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic wrote:
>>> 
>>>> Hi,
>>>> 
>>>>  Original Message 
>>>> 
>>>>> From: Robert Petersen 
>>>>> 
>>>>> Can't you skip the SAN and keep the indexes locally?  Then you  would
>>>>> have two redundant copies of the index and no lock issues.
>>>> 
>>>> I could, but then I'd have the issue of keeping them in sync, which seems
>>>> more
>>>> fragile.  I think SAN makes things simpler overall.
>>>> 
>>>>> Also, Can't master02 just be a slave to master01 (in the master farm  and
>>>>> separate from the slave farm) until such time as master01 fails?   Then
>>>> 
>>>> No, because it wouldn't be in sync.  It would always be N minutes behind,
>>>> and
>>>> when the primary master fails, the secondary would not have all the docs -
>>>> data
>>>> loss.
>>>> 
>>>>> master02 would start receiving the new documents with an  indexes
>>>>> complete up to the last replication at least and the other slaves  would
>>>>> be directed by LB to poll master02 also...
>>>> 
>>>> Yeah, "complete up to the last replication" is the problem.  It's a data
>>>> gap
>>>> that now needs to be filled somehow.
>>>> 
>>>> Otis
>>>> 
>>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>>> Lucene ecosystem search :: http://search-lucene.com/
>>>> 
>>>> 
>>>>> -Original  Message-
>>>>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
>>>>> Sent: Wednesday, March 09, 2011 9:47 AM
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject:  Re: True master-master fail-over without data gaps (choosing CA
>>>>> in  CAP)
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> 
>>>>> - Original Message 
>>>>>> From: Walter  Underwood 
>>>>> 
>>>>>> On  Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
>>>>>> 
>>>>>>> You mean  it's  not possible to have 2 masters that are in nearly
>>>>> real-time
>>>>>> sync?
>>>>>>> How  about with DRBD?  I know people use  DRBD to keep 2 Hadoop NNs
>>>>> (their
>>>>>> edit
>>>>>> 
>>>>>>> logs) in  sync to avoid the current NN SPOF, for example, so I'm
>>>>> thinking
>>>>>> this
>>>>>> 
>>>>>>> could be doable with Solr masters, too, no?
>>>>>> 
>>>>>> If you add fault-tolerant, you run into the CAP  Theorem.  Consistency,
>>>>> 
>>>>>> availability, partition: choose two. You cannot have  it  all.
>>>>> 
>>>>> Right, so I'll take Consistency and Availability, and I'll  put my 2
>>>>> masters in
>>>>> the same rack (which has redundant switches, power  supply, etc.) and
>>>>> thus
>>>>> minimize/avoid partitioning.
>>>>> Assuming the above  actually works, I think my Q remains:
>>>>> 
>>>>> How do you set up 2 Solr masters so  they are in n

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Jason Rutherglen
Doesn't Solandra partition by term instead of document?

On Wed, Mar 9, 2011 at 2:13 PM, Smiley, David W.  wrote:
> I was just about to jump in this conversation to mention Solandra and go fig, 
> Solandra's committer comes in. :-)   It was nice to meet you at Strata, Jake.
>
> I haven't dug into the code yet but Solandra strikes me as a killer way to 
> scale Solr. I'm looking forward to playing with it; particularly looking at 
> disk requirements and performance measurements.
>
> ~ David Smiley
>
> On Mar 9, 2011, at 3:14 PM, Jake Luciani wrote:
>
>> Hi Otis,
>>
>> Have you considered using Solandra with Quorum writes
>> to achieve master/master with CA semantics?
>>
>> -Jake
>>
>>
>> On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic wrote:
>>
>>> Hi,
>>>
>>>  Original Message 
>>>
>>>> From: Robert Petersen 
>>>>
>>>> Can't you skip the SAN and keep the indexes locally?  Then you  would
>>>> have two redundant copies of the index and no lock issues.
>>>
>>> I could, but then I'd have the issue of keeping them in sync, which seems
>>> more
>>> fragile.  I think SAN makes things simpler overall.
>>>
>>>> Also, Can't master02 just be a slave to master01 (in the master farm  and
>>>> separate from the slave farm) until such time as master01 fails?   Then
>>>
>>> No, because it wouldn't be in sync.  It would always be N minutes behind,
>>> and
>>> when the primary master fails, the secondary would not have all the docs -
>>> data
>>> loss.
>>>
>>>> master02 would start receiving the new documents with an  indexes
>>>> complete up to the last replication at least and the other slaves  would
>>>> be directed by LB to poll master02 also...
>>>
>>> Yeah, "complete up to the last replication" is the problem.  It's a data
>>> gap
>>> that now needs to be filled somehow.
>>>
>>> Otis
>>> 
>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>> Lucene ecosystem search :: http://search-lucene.com/
>>>
>>>
>>>> -Original  Message-
>>>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
>>>> Sent: Wednesday, March 09, 2011 9:47 AM
>>>> To: solr-user@lucene.apache.org
>>>> Subject:  Re: True master-master fail-over without data gaps (choosing CA
>>>> in  CAP)
>>>>
>>>> Hi,
>>>>
>>>>
>>>> - Original Message 
>>>>> From: Walter  Underwood 
>>>>
>>>>> On  Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
>>>>>
>>>>>> You mean  it's  not possible to have 2 masters that are in nearly
>>>> real-time
>>>>> sync?
>>>>>> How  about with DRBD?  I know people use  DRBD to keep 2 Hadoop NNs
>>>> (their
>>>>> edit
>>>>>
>>>>>> logs) in  sync to avoid the current NN SPOF, for example, so I'm
>>>> thinking
>>>>> this
>>>>>
>>>>>> could be doable with Solr masters, too, no?
>>>>>
>>>>> If you add fault-tolerant, you run into the CAP  Theorem.  Consistency,
>>>>
>>>>> availability, partition: choose two. You cannot have  it  all.
>>>>
>>>> Right, so I'll take Consistency and Availability, and I'll  put my 2
>>>> masters in
>>>> the same rack (which has redundant switches, power  supply, etc.) and
>>>> thus
>>>> minimize/avoid partitioning.
>>>> Assuming the above  actually works, I think my Q remains:
>>>>
>>>> How do you set up 2 Solr masters so  they are in near real-time sync?
>>>> DRBD?
>>>>
>>>> But here is maybe a simpler  scenario that more people may be
>>>> considering:
>>>>
>>>> Imagine 2 masters on 2  different servers in 1 rack, pointing to the same
>>>> index
>>>> on the shared  storage (SAN) that also happens to live in the same rack.
>>>> 2 Solr masters are  behind 1 LB VIP that indexer talks to.
>>>> The VIP is configured so that all  requests always get routed to the
>>>> primary
>>>> master (because only 1 master  can be modifying an index at a time),
>>>> except when

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Smiley, David W.
I was just about to jump in this conversation to mention Solandra and go fig, 
Solandra's committer comes in. :-)   It was nice to meet you at Strata, Jake.

I haven't dug into the code yet but Solandra strikes me as a killer way to 
scale Solr. I'm looking forward to playing with it; particularly looking at 
disk requirements and performance measurements.

~ David Smiley

On Mar 9, 2011, at 3:14 PM, Jake Luciani wrote:

> Hi Otis,
> 
> Have you considered using Solandra with Quorum writes
> to achieve master/master with CA semantics?
> 
> -Jake
> 
> 
> On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic wrote:
> 
>> Hi,
>> 
>>  Original Message 
>> 
>>> From: Robert Petersen 
>>> 
>>> Can't you skip the SAN and keep the indexes locally?  Then you  would
>>> have two redundant copies of the index and no lock issues.
>> 
>> I could, but then I'd have the issue of keeping them in sync, which seems
>> more
>> fragile.  I think SAN makes things simpler overall.
>> 
>>> Also, Can't master02 just be a slave to master01 (in the master farm  and
>>> separate from the slave farm) until such time as master01 fails?   Then
>> 
>> No, because it wouldn't be in sync.  It would always be N minutes behind,
>> and
>> when the primary master fails, the secondary would not have all the docs -
>> data
>> loss.
>> 
>>> master02 would start receiving the new documents with an  indexes
>>> complete up to the last replication at least and the other slaves  would
>>> be directed by LB to poll master02 also...
>> 
>> Yeah, "complete up to the last replication" is the problem.  It's a data
>> gap
>> that now needs to be filled somehow.
>> 
>> Otis
>> 
>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>> Lucene ecosystem search :: http://search-lucene.com/
>> 
>> 
>>> -Original  Message-
>>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
>>> Sent: Wednesday, March 09, 2011 9:47 AM
>>> To: solr-user@lucene.apache.org
>>> Subject:  Re: True master-master fail-over without data gaps (choosing CA
>>> in  CAP)
>>> 
>>> Hi,
>>> 
>>> 
>>> - Original Message 
>>>> From: Walter  Underwood 
>>> 
>>>> On  Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
>>>> 
>>>>> You mean  it's  not possible to have 2 masters that are in nearly
>>> real-time
>>>> sync?
>>>>> How  about with DRBD?  I know people use  DRBD to keep 2 Hadoop NNs
>>> (their
>>>> edit
>>>> 
>>>>> logs) in  sync to avoid the current NN SPOF, for example, so I'm
>>> thinking
>>>> this
>>>> 
>>>>> could be doable with Solr masters, too, no?
>>>> 
>>>> If you add fault-tolerant, you run into the CAP  Theorem.  Consistency,
>>> 
>>>> availability, partition: choose two. You cannot have  it  all.
>>> 
>>> Right, so I'll take Consistency and Availability, and I'll  put my 2
>>> masters in
>>> the same rack (which has redundant switches, power  supply, etc.) and
>>> thus
>>> minimize/avoid partitioning.
>>> Assuming the above  actually works, I think my Q remains:
>>> 
>>> How do you set up 2 Solr masters so  they are in near real-time sync?
>>> DRBD?
>>> 
>>> But here is maybe a simpler  scenario that more people may be
>>> considering:
>>> 
>>> Imagine 2 masters on 2  different servers in 1 rack, pointing to the same
>>> index
>>> on the shared  storage (SAN) that also happens to live in the same rack.
>>> 2 Solr masters are  behind 1 LB VIP that indexer talks to.
>>> The VIP is configured so that all  requests always get routed to the
>>> primary
>>> master (because only 1 master  can be modifying an index at a time),
>>> except when
>>> this primary is down,  in which case the requests are sent to the
>>> secondary
>>> master.
>>> 
>>> So in  this case my Q is around automation of this, around Lucene index
>>> locks,
>>> around the need for manual intervention, and such.
>>> Concretely, if you  have these 2 master instances, the primary master has
>>> the
>>> Lucene index  lock in the index dir.  When the secondary master needs to
>>> take
>>> over  (i.e., when i

Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Jake Luciani
Hi Otis,

Have you considered using Solandra with Quorum writes
to achieve master/master with CA semantics?
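
Roughly speaking: with a replication factor of N = 3, quorum writes and quorum
reads each touch W = R = 2 replicas, and since W + R > N every quorum read
overlaps at least one replica that saw the latest quorum write. So you keep
consistency, and you stay available for writes as long as 2 of the 3 replicas
are reachable - no single master to fail over from.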

-Jake


On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic  wrote:

> Hi,
>
>  Original Message 
>
> > From: Robert Petersen 
> >
> > Can't you skip the SAN and keep the indexes locally?  Then you  would
> > have two redundant copies of the index and no lock issues.
>
> I could, but then I'd have the issue of keeping them in sync, which seems
> more
> fragile.  I think SAN makes things simpler overall.
>
> > Also, Can't master02 just be a slave to master01 (in the master farm  and
> > separate from the slave farm) until such time as master01 fails?   Then
>
> No, because it wouldn't be in sync.  It would always be N minutes behind,
> and
> when the primary master fails, the secondary would not have all the docs -
> data
> loss.
>
> > master02 would start receiving the new documents with an  indexes
> > complete up to the last replication at least and the other slaves  would
> > be directed by LB to poll master02 also...
>
> Yeah, "complete up to the last replication" is the problem.  It's a data
> gap
> that now needs to be filled somehow.
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
> > -Original  Message-----
> > From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> > Sent: Wednesday, March 09, 2011 9:47 AM
> > To: solr-user@lucene.apache.org
> > Subject:  Re: True master-master fail-over without data gaps (choosing CA
> > in  CAP)
> >
> > Hi,
> >
> >
> > - Original Message 
> > > From: Walter  Underwood 
> >
> > > On  Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
> > >
> > > > You mean  it's  not possible to have 2 masters that are in nearly
> > real-time
> > >sync?
> > > > How  about with DRBD?  I know people use  DRBD to keep 2 Hadoop NNs
> > (their
> > >edit
> > >
> > > > logs) in  sync to avoid the current NN SPOF, for example, so I'm
> > thinking
> > >this
> > >
> > > > could be doable with Solr masters, too, no?
> > >
> > > If you add fault-tolerant, you run into the CAP  Theorem.  Consistency,
> >
> > >availability, partition: choose two. You cannot have  it  all.
> >
> > Right, so I'll take Consistency and Availability, and I'll  put my 2
> > masters in
> > the same rack (which has redundant switches, power  supply, etc.) and
> > thus
> > minimize/avoid partitioning.
> > Assuming the above  actually works, I think my Q remains:
> >
> > How do you set up 2 Solr masters so  they are in near real-time sync?
> > DRBD?
> >
> > But here is maybe a simpler  scenario that more people may be
> > considering:
> >
> > Imagine 2 masters on 2  different servers in 1 rack, pointing to the same
> > index
> > on the shared  storage (SAN) that also happens to live in the same rack.
> > 2 Solr masters are  behind 1 LB VIP that indexer talks to.
> > The VIP is configured so that all  requests always get routed to the
> > primary
> > master (because only 1 master  can be modifying an index at a time),
> > except when
> > this primary is down,  in which case the requests are sent to the
> > secondary
> > master.
> >
> > So in  this case my Q is around automation of this, around Lucene index
> > locks,
> > around the need for manual intervention, and such.
> > Concretely, if you  have these 2 master instances, the primary master has
> > the
> > Lucene index  lock in the index dir.  When the secondary master needs to
> > take
> > over  (i.e., when it starts receiving documents via LB), it needs to be
> > able to
> > write to that same index.  But what if that lock is still around?   One
> > could use
> > the Native lock to make the lock disappear if the primary  master's JVM
> > exited
> > unexpectedly, and in that case everything *should*  work and be
> > completely
> > transparent, right?  That is, the secondary  will start getting new docs,
> > it will
> > use its IndexWriter to write to that  same shared index, which won't be
> > locked
> > for writes because the lock is  gone, and everyone will be happy.  Did I
> > miss
> > something important  here?
> >
> > Assuming the above is correct, what if the lock is *not* gone  because
> > the
> > primary master's JVM is actually not dead, although maybe  unresponsive,
> > so LB
> > thinks the primary master is dead.  Then the LB  will route indexing
> > requests to
> > the secondary master, which will attempt  to write to the index, but be
> > denied
> > because of the lock.  So a  human needs to jump in, remove the lock, and
> > manually
> > reindex failed docs  if the upstream component doesn't buffer docs that
> > failed to
> > get indexed  and doesn't retry indexing them automatically.  Is this
> > correct or
> > is there a way to avoid humans  here?
> >
> > Thanks,
> > Otis
> > 
> > Sematext :: http://sematext.com/ :: Solr -  Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
>



-- 
http://twitter.com/tjake


Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Otis Gospodnetic
Hi,

 Original Message 

> From: Robert Petersen 
>
> Can't you skip the SAN and keep the indexes locally?  Then you  would
> have two redundant copies of the index and no lock issues.  

I could, but then I'd have the issue of keeping them in sync, which seems more 
fragile.  I think SAN makes things simpler overall.
 
> Also, Can't master02 just be a slave to master01 (in the master farm  and
> separate from the slave farm) until such time as master01 fails?   Then

No, because it wouldn't be in sync.  It would always be N minutes behind, and 
when the primary master fails, the secondary would not have all the docs - data 
loss.
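
(To put a rough number on it: with, say, a 5-minute replication poll and a few
hundred docs indexed per minute, a badly timed fail-over leaves master02 missing
on the order of a thousand documents.)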

> master02 would start receiving the new documents with an  indexes
> complete up to the last replication at least and the other slaves  would
> be directed by LB to poll master02 also...

Yeah, "complete up to the last replication" is the problem.  It's a data gap 
that now needs to be filled somehow.

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


> -Original  Message-
> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
> Sent: Wednesday, March 09, 2011 9:47 AM
> To: solr-user@lucene.apache.org
> Subject:  Re: True master-master fail-over without data gaps (choosing CA
> in  CAP)
> 
> Hi,
> 
> 
> - Original Message 
> > From: Walter  Underwood 
> 
> > On  Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
> > 
> > > You mean  it's  not possible to have 2 masters that are in nearly
> real-time 
> >sync?
> > > How  about with DRBD?  I know people use  DRBD to keep 2 Hadoop NNs
> (their 
> >edit 
> >
> > > logs) in  sync to avoid the current NN SPOF, for example, so I'm
> thinking 
> >this 
> >
> > > could be doable with Solr masters, too, no?
> > 
> > If you add fault-tolerant, you run into the CAP  Theorem.  Consistency,
> 
> >availability, partition: choose two. You cannot have  it  all.
> 
> Right, so I'll take Consistency and Availability, and I'll  put my 2
> masters in 
> the same rack (which has redundant switches, power  supply, etc.) and
> thus 
> minimize/avoid partitioning.
> Assuming the above  actually works, I think my Q remains:
> 
> How do you set up 2 Solr masters so  they are in near real-time sync?
> DRBD?
> 
> But here is maybe a simpler  scenario that more people may be
> considering:
> 
> Imagine 2 masters on 2  different servers in 1 rack, pointing to the same
> index 
> on the shared  storage (SAN) that also happens to live in the same rack.
> 2 Solr masters are  behind 1 LB VIP that indexer talks to.
> The VIP is configured so that all  requests always get routed to the
> primary 
> master (because only 1 master  can be modifying an index at a time),
> except when 
> this primary is down,  in which case the requests are sent to the
> secondary 
> master.
> 
> So in  this case my Q is around automation of this, around Lucene index
> locks, 
> around the need for manual intervention, and such.
> Concretely, if you  have these 2 master instances, the primary master has
> the 
> Lucene index  lock in the index dir.  When the secondary master needs to
> take 
> over  (i.e., when it starts receiving documents via LB), it needs to be
> able to 
> write to that same index.  But what if that lock is still around?   One
> could use 
> the Native lock to make the lock disappear if the primary  master's JVM
> exited 
> unexpectedly, and in that case everything *should*  work and be
> completely 
> transparent, right?  That is, the secondary  will start getting new docs,
> it will 
> use its IndexWriter to write to that  same shared index, which won't be
> locked 
> for writes because the lock is  gone, and everyone will be happy.  Did I
> miss 
> something important  here?
> 
> Assuming the above is correct, what if the lock is *not* gone  because
> the 
> primary master's JVM is actually not dead, although maybe  unresponsive,
> so LB 
> thinks the primary master is dead.  Then the LB  will route indexing
> requests to 
> the secondary master, which will attempt  to write to the index, but be
> denied 
> because of the lock.  So a  human needs to jump in, remove the lock, and
> manually 
> reindex failed docs  if the upstream component doesn't buffer docs that
> failed to 
> get indexed  and doesn't retry indexing them automatically.  Is this
> correct or 
> is there a way to avoid humans  here?
> 
> Thanks,
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr -  Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
> 


RE: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Robert Petersen
Can't you skip the SAN and keep the indexes locally?  Then you would
have two redundant copies of the index and no lock issues.  

Also, can't master02 just be a slave to master01 (in the master farm and
separate from the slave farm) until such time as master01 fails?  Then
master02 would start receiving the new documents with an index complete
up to the last replication at least, and the other slaves would be
directed by the LB to poll master02 also...

-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
Sent: Wednesday, March 09, 2011 9:47 AM
To: solr-user@lucene.apache.org
Subject: Re: True master-master fail-over without data gaps (choosing CA
in CAP)

Hi,

 
- Original Message 
> From: Walter Underwood 

> On Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
> 
> > You mean it's  not possible to have 2 masters that are in nearly
real-time 
>sync?
> > How  about with DRBD?  I know people use DRBD to keep 2 Hadoop NNs
(their 
>edit 
>
> > logs) in sync to avoid the current NN SPOF, for example, so I'm
thinking 
>this 
>
> > could be doable with Solr masters, too, no?
> 
> If you add fault-tolerant, you run into the CAP  Theorem. Consistency,

>availability, partition: choose two. You cannot have it  all.

Right, so I'll take Consistency and Availability, and I'll put my 2
masters in 
the same rack (which has redundant switches, power supply, etc.) and
thus 
minimize/avoid partitioning.
Assuming the above actually works, I think my Q remains:

How do you set up 2 Solr masters so they are in near real-time sync?
DRBD?

But here is maybe a simpler scenario that more people may be
considering:

Imagine 2 masters on 2 different servers in 1 rack, pointing to the same
index 
on the shared storage (SAN) that also happens to live in the same rack.
2 Solr masters are behind 1 LB VIP that indexer talks to.
The VIP is configured so that all requests always get routed to the
primary 
master (because only 1 master can be modifying an index at a time),
except when 
this primary is down, in which case the requests are sent to the
secondary 
master.

So in this case my Q is around automation of this, around Lucene index
locks, 
around the need for manual intervention, and such.
Concretely, if you have these 2 master instances, the primary master has
the 
Lucene index lock in the index dir.  When the secondary master needs to
take 
over (i.e., when it starts receiving documents via LB), it needs to be
able to 
write to that same index.  But what if that lock is still around?  One
could use 
the Native lock to make the lock disappear if the primary master's JVM
exited 
unexpectedly, and in that case everything *should* work and be
completely 
transparent, right?  That is, the secondary will start getting new docs,
it will 
use its IndexWriter to write to that same shared index, which won't be
locked 
for writes because the lock is gone, and everyone will be happy.  Did I
miss 
something important here?

Assuming the above is correct, what if the lock is *not* gone because
the 
primary master's JVM is actually not dead, although maybe unresponsive,
so LB 
thinks the primary master is dead.  Then the LB will route indexing
requests to 
the secondary master, which will attempt to write to the index, but be
denied 
because of the lock.  So a human needs to jump in, remove the lock, and
manually 
reindex failed docs if the upstream component doesn't buffer docs that
failed to 
get indexed and doesn't retry indexing them automatically.  Is this
correct or 
is there a way to avoid humans here?

Thanks,
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


Re: True master-master fail-over without data gaps (choosing CA in CAP)

2011-03-09 Thread Otis Gospodnetic
Hi,

 
- Original Message 
> From: Walter Underwood 

> On Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
> 
> > You mean it's  not possible to have 2 masters that are in nearly real-time 
>sync?
> > How  about with DRBD?  I know people use DRBD to keep 2 Hadoop NNs (their 
>edit 
>
> > logs) in sync to avoid the current NN SPOF, for example, so I'm  thinking 
>this 
>
> > could be doable with Solr masters, too, no?
> 
> If you add fault-tolerance, you run into the CAP Theorem. Consistency, 
> availability, partition: choose two. You cannot have it all.

Right, so I'll take Consistency and Availability, and I'll put my 2 masters in 
the same rack (which has redundant switches, power supply, etc.) and thus 
minimize/avoid partitioning.
Assuming the above actually works, I think my Q remains:

How do you set up 2 Solr masters so they are in near real-time sync?  DRBD?

But here is maybe a simpler scenario that more people may be considering:

Imagine 2 masters on 2 different servers in 1 rack, pointing to the same index 
on the shared storage (SAN) that also happens to live in the same rack.
2 Solr masters are behind 1 LB VIP that indexer talks to.
The VIP is configured so that all requests always get routed to the primary 
master (because only 1 master can be modifying an index at a time), except when 
this primary is down, in which case the requests are sent to the secondary 
master.

So in this case my Q is around automation of this, around Lucene index locks, 
around the need for manual intervention, and such.
Concretely, if you have these 2 master instances, the primary master has the 
Lucene index lock in the index dir.  When the secondary master needs to take 
over (i.e., when it starts receiving documents via LB), it needs to be able to 
write to that same index.  But what if that lock is still around?  One could 
use 
the Native lock to make the lock disappear if the primary master's JVM exited 
unexpectedly, and in that case everything *should* work and be completely 
transparent, right?  That is, the secondary will start getting new docs, it 
will 
use its IndexWriter to write to that same shared index, which won't be locked 
for writes because the lock is gone, and everyone will be happy.  Did I miss 
something important here?
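
To make the lock part concrete, here is roughly what the secondary would be
dealing with at the Lucene level (the SAN path is made up; in Solr this
corresponds to using the native lockType):

import java.io.File;

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.NativeFSLockFactory;

public class SharedIndexLockCheck {
  public static void main(String[] args) throws Exception {
    // Shared index directory on the SAN (path is made up).
    FSDirectory dir = FSDirectory.open(new File("/mnt/san/solr/data/index"));

    // Native (OS-level) locks are released when the process holding them dies,
    // which is what the fail-over scenario above relies on.
    dir.setLockFactory(new NativeFSLockFactory());

    if (IndexWriter.isLocked(dir)) {
      // The primary's JVM is (or at least appears to be) alive and still holds
      // the write lock.  This is the case where a human has to decide whether
      // it is safe to break it, e.g. with IndexWriter.unlock(dir).
      System.out.println("Index is still write-locked by another process.");
    } else {
      System.out.println("No write lock - the secondary can open its IndexWriter.");
    }
  }
}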

Assuming the above is correct, what if the lock is *not* gone because the 
primary master's JVM is not actually dead, just unresponsive, so the LB 
thinks the primary master is dead?  Then the LB will route indexing requests to 
the secondary master, which will attempt to write to the index, but be denied 
because of the lock.  So a human needs to jump in, remove the lock, and 
manually 
reindex failed docs if the upstream component doesn't buffer docs that failed 
to 
get indexed and doesn't retry indexing them automatically.  Is this correct or 
is there a way to avoid humans here?
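
And to be clear about what I mean by the upstream component buffering and
retrying, something as simple as this rough SolrJ sketch would do (the VIP
hostname is made up; a real indexer would persist the buffer and cap the
retries):

import java.util.ArrayDeque;
import java.util.Deque;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BufferingIndexer {
  public static void main(String[] args) throws Exception {
    // The LB VIP sitting in front of master01/master02 (URL is made up).
    SolrServer master = new CommonsHttpSolrServer("http://master-vip:8983/solr");

    // Anything that fails stays in this buffer instead of being lost;
    // a real indexer would persist it and limit the retry attempts.
    Deque<SolrInputDocument> pending = new ArrayDeque<SolrInputDocument>();

    for (int i = 0; i < 100; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "doc-" + i);
      doc.addField("title", "example " + i);
      pending.addLast(doc);
    }

    while (!pending.isEmpty()) {
      SolrInputDocument doc = pending.peekFirst();
      try {
        master.add(doc);
        pending.removeFirst();
      } catch (Exception e) {
        // Master down or index locked: wait for fail-over and try again,
        // so no documents fall into the gap.
        System.err.println("Indexing failed, retrying: " + e.getMessage());
        Thread.sleep(5000);
      }
    }
    master.commit();
  }
}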

Thanks,
Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/