Re: True master-master fail-over without data gaps

Jonathan Rochkind Wed, 09 Mar 2011 09:42:16 -0800

On 3/9/2011 12:05 PM, Otis Gospodnetic wrote:

But check this! In some cases one is not allowed to save content to disk (think
copyrights).  I'm not making this up - we actually have a customer with this
"cannot save to disk" (but can index) requirement.

Do they realize that a Solr index is on disk, and if you save it to a Solr index it's being saved to disk? If they prohibited you from putting the doc in a stored field in Solr, I guess that would at least be somewhat consistent, although annoying.

But I don't think it's our customers' job to tell us HOW to implement our software to get the results they want. They can certainly make you promise not to distribute or use copyrighted material, and they can even ask to see your security procedures to make sure it doesn't get out. But if you need to buffer documents to achieve the application they want, but they won't let you... Solr can't help you with that.

As I suggested before though, I might rather buffer to a NoSQL store like MongoDB or CouchDB instead of actually to disk. Perhaps your customer won't notice those stores keep data on disk just like they haven't noticed Solr does. I am not an expert in various kinds of NoSQL stores, but I think some of them in fact specialize in the area of concern here: Absolute failover reliability through replication.
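
To make the buffering idea concrete, here is a rough sketch of what I have in mind, not a tested design. Everything in it is hypothetical: the ReplayBuffer class, the plain dict standing in for a replicated store, and the index callables standing in for whatever client sends documents to a master. It also assumes that re-sending a document with the same id is harmless, i.e. that the indexer overwrites by unique key.

```python
from collections import OrderedDict


class ReplayBuffer:
    """Holds each document until an indexer acknowledges it (hypothetical helper)."""

    def __init__(self, store=None):
        # Stand-in for a replicated store; a real setup would swap this
        # in-memory mapping for a client of whatever store is chosen.
        self.pending = OrderedDict() if store is None else store

    def submit(self, doc_id, doc, index):
        # Record the document first, so a failure while indexing cannot lose it.
        self.pending[doc_id] = doc
        return self._try_index(doc_id, index)

    def replay(self, index):
        # After fail-over, resend everything that was never acknowledged.
        for doc_id in list(self.pending):
            self._try_index(doc_id, index)

    def _try_index(self, doc_id, index):
        try:
            index(doc_id, self.pending[doc_id])
        except ConnectionError:
            return False  # keep the document buffered for a later replay
        del self.pending[doc_id]  # acknowledged, safe to drop
        return True
```

The point is only the ordering: write to the buffer, then index, then delete on acknowledgement. When the primary goes away, you call replay() against the secondary and nothing that reached the buffer is lost.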


Solr is not a store.

So buffering to disk is not an option, and buffering in memory is not practical
because of the input document rate and their size.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

From: Otis Gospodnetic [otis_gospodne...@yahoo.com]
Sent: Tuesday, March 08, 2011 11:45 PM
To: solr-user@lucene.apache.org
Subject: True master-master fail-over without data gaps

Hello,

What are some common or good ways to handle indexing (master) fail-over?
Imagine you have a continuous stream of incoming documents that you have to
index without losing any of them (or with losing as few of them as possible).
How do you set up your masters?
In other words, you can't just have 2 masters where the secondary is the
Repeater (or Slave) of the primary master and replicates the index
periodically:
you need to have 2 masters that are in sync at all times!
How do you achieve that?

* Do you just put N masters behind a LB VIP, configure them both to point to the
index on some shared storage (e.g. SAN), and count on the LB to fail-over to the
secondary master when the primary becomes unreachable?
If so, how do you deal with index locks? You use the Native lock and count on
it disappearing when the primary master goes down? That means you count on the
whole JVM process dying, which may not be the case...

* Or do you use tools like DRBD, Corosync, Pacemaker, etc. to keep 2 masters
with 2 separate indices in sync, while making sure you write to only 1 of them
via LB VIP or otherwise?

* Or ...


This thread is on a similar topic, but is inconclusive:
   http://search-lucene.com/m/aOsyN15f1qd1

Here is another similar thread, but this one doesn't cover how 2 masters are
kept in sync at all times:
   http://search-lucene.com/m/aOsyN15f1qd1

Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
