On Tue, Jan 12, 2010 at 8:15 PM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote:
> John, you should have a look at Zoie. I just finished adding LinkedIn's
> case study about Zoie to Lucene in Action 2, so this is fresh in my mind. :)

Yep, Zoie ( http://zoie.googlecode.com ) will handle the server restart part. Yes, you lose what is in RAM, but Zoie keeps track of an "index version" on disk alongside the Lucene index, which it uses to decide where it must reindex from to "catch up" if there have been incoming indexing events while the server was out of commission.

Zoie does not support multiple servers using the same index, because each Zoie instance has IndexWriter instances, and you'll get locking problems trying to do that. You could have one Zoie instance effectively as the "master/writer/realtime reader", and a bunch of raw Lucene "slaves" which read off of that index, but as you say, they could not get access to the RAMDirectory information until it was flushed to disk.

Why do you need a "cluster" of servers hitting the same index? Are they different applications (with different search logic, so they need to be different instances), or is it just to try and utilize your hardware efficiently? If it's for performance reasons, you might find you get better use of your CPU cores by just sharding your one index into smaller ones, each having its own Zoie instance, and putting a "broker" on top of them that searches across all of them and mergesorts the results.

Often even this isn't necessary, because Zoie opens the disk-backed IndexReader in readonly mode, so all the synchronized blocks are gone, and a single Zoie instance will easily saturate your CPU cores through the ordinary multi-threading done by your appserver.
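To make the "broker" idea concrete, here is a minimal sketch of the merge step only, assuming each shard has already returned its own hits sorted by descending score. ShardBroker, Hit, and mergeTopN are hypothetical names made up for illustration; none of this is Zoie or Lucene API, and a real broker would also search the shards in parallel and handle ties and paging.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of a broker's merge step; not Zoie or Lucene API.
final class ShardBroker {

    /** One search hit: the shard it came from, a shard-local doc id, and its score. */
    static final class Hit {
        final int shard;
        final int doc;
        final float score;

        Hit(int shard, int doc, float score) {
            this.shard = shard;
            this.doc = doc;
            this.score = score;
        }
    }

    /** Read position within one shard's result list. */
    private static final class Cursor {
        final List<Hit> hits;
        int pos = 0;

        Cursor(List<Hit> hits) {
            this.hits = hits;
        }

        Hit head() {
            return hits.get(pos);
        }
    }

    /**
     * K-way merge of per-shard results into the global top n.
     * Assumes every inner list is already sorted by descending score.
     */
    static List<Hit> mergeTopN(List<List<Hit>> perShard, int n) {
        // The cursor whose current head has the highest score is polled first.
        PriorityQueue<Cursor> queue = new PriorityQueue<>(
                (a, b) -> Float.compare(b.head().score, a.head().score));
        for (List<Hit> hits : perShard) {
            if (!hits.isEmpty()) {
                queue.add(new Cursor(hits));
            }
        }

        List<Hit> merged = new ArrayList<>();
        while (merged.size() < n && !queue.isEmpty()) {
            Cursor best = queue.poll();
            merged.add(best.head());
            best.pos++;
            // Re-queue the shard only if it still has hits left.
            if (best.pos < best.hits.size()) {
                queue.add(best);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Hit>> perShard = new ArrayList<>();
        perShard.add(Arrays.asList(new Hit(0, 7, 0.9f), new Hit(0, 3, 0.4f)));
        perShard.add(Arrays.asList(new Hit(1, 2, 0.8f), new Hit(1, 5, 0.1f)));
        for (Hit h : mergeTopN(perShard, 3)) {
            System.out.println("shard=" + h.shard + " doc=" + h.doc + " score=" + h.score);
        }
    }
}
```

Note that scores are only comparable across shards if every shard scores documents the same way, which is an assumption of this sketch and something a real sharded setup has to arrange for.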
If you really needed to do many different kinds of writes (from different applications), and also have applications not involved in the writing see those writes in real time, then you could still do it with Zoie, but it would take some interesting architectural juggling. You would write your own StreamDataProvider class which takes input from a variety of sources and merges them together to feed one Zoie instance, then put a broker on top of Zoie which serves out IndexReaders to the different applications living on top, which can wrap them up in their own business logic as they see fit... as long as it is ok to have all the applications in the same JVM, of course.

  -jake

> Otis
> --
> Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
>
> ----- Original Message ----
> > From: jchang <jchangkihat...@gmail.com>
> > To: java-dev@lucene.apache.org
> > Sent: Tue, January 12, 2010 6:10:56 PM
> > Subject: Lucene 2.9.0 Near Real Time Indexing and Service Crashes/restarts
> >
> > Lucene 2.9.0 has near real time indexing, writing to a RAMDir which gets
> > flushed to disk when you do a search.
> >
> > Does anybody know how this works out with service restarts (both orderly
> > shutdown and a crash)? If the service goes down while indexed items are in
> > RAMDir but not on disk, are they lost? Or is there some kind of log
> > recovery?
> >
> > Also, does anybody know the impact of this with clustered lucene servers?
> > If you have numerous servers running off one index, I assume there is no way
> > for the other services to pick up the newly indexed items until they are
> > flushed to disk, correct? I'd be happy if that is not so, but I suspect it
> > is so.
> >
> > Thanks,
> > John
> > --
> > View this message in context:
> > http://old.nabble.com/Lucene-2.9.0-Near-Real-Time-Indexing-and-Service-Crashes-restarts-tp27136539p27136539.html
> > Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-dev-h...@lucene.apache.org