Hi Erick! Nice to hear from you again! From time to time my interest in these "Lucene things" returns and I do some experiments :p
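On the MMapDirectory suggestion quoted below: as I understand it, MMapDirectory relies on the OS mapping the index files into virtual memory, so the bytes live in the filesystem buffer cache rather than on the Java heap. Here is a minimal JDK-only sketch of that mechanism. It is not Lucene code, and the class and method names (MappedReadSketch, writeAndMapBack) are made up for illustration:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; nothing here comes from Lucene itself.
public class MappedReadSketch {

    // Writes the payload to a temp file, maps that file read-only,
    // and decodes the mapped bytes back into a String.
    static String writeAndMapBack(String payload) throws IOException {
        Path file = Files.createTempFile("mmap-sketch", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, payload.getBytes(StandardCharsets.UTF_8));

        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping itself is off-heap: the OS pages the file in on
            // first access and keeps those pages in its buffer cache.
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

            // Copying into a heap array is only done here so the result can
            // be returned; a real reader would read from the buffer directly.
            byte[] bytes = new byte[mapped.remaining()];
            mapped.get(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeAndMapBack("hello index"));
    }
}
```

The point of the sketch is only that the mapped region is not counted against the heap, which matches the GC argument Steven makes; whether the pages actually stay resident is up to the OS.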
Just to add to this conversation, I found an interesting post on Mike's blog about memory-resident indexes (using another virtual machine):
http://blog.mikemccandless.com/2012/07/lucene-index-in-ram-with-azuls-zing-jvm.html

Also (this is not exactly what I asked, but it seems related), there is a Google Summer of Code project to build a memory resident term resident:
http://www.google-melange.com/gsoc/project/google/gsoc2013/billybob/42001

Thanks
Emmanuel

2013/7/1 Erick Erickson <erickerick...@gmail.com>:
> Hey Emma! It's been a while....
>
> Building on what Steven said, here's Uwe's blog on
> MMapDirectory and Lucene:
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> I've always considered RAMDirectory for rather restricted
> use-cases. I.e. if I know without doubt that the index
> is both relatively static and bounded. The other use I've
> seen is to use it to index single documents on-the-fly for
> some reason (say complex processing of a single result)
> then throw it out afterwards.
>
> How are things going?
>
> Erick
>
>
> On Fri, Jun 28, 2013 at 5:36 PM, Steven Schlansker <ste...@likeness.com> wrote:
>
>>
>> On Jun 28, 2013, at 2:29 PM, Emmanuel Espina <espinaemman...@gmail.com>
>> wrote:
>>
>> > I'm building a distributed index (mostly as a research project for
>> > school) and I'm evaluating indexing the entire collection in memory
>> > (like google, facebook and others have done years ago). The obvious
>> > reason for this is performance considering that the replication will
>> > give me a reasonably good durability of the data (despite being in
>> > volatile memory).
>> >
>> > What is the current status of Lucene for this kind of indexes?
>> > RAMDirectory in its documentation has a scary warning that says that
>> > "is not intended to work with huge indexes", and that sounds more like
>> > it is an implementation for testing rather than something for
>> > production.
>> >
>> > Of course there is no real context for this question, because it is a
>> > research topic. Testing its limits would be the closest to a context
>> > I have :p
>>
>> You could consider MMapDirectory, which will end up putting the active
>> portions of the index in memory (via the filesystem buffer cache).
>>
>> The benefit is that you don't completely destroy the Java heap
>> (RAMDirectory causes immense GC pressure if you are not careful)
>> and you don't have to commit all of your RAM to index usage all the time.
>>
>> The downside is that if your working set exceeds the amount of RAM
>> available for buffer cache, you will get silent performance degradation as
>> you fall back to disk reads for the missing blocks.
>>
>> Maybe this is OK for your use case, maybe not.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org