Another potential idea would be to break up the index into N indices such that each index is small enough to fit two in memory and then you can swap them. http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/index/MultiReader.html This is just an idea, I haven't tried it.
My current solution for the warm up problem is that I use "cat /path/to/index/* > /dev/null" to force the new index into the disk cache. Kind of lame but this works for getting a new index into disk cache if you don't have enough RAM to fit two indices in RAM for the swap. I'll also mention that I remove the box from the load balancer while this disk cache warming is occurring. M On Wed, Jun 10, 2009 at 9:17 AM, Diamond, Greg<gdiam...@harryfox.com> wrote: > Thanks for the responses. I am testing it out using MMapDirectory. > > Cheers! > > -----Original Message----- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Wednesday, June 10, 2009 6:36 AM > To: java-user@lucene.apache.org > Subject: RE: Reloading RAM Directory from updated FS Directory > > > There is currently a patch/idea from Earwin around that modifies > MMapDirectory to optionally call MappedByteBuffer.load() after mapping a > file from the directory. MappedByteBuffer.load() tells the operating system > kernel to try to swap as much as possible from the file into physical RAM. > > ----- > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > >> -----Original Message----- >> From: eks dev [mailto:eks...@yahoo.co.uk] >> Sent: Wednesday, June 10, 2009 12:30 PM >> To: java-user@lucene.apache.org >> Subject: Re: Reloading RAM Directory from updated FS Directory >> >> >> there is one case where MMAP does not beat RAM, initial warm-up after >> process restart. With MMAP it can take a while before you get up to speed. >> MMAP with reopen is the best, if you run without restart. >> >> >> >> >> >> ----- Original Message ---- >> > From: Uwe Schindler <u...@thetaphi.de> >> > To: java-user@lucene.apache.org >> > Sent: Wednesday, 10 June, 2009 0:24:35 >> > Subject: RE: Reloading RAM Directory from updated FS Directory >> > >> > Hi Greg & Kay, >> > >> > > Lucene (IndexSearchers / IndexReaders) has the notion of cache as well >> > > so you need to check if you really want a 100% replication of >> > > RAMDirectory / FSDirectory as well concurrently in memory. Have you >> > > tested with the FieldCache policies before moving onto the >> RAMDirectory >> > > based backup solution. >> > >> > As you have so big indexes, I think you use a 64 bit machine and 64 bit >> JVM. >> > I would recommend you in this case to use MMapDirectory instead of >> copying >> > the contents to a RAMDirectory. MMapDirectory works like FSDirectory, >> but it >> > does not read the files via normal RandomAccessFile actions, but it maps >> the >> > file into the 64 bit address space. In principle this is like a swap >> file, >> > the contents of the file are seen like memory. The operating system >> kernel >> > then uses the same strategies to map parts of the index into real RAM, >> just >> > like a swap file. For Lucene it looks like a RAMDirectory backed by a >> file. >> > >> > Changes to the underlying index can be seen immediately, and changes can >> > reloaded using IndexReader.reopen(), which is much faster than reopening >> the >> > index in complete. >> > >> > Kay Kay: Instantiating an IndexSearcher on top of a IndexReader costs >> > nothing (its just a wrapper around the IndexReader). The costly part is >> > opening the IndexReader. So using reopen() and wrapping this always with >> a >> > fresh IndexReader is faster than creating an IndexSearcher using the >> > directory path. >> > >> > Uwe >> > >> > > Diamond, Greg wrote: >> > > > Hi All - >> > > > >> > > > What is the best way to load a RAM Directory from a FS Directory, >> and >> > > periodically reload the RAM Directory to pick up new documents? >> > > > >> > > > The scenario I have is I create several large directories which I >> create >> > > to a file system, then load them into ram for faster searching. >> > > > They takes several hours to create, so i want to retain a file copy >> so >> > > in the event of a service/server crash or reboot, they can load again >> > > > in a few seconds. Once a day I append or recreate the FS >> Directories >> > > with new entries, and then reload the RAM Directories to pick up the >> > > > new entries. >> > > > >> > > > Problem is, what I am doing seems to cause a memory leak. The >> > > directories take ~12 GB of RAM to load, but bloats >> > > > to 24 GB (the -Xmx setting) after a few reloads and stays there. >> > > > >> > > > // Map to hold single instance of the IndexSearcher, 1 per RAM >> > > Directory, to be reused across requests. >> > > > private static final Mapsearchers = new >> > > HashMap(); >> > > > >> > > > // Creates the IndexSearcher and stores it in the static map. >> > > > public static void load(IndexName index, ...) { >> > > > // sync... etc >> > > > RAMDirectory dir = new >> > > RAMDirectory(FSDirectory.getDirectory(directoryPath); >> > > > IndexReader reader = IndexReader.open(dir, true); >> > > > IndexSearcher searcher = new IndexSearcher(reader); >> > > > searchers.put(index, searcher); >> > > > } >> > > > >> > > > public static IndexSearcher get(IndexName index, ...) { >> > > > // sync... etc >> > > > return searchers.get(index); >> > > > } >> > > > >> > > > // After an indexing service appends the existing FS Directory, >> this >> > > reloads it. >> > > > public static void reload(IndexName index, ...) { >> > > > // sync... etc >> > > > IndexSearcher searcher = searchers.get(index); >> > > > FSDirectory fsDirectory = >> > > FSDirectory.getDirectory(directoryPath); >> > > > Directory ramDirectory = >> > > searcher.getIndexReader().directory(); >> > > > Directory.copy(fsDirectory, ramDirectory, false); >> > > > } >> > > > >> > > > I've tries some theme and variations on this with the same issue. >> > > > >> > > > TIA! >> > > > >> > > > gd >> > > > >> > > > >> > > > -------------------------------------------------------------------- >> - >> > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> > > > For additional commands, e-mail: java-user-h...@lucene.apache.org >> > > > >> > > > >> > > > >> > > >> > > >> > > >> > > --------------------------------------------------------------------- >> > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> > > For additional commands, e-mail: java-user-h...@lucene.apache.org >> > >> > >> > >> > --------------------------------------------------------------------- >> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> > For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org