When comparing RAMDirectory and FSDirectory, it is important to mention which OS you are using. Linux caches recently accessed disk data in memory. Here is a good article that describes its strategy: http://forums.gentoo.org/viewtopic.php?t=175419
The 2% difference you are seeing is the memory copy. With other OSes you may see a speedup when using the RAMDirectory, because not all OSes keep a disk cache in memory, and those must access the disk to read the index.

Another consideration is that there is currently a 2GB limit on the size of a RAMDirectory. Indexes over 2GB cause an overflow in the int used to create the buffer. [see int len = (int) is.length(); in RAMDirectory]

I ended up using RAMDirectory for a very different reason. The index is 1 to 2MB and is rebuilt every few hours. It takes 3 to 4 minutes to query the database and rebuild the index, but search should be available 100% of the time. Since the index is so small I do the following:

on server startup:
- look for semaphore; if it is there, delete the index
- if there is no index, build it to FSDirectory
- load the index from FSDirectory into RAMDirectory

on reindex:
- create semaphore
- rebuild index to FSDirectory
- delete semaphore
- load index from FSDirectory into RAMDirectory

to search:
- search the RAMDirectory

RAMDirectory could be replaced by a regular FSDirectory, but it seemed silly to copy the index from disk to disk when it ultimately needs to be in memory. FSDirectory could be replaced by a RAMDirectory, but then the server would take 3 to 4 minutes longer to start up every time. By persisting the index, this time is only needed if indexing was interrupted.

Jonathan

On Mon, 22 Nov 2004 12:39:07 -0800, Kevin A. Burton <[EMAIL PROTECTED]> wrote:
> Otis Gospodnetic wrote:
>
> >For the Lucene book I wrote some test cases that compare FSDirectory
> >and RAMDirectory. What I found was that with certain settings
> >FSDirectory was almost as fast as RAMDirectory. Personally, I would
> >push FSDirectory and hope that the OS and the Filesystem do their share
> >of work and caching for me before looking for ways to optimize my code.
> >
> Yes...
> I performed the same benchmark and in my situation RAMDirectory
> for searches was about 2% slower.
>
> I'm willing to bet that it has to do with the fact that it's a Hashtable
> and not a HashMap (which isn't synchronized).
>
> Also adding a constructor for the term size could make loading a
> RAMDirectory faster since you could prevent rehash.
>
> If you're on a modern machine your filesystem cache will end up
> buffering your disk anyway which I'm sure was happening in my situation.
>
> Kevin
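
P.S. For anyone who wants the startup/reindex/search steps from the reply above in code form, here is a rough sketch in plain Java. It deliberately makes no Lucene calls. IndexLifecycle, index.txt, reindex.lock and the Supplier standing in for the database query are all made-up names. A text file plays the part of the FSDirectory and an in-memory list plays the part of the RAMDirectory. Treat it as an illustration of the marker-file logic under those assumptions, not as the actual implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the workflow; none of these names come from Lucene.
class IndexLifecycle {
    private final Path indexFile; // plays the role of the on-disk index (FSDirectory)
    private final Path marker;    // the "semaphore": left behind only if a rebuild was interrupted
    private final Supplier<List<String>> builder; // plays the role of the slow database query
    // Plays the role of the RAMDirectory; volatile so searching threads see the swapped-in copy.
    private volatile List<String> inMemory = Collections.emptyList();

    IndexLifecycle(Path dir, Supplier<List<String>> builder) {
        this.indexFile = dir.resolve("index.txt");
        this.marker = dir.resolve("reindex.lock");
        this.builder = builder;
    }

    /** On server startup. */
    void startup() {
        try {
            // A leftover marker means the last rebuild did not finish: discard the disk copy.
            if (Files.exists(marker)) {
                Files.deleteIfExists(indexFile);
            }
            if (Files.exists(indexFile)) {
                load();      // fast path: reuse the persisted index
            } else {
                reindex();   // slow path: build to disk, then load
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** On reindex. Searches keep using the old in-memory copy until load() swaps it. */
    void reindex() {
        try {
            Files.write(marker, new byte[0]);      // create semaphore
            Files.write(indexFile, builder.get()); // rebuild index on disk
            Files.delete(marker);                  // delete semaphore
            load();                                // copy disk index into memory
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** To search: only the in-memory copy is consulted. */
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (String line : inMemory) {
            if (line.contains(term)) {
                hits.add(line);
            }
        }
        return hits;
    }

    private void load() throws IOException {
        inMemory = Collections.unmodifiableList(Files.readAllLines(indexFile));
    }
}
```

The idea is to call startup() once when the server boots, reindex() from whatever timer triggers the rebuild every few hours, and search() from the request threads. Because the in-memory copy is replaced in a single reference assignment, searches never see a half-built index.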