On Wed, May 21, 2014 at 8:20 AM, Ravikumar Govindarajan <ravikumar.govindara...@gmail.com> wrote:

> Great blog and lucid explanation
>
> I think things have changed in recent kernel versions. I am no expert, but
> could see some code related to this here
> http://lxr.free-electrons.com/source/mm/fadvise.c?v=3.14

That looks promising. But does that mean SEQUENTIAL will evict the page once we're done reading it?

> O_DIRECT will be terrible drag no?

Actually O_DIRECT is awesome because it completely bypasses the buffer cache, so nothing will be evicted. The downside is you must do your own buffering/read-ahead into userspace RAM, so you need to be more careful about heap usage... Also, Linus hates this option :)

> Will a battery-backed disk cache help here?

This will make IndexWriter.commit faster, since the IO device will be able to return from fsync before the bytes are actually moved to stable storage. But you really shouldn't need to call commit so frequently, in which case a faster commit is not so important.

> We are using a SortingMergePolicy which most-often hits data randomly. Will
> SEQUENTIAL help here?

Oh hmm, then you should NOT call SEQUENTIAL and should not use O_DIRECT! In fact, you want the IO pages for merging to enter the buffer cache....

> Any reasons why you think DONTNEED will be less-useful?

Well, that option is too late? Like, say I read in the N 1 GB files to merge, then I call DONTNEED once the merge is done, but by then the pages for searching have already been evicted. I could instead call DONTNEED every few KB of reads/writes but that seems hackish, like it's a poor emulation of what SEQUENTIAL would express.

But net/net there has been good progress lately: new IO APIs in Java, improvements to the Linux kernel, etc. There are also sneaky ways to invoke some of these OS-level APIs without using JNI (the JDK has some internal APIs). I think we should explore this area more, to minimize the cost of merging on ongoing searches.

Mike McCandless

http://blog.mikemccandless.com
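For illustration only, here is a rough sketch of the "call DONTNEED every so often while reading" idea mentioned above. It is in Python rather than Java because Python's os module exposes posix_fadvise directly; it is not Lucene code, and the function name, chunk size and drop interval are all made up for the example. It only covers the read side and assumes Linux (elsewhere the advice call is simply skipped).

```python
import os

def read_dropping_cache(path, consume, chunk=1 << 20, drop_every=8 << 20):
    """Read `path` front to back, handing each chunk to `consume`, and
    periodically advise the kernel that the bytes already read are no
    longer needed. Name and defaults are hypothetical, not a Lucene API."""
    have_fadvise = hasattr(os, "posix_fadvise")  # missing on some platforms
    total = 0          # bytes read so far
    dropped_upto = 0   # offset up to which DONTNEED was already issued
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.read(fd, chunk)
            if not buf:
                break
            consume(buf)
            total += len(buf)
            if have_fadvise and total - dropped_upto >= drop_every:
                # Advise on the range [dropped_upto, total) only.
                os.posix_fadvise(fd, dropped_upto, total - dropped_upto,
                                 os.POSIX_FADV_DONTNEED)
                dropped_upto = total
        if have_fadvise and total > dropped_upto:
            # Cover the tail that did not reach a full interval.
            os.posix_fadvise(fd, dropped_upto, total - dropped_upto,
                             os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```

Note that the advice applies to the file's cached pages, not to this reader alone, so if searches are reading the same segment files their pages may be dropped too. That is one more reason this feels like a poor substitute for what SEQUENTIAL ought to express.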