> But does that mean SEQUENTIAL will evict the page once we're done
> reading it?

Yes, looks like it does evict the pages once the read completes...

> Well, that option is too late? Like, say I read in the N 1 GB files
> to merge, then I call DONTNEED once the merge is done, but by then the
> pages for searching have already been evicted

Ahh... Thanks for the explanation...

Let me elaborate a bit more.

I have numerous unsorted segments of very small size and fewer sorted
segments of biggish size. The merge policy will segregate these two.

The bigger sorted segments always merge among themselves using SMP and
SEQUENTIAL advice. It should be helpful in this case, no?

The smaller unsorted segments also merge among themselves using SMP. But
since the segment sizes are very small, the effect on the buffer cache must
be negligible. I feel there is no need to advise in this case...

> There are also sneaky ways to invoke some of these OS-level APIs
> without using JNI

This is cool stuff... Saves an amazing amount of effort for most of the
things...

--
Ravi

On Wed, May 21, 2014 at 7:13 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> On Wed, May 21, 2014 at 8:20 AM, Ravikumar Govindarajan
> <ravikumar.govindara...@gmail.com> wrote:
> > Great blog and lucid explanation
> >
> > I think things have changed in recent kernel versions. I am no expert, but
> > could see some code related to this here
> > http://lxr.free-electrons.com/source/mm/fadvise.c?v=3.14
>
> That looks promising. But does that mean SEQUENTIAL will evict the
> page once we're done reading it?
>
> > O_DIRECT will be terrible drag no?
>
> Actually O_DIRECT is awesome because it completely bypasses the buffer
> cache, so nothing will be evicted.
>
> The downside is you must do your own buffering/read-ahead into
> userspace RAM, so you need to be more careful about heap used...
>
> Also, Linus hates this option :)
>
> > Will a battery-backed disk cache help here?
>
> This will make IndexWriter.commit faster, since the IO device will be
> able to return from fsync before bytes are actually moved to stable
> storage. But you really shouldn't need to call commit so frequently,
> in which case a faster commit is not so important.
>
> > We are using a SortingMergePolicy which most-often hits data randomly. Will
> > SEQUENTIAL help here?
>
> Oh hmm then you should NOT call SEQUENTIAL and should not use
> O_DIRECT! In fact, you want the IO pages for merging to enter the
> buffer cache....
>
> > Any reasons why you think DONTNEED will be less-useful?
>
> Well, that option is too late? Like, say I read in the N 1 GB files
> to merge, then I call DONTNEED once the merge is done, but by then the
> pages for searching have already been evicted. I could instead call
> DONTNEED every few KB of reads/writes but that seems hackish, like
> it's a poor emulation of what SEQUENTIAL would express.
>
> But net/net there has been good progress lately, new IO APIs in Java,
> improvements to Linux kernel, etc. There are also sneaky ways to
> invoke some of these OS-level APIs without using JNI (the JDK has some
> internal APIs). I think we should explore this area more, to minimize
> the cost of merging on ongoing searches.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
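
For anyone following along, the "call DONTNEED every few KB" workaround mentioned in the quoted message could look roughly like the sketch below. It is only an illustration, written in Python because os.posix_fadvise wraps the same call directly; from Java the same hint would need some form of native access. The names read_dropping_cache and _advise and the chunk and window sizes are made up for this sketch and are not part of Lucene or any library. The hints are treated as best-effort and skipped on platforms that lack posix_fadvise.

```python
import os


def _advise(fd, offset, length, advice_name):
    """Best-effort posix_fadvise: skipped where the call or flag is missing."""
    advice = getattr(os, advice_name, None)
    if advice is None or not hasattr(os, "posix_fadvise"):
        return
    try:
        os.posix_fadvise(fd, offset, length, advice)
    except OSError:
        pass  # the call is only a hint, so a failure is not fatal


def read_dropping_cache(path, chunk_size=1 << 20, drop_every=8 << 20):
    """Read `path` front to back, periodically asking the kernel to drop
    the pages already consumed. Returns the number of bytes read."""
    total = 0         # bytes read so far
    dropped_upto = 0  # offset up to which DONTNEED was already issued
    fd = os.open(path, os.O_RDONLY)
    try:
        # Declare the sequential access pattern for the whole file (len 0).
        _advise(fd, 0, 0, "POSIX_FADV_SEQUENTIAL")
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)  # a real merge would consume `chunk` here
            if total - dropped_upto >= drop_every:
                # Hint that the window just consumed will not be re-read.
                _advise(fd, dropped_upto, total - dropped_upto,
                        "POSIX_FADV_DONTNEED")
                dropped_upto = total
    finally:
        os.close(fd)
    return total
```

Calling read_dropping_cache("/path/to/segment/file") with a hypothetical path would read the file once while trying to keep its footprint in the buffer cache near the drop_every window. This is the hackish emulation described above, not a recommendation: whether it actually protects the pages used for searching would need to be measured on the kernel in question.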