Hi Mike,

This is a very belated reply, but I just wanted to say that I really
appreciate your comments -- this has been a very helpful and informative
discussion!  (-:

Thanks,
Chris

On Thu, Jul 23, 2009 at 10:50 AM, Michael McCandless <luc...@mikemccandless.com> wrote:

> On Thu, Jul 23, 2009 at 10:03 AM, Nigel <nigelspl...@gmail.com> wrote:
>
> > Mike, the question you raise is whether (or to what degree) the OS
> > will swap out app memory in favor of IO cache.  I don't know
> > anything about how the Linux kernel makes those decisions, but I
> > guess I had hoped that (regardless of the swappiness setting) it
> > would be less likely to swap out application memory for IO, than it
> > would be to replace some cached IO data with some different cached
> > IO data.
>
> I think swappiness is exactly the configuration that tells Linux how
> readily it should swap out application memory in favor of IO cache,
> versus evicting existing IO cache to make room for new IO cache.
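
For reference: on Linux the setting lives in /proc/sys/vm/swappiness and is
usually changed with sysctl.  A minimal C sketch that just reads the current
value; the /proc path is standard, the rest is purely illustrative:

    /* Print the current vm.swappiness value (0-100).  Lower values tell
     * the kernel to prefer dropping IO cache over swapping out
     * application memory; higher values swap app memory more eagerly. */
    #include <stdio.h>

    int main(void) {
        FILE *f = fopen("/proc/sys/vm/swappiness", "r");
        if (f == NULL) {
            perror("fopen");
            return 1;
        }
        int swappiness;
        if (fscanf(f, "%d", &swappiness) == 1)
            printf("vm.swappiness = %d\n", swappiness);
        fclose(f);
        return 0;
    }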
>
> > The latter case is what kills Lucene performance when you've got a
> > lot of index data in the IO cache and a file copy or some other
> > operation replaces it all with something else: the OS has no way of
> > knowing that some IO cache is more desirable long-term than other
> > IO cache.
>
> I agree that hurts Lucene, but the former also hurts Lucene.  EG if
> the OS swaps out our norms, terms index, deleted docs, or field
> cache, that's going to hurt search performance.  You hit maybe 10
> page faults and suddenly you're looking at an unacceptable increase
> in search latency.
>
> For a dedicated search box (your case) it'd be great to wire these
> pages (or set swappiness to 0 and make sure you have plenty of RAM,
> which I believe is supposed to have the same effect).
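
A minimal sketch of what "wiring" those pages could look like at the OS
level, assuming the hot data (say, the terms index file) is memory-mapped;
mmap() and mlock() are real POSIX calls, but the file name here is just a
placeholder:

    /* Map a file read-only and lock (wire) its pages into RAM so the
     * kernel will not page them out.  Needs CAP_IPC_LOCK or a large
     * enough RLIMIT_MEMLOCK.  "terms.tii" is a hypothetical file name. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("terms.tii", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

        void *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* Pin every page of the mapping in physical memory. */
        if (mlock(p, st.st_size) != 0) { perror("mlock"); return 1; }

        /* ... run searches; pages stay resident until munlock/munmap ... */
        munlock(p, st.st_size);
        munmap(p, st.st_size);
        close(fd);
        return 0;
    }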
>
> > The former case (swapping app for IO cache) makes sense, I suppose,
> > if the app memory hasn't been used in a long time, but with an LRU
> > cache you should be hitting those pages pretty frequently by
> > definition.
>
> EG if your terms index is large, I bet many pages will be seen by the
> OS as rarely used.  We do a binary search through it... so the upper
> levels of that binary search tree are frequently hit, but the lower
> levels will be much less frequently hit.  I can see the OS happily
> swapping out big chunks of the terms dict index.  And it's quite costly
> because we don't have good locality in how we access it (except
> towards the very end of the binary search).
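
To make that access pattern concrete, here's a toy C sketch (not Lucene
code) of a binary search over a large sorted array, e.g. one that's been
memory-mapped; the first few probes of every lookup land on the same few
"hot" pages, while the later probes scatter across pages that are
individually touched only rarely:

    #include <stddef.h>
    #include <stdint.h>

    /* Return the index of the largest element <= target, or -1 if none.
     * Early midpoints hit the same few pages on every call (hot);
     * late midpoints are scattered and rarely revisited (cold), so the
     * OS is tempted to evict exactly those pages. */
    long floor_search(const int64_t *sorted, size_t n, int64_t target) {
        long lo = 0, hi = (long)n - 1, result = -1;
        while (lo <= hi) {
            long mid = lo + (hi - lo) / 2;
            if (sorted[mid] <= target) {
                result = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return result;
    }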
>
> > But if it does swap out your Java cache for something else, you're
> > probably no worse off than before, right?  In this case you have to
> > hit the disk to fault in the paged-out cache; in the original case
> > you have to hit the disk to read the index data that's not in IO
> > cache.
>
> Hard to say... if it swaps out the postings, since we tend to access
> them sequentially, we have good locality and so swapping back in
> should be faster (I *think*).  I guess norms, field cache and deleted
> docs also have good locality.  Though... I'm actually not sure how
> effectively VM systems take advantage of locality when page faults are
> hit.
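
The explicit way to give the VM that hint on Linux/POSIX is madvise(); a
tiny sketch, assuming the postings region is memory-mapped (the advice
constant is standard, everything else is illustrative):

    #include <stddef.h>
    #include <sys/mman.h>

    /* Tell the kernel a mapped region will be read sequentially, so it
     * can do aggressive readahead and free pages behind the reader. */
    int advise_sequential(void *addr, size_t len) {
        return madvise(addr, len, MADV_SEQUENTIAL);
    }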
>
> > Anyway, the interactions between these things (virtual memory, IO
> > cache, disk, JVM, garbage collection, etc.) are complex, so the
> > optimal configuration is very usage-dependent.  The current Lucene
> > behavior seems to be the most flexible.  When/if I get a chance to
> > try the Java caching for our situation I'll report the results.
>
> I think the biggest low-hanging fruit in this area would be an
> optional JNI-based extension to Lucene that'd allow merging to tell
> the OS *not* to cache the bytes that we are reading, and to optimize
> those file descriptors for sequential access (eg do aggressive
> readahead).  It's a nightmare that a big segment merge can evict not
> only the IO cache but also (with the default swappiness on most
> Linux distros) our in-RAM caches too!
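
For what it's worth, a rough sketch of what the native half of such an
extension could look like; posix_fadvise() and the POSIX_FADV_* advice
values are real, but the JNI class and method names here are made up for
illustration:

    /* Hypothetical native half of a JNI extension letting merge code
     * tell the kernel how a file descriptor will be used, e.g.
     *   POSIX_FADV_SEQUENTIAL -> enable aggressive readahead
     *   POSIX_FADV_DONTNEED   -> drop already-merged bytes from cache
     * The Java-side class name is a placeholder, not an existing API. */
    #define _XOPEN_SOURCE 600
    #include <fcntl.h>
    #include <jni.h>

    JNIEXPORT jint JNICALL
    Java_org_apache_lucene_store_NativePosixUtil_fadvise(JNIEnv *env,
            jclass cls, jint fd, jlong offset, jlong len, jint advice) {
        (void)env; (void)cls;
        return posix_fadvise((int)fd, (off_t)offset, (off_t)len, (int)advice);
    }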
>
> Mike
>
