On Wed, Jun 10, 2009 at 7:23 PM, Jason
Rutherglen<jason.rutherg...@gmail.com> wrote:
> Cool! Sounds like with LUCENE-1458 we can experiment with some
> of these things. Does CSF become just another codec?

I believe LUCENE-1458 currently only makes terms dict & postings
pluggable...

>> I'm leery of having terms dict live entirely on disk, though
>> we should certainly explore it.
>
> Yeah, it should theoretically help with reloading, and it could use
> a skiplist (we have a disk version of that implemented)
> instead of binary search. It seems like with things like
> TrieRange (which potentially adds many fields and terms) it
> could be useful to let the IO cache calculate what we need in
> RAM and what we don't, otherwise we're constantly at risk of
> exceeding heap usage. There'll be other potential RAM issues
> (such as page faults), but it seems like users will constantly
> be up against the inability to precalculate Java heap usage of
> data structures (whereas file based data usage can be measured).
> Norms are another example, and with flexible indexing (and
> scoring?) there may be additional fields the user may want to
> change dynamically, that if completely loaded into heap cause
> OOM problems.
>
> I guess I personally think it would be great to not worry about
> exceeding heap with Lucene apps (as it's a guessing game), and
> then one can simply analyze the OS level IO cache/swap space to
> see if the app could slow down due to the machine not having
> enough RAM. I think this would remove one of the major
> differences between a Java based search engine and a C++ based
> one.

Marvin and I discussed this quite a bit already in LUCENE-1458... we
should make it pluggable and then try both -- let the machine tell
us ;)

Mike
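
A rough illustration of the "make it pluggable and then try both" idea, not code from LUCENE-1458: the sketch below puts a heap-resident lookup and a memory-mapped, file-backed lookup behind one interface so the two can be compared on the same data. Everything in it is an assumption made for illustration: the names TermLookupSketch, TermLookup, HeapLookup and MappedLookup, the 16-byte fixed-width ASCII record layout, and the use of binary search rather than a skiplist. Lucene's real terms dict format is different.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class TermLookupSketch {
    // Hypothetical fixed record width; real term dictionaries are variable-length.
    static final int WIDTH = 16;

    /** Pluggable lookup: returns the ordinal of a term, or -1 if absent. */
    interface TermLookup {
        int indexOf(String term);
    }

    /** Variant 1: all terms live on the Java heap. */
    static final class HeapLookup implements TermLookup {
        private final String[] sortedTerms;
        HeapLookup(String[] sortedTerms) { this.sortedTerms = sortedTerms.clone(); }
        public int indexOf(String term) {
            int i = Arrays.binarySearch(sortedTerms, term);
            return i >= 0 ? i : -1;
        }
    }

    /** Variant 2: terms live in a file; the OS page cache decides what stays in RAM. */
    static final class MappedLookup implements TermLookup {
        private final MappedByteBuffer buf;
        private final int count;
        MappedLookup(Path file) throws IOException {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                // The mapping stays valid after the channel is closed.
                buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
                count = (int) (ch.size() / WIDTH);
            }
        }
        public int indexOf(String term) {
            byte[] key = pad(term);
            int lo = 0, hi = count - 1;
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                int c = compareRecord(mid, key);
                if (c == 0) return mid;
                if (c < 0) lo = mid + 1; else hi = mid - 1;
            }
            return -1;
        }
        // Compares the record at the given ordinal against the padded key, byte by byte.
        private int compareRecord(int ordinal, byte[] key) {
            int base = ordinal * WIDTH;
            for (int i = 0; i < WIDTH; i++) {
                int c = (buf.get(base + i) & 0xFF) - (key[i] & 0xFF);
                if (c != 0) return c;
            }
            return 0;
        }
    }

    /** Zero-pads an ASCII term to WIDTH bytes. */
    static byte[] pad(String term) {
        byte[] raw = term.getBytes(StandardCharsets.US_ASCII);
        if (raw.length > WIDTH) throw new IllegalArgumentException("term too long: " + term);
        return Arrays.copyOf(raw, WIDTH);
    }

    /** Writes already-sorted ASCII terms as fixed-width records to a temp file. */
    static Path write(String[] sortedTerms) throws IOException {
        byte[] all = new byte[sortedTerms.length * WIDTH];
        for (int i = 0; i < sortedTerms.length; i++) {
            System.arraycopy(pad(sortedTerms[i]), 0, all, i * WIDTH, WIDTH);
        }
        Path file = Files.createTempFile("terms", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, all);
        return file;
    }

    public static void main(String[] args) throws IOException {
        String[] terms = {"apple", "banana", "cherry", "lucene"};
        TermLookup heap = new HeapLookup(terms);
        TermLookup mapped = new MappedLookup(write(terms));
        for (String t : new String[] {"cherry", "zebra"}) {
            System.out.println(t + ": heap=" + heap.indexOf(t) + " mapped=" + mapped.indexOf(t));
        }
    }
}
```

With both variants behind the same interface, a benchmark can swap one for the other and measure lookup latency and resident memory, rather than guessing which one wins. The mapped variant's footprint shows up in the OS page cache instead of the Java heap, which is the trade-off Jason describes above.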
