Hi

I read CachingWrapperFilter (CWF) today, and initially I thought it would
cache a Filter in memory for me, so that I could use it more efficiently for
subsequent searches. But I learned that all it does is cache the DocIdSet
returned by the wrapped Filter.

This is good in and of itself, but I wonder if we shouldn't go the extra
mile and load the results into memory for Filters which don't operate from
memory. For example, I have a Filter which reads information from a Payload
as it's iterated on, so it doesn't keep anything in memory (it's per-user
information, so I haven't decided yet whether I can afford caching it in
memory and whether it will be beneficial). Wrapping that sort of Filter in
CWF will obviously not improve anything, since the cached DocIdSet still
reads the payloads every time it is iterated.

I'm not sure what to do here:
1. Just reflect that in the javadoc (it is very confusing to say "Wraps
another filter's result and caches it", which is not true).
2. Introduce a class which takes a Filter and loads its results into memory
(I think I read an issue/discussion about this), into an OpenBitSet for
example (but we need to know the number of results in advance, or grow the
array as we go along).
3. Don't use CWF; write a "load-a-Filter-into-an-in-memory-Filter" utility,
and cache the Filters with the user as the key.
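
To make options (2) and (3) concrete, here is a rough sketch of what I have in mind. It deliberately does not use the Lucene classes: DocIterator, materialize, overArray and PerUserCache are all made-up names, java.util.BitSet stands in for OpenBitSet, and the user key is assumed to be a String. The bit set is sized by the reader's maxDoc rather than by the number of results, which is one possible way around having to know the result count in advance.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

public class InMemoryFilterSketch {

    /** Hypothetical stand-in for a doc-id iterator; NOT the Lucene class. */
    interface DocIterator {
        int NO_MORE_DOCS = Integer.MAX_VALUE;

        /** Returns the next matching doc id in increasing order, or NO_MORE_DOCS. */
        int nextDoc();
    }

    /**
     * Drains the (possibly expensive) iterator exactly once into a bit set.
     * The set is sized by maxDoc, so the number of hits need not be known.
     */
    static BitSet materialize(DocIterator it, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc = it.nextDoc(); doc != DocIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
            bits.set(doc);
        }
        return bits;
    }

    /** Small LRU cache of materialized filters, keyed by user (assumed to be a String). */
    static final class PerUserCache extends LinkedHashMap<String, BitSet> {
        private final int maxEntries;

        PerUserCache(int maxEntries) {
            super(16, 0.75f, true); // true = access order, which gives LRU eviction
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, BitSet> eldest) {
            return size() > maxEntries;
        }
    }

    /** Simulates a payload-reading filter with a fixed, sorted list of doc ids. */
    static DocIterator overArray(int... docs) {
        return new DocIterator() {
            private int i = 0;

            public int nextDoc() {
                return i < docs.length ? docs[i++] : NO_MORE_DOCS;
            }
        };
    }

    public static void main(String[] args) {
        PerUserCache cache = new PerUserCache(2);
        BitSet bits = cache.get("alice");
        if (bits == null) {
            // Pay the iteration cost once; later searches reuse the bit set.
            bits = materialize(overArray(1, 4, 7), 10);
            cache.put("alice", bits);
        }
        System.out.println("alice matches doc 4: " + bits.get(4));
    }
}
```

The trade-off is the usual one: a bit set costs roughly maxDoc/8 bytes per cached user regardless of how few documents match, so the cache size has to be bounded.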

I will probably need to do the second part of (3) anyway, so I'm asking
whether such a utility would be useful to have in Lucene, and whether there's
perhaps already one (I thought I read somewhere about the ability to execute
a Query and get back a Filter, or to use the results as a Filter). I looked
at QueryWrapperFilter, but it doesn't seem to give me what I need, since its
getDocIdSet method returns an iterator which is the Scorer of the Query that
it wraps.

Anyway, I think the documentation of CWF should be fixed and made clearer.

Any thoughts?

Shai
