Jukka Zitting wrote:
> On 11/27/06, Christoph Kiehl <[EMAIL PROTECTED]> wrote:
> > 1. Use a lazy QueryResultImpl that keeps a reference to the result and only
> > fetches the UUIDs for requested nodes.
>
> I much prefer this approach over adding yet another cache. :-)

I would prefer a solution that fetches a configurable number of result nodes and, if more results are requested, re-executes the query to get the remaining nodes.
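Roughly what I have in mind (just a sketch, not the actual QueryResultImpl; OffsetQuery and its execute(offset, limit) method are made-up stand-ins for the query handler internals):

    import java.util.ArrayList;
    import java.util.List;

    /** Made-up stand-in for the internal query; not a real Jackrabbit interface. */
    interface OffsetQuery {
        /** Returns the UUIDs of the result rows [offset, offset + limit). */
        List<String> execute(int offset, int limit);
    }

    /** Fetches UUIDs in batches and re-executes the query when more are needed. */
    class BatchedQueryResult {
        private final OffsetQuery query;
        private final int batchSize;                 // configurable batch size
        private final List<String> uuids = new ArrayList<String>();
        private boolean exhausted = false;

        BatchedQueryResult(OffsetQuery query, int batchSize) {
            this.query = query;
            this.batchSize = batchSize;
        }

        /** Returns the UUID at 'index', re-executing the query if necessary. */
        String getUUID(int index) {
            while (!exhausted && index >= uuids.size()) {
                // the index only needs to be open for the duration of execute()
                List<String> batch = query.execute(uuids.size(), batchSize);
                uuids.addAll(batch);
                exhausted = batch.size() < batchSize;
            }
            if (index >= uuids.size()) {
                throw new IndexOutOfBoundsException("no such result row: " + index);
            }
            return uuids.get(index);
        }
    }

The node iterator would then resolve each UUID to a node and skip the ones the session is not allowed to read.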

> > This implies that the access check is done in the QueryResultImpl and that the
> > result size returned by size() may vary if you don't have access to some nodes
> > (which it already does if a node in the result gets deleted).

> We could postpone the size calculation; if getSize() is never called,
> there is no need to calculate the result in advance. In addition to, or
> instead of, making getSize() lazy, we could add a configuration variable
> that governs the accuracy of the return value:
>
> 1) Return -1; this is allowed by the spec, but not very useful
> 2) Return the (almost) correct size like now, but with the latency issue
> 3) Return the unfiltered size, reducing latency but compromising security
> 4) Return the correct size for result sets of up to N nodes, otherwise return -1

I would go for 3) combined with 4):
Return the correct size up to N, otherwise N + the remaining unfiltered size.
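To illustrate one way to read that (a sketch only; AccessCheck and canRead() are made-up names, not the real access manager API): access-check at most the first N rows and count whatever comes after them without filtering.

    import java.util.List;

    /** Made-up stand-in for the session's access checks. */
    interface AccessCheck {
        boolean canRead(String uuid);
    }

    class ResultSize {
        /**
         * Returns the filtered size of the first 'n' rows plus the
         * unfiltered count of the remaining rows.
         */
        static long estimate(List<String> uuids, AccessCheck access, int n) {
            long size = 0;
            int checked = 0;
            for (String uuid : uuids) {
                if (checked == n) {
                    // beyond N: skip the access check, count everything
                    return size + (uuids.size() - checked);
                }
                checked++;
                if (access.canRead(uuid)) {
                    size++;
                }
            }
            return size;  // result has at most N rows: exact, filtered size
        }
    }

For small results the caller gets the exact, filtered size; only large results pay the reduced accuracy.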

> > The real problem is how to trigger result.close(), which closes the index. I'm
> > not even sure whether it causes problems if indexes are not closed as soon as
> > possible.

> Any Lucene experts around with more insight on this?

Well, it doesn't have an effect on other queries or on the internals of the query handler; it will simply hold resources for an unknown amount of time. But anyway, there is still the question of when exactly the result would be closed.
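To make the problem concrete (again just a sketch, none of this is existing Jackrabbit code; IndexHandle below is whatever reader/searcher the lazy result would keep open):

    import java.io.Closeable;
    import java.io.IOException;

    /**
     * A lazy result that keeps the index open until close() is called.
     * The catch: javax.jcr.query.QueryResult has no close() method, so it
     * is unclear who would call this and when.
     */
    class LazyResult implements Closeable {
        private final Closeable indexHandle;  // made-up stand-in for an open index reader
        private boolean closed = false;

        LazyResult(Closeable indexHandle) {
            this.indexHandle = indexHandle;
        }

        public void close() throws IOException {
            if (!closed) {
                closed = true;
                indexHandle.close();  // releases the index resources
            }
        }
    }

Re-executing the query per batch sidesteps this, because nothing stays open between calls.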

I would still prefer the approach already mentioned, where the query is re-executed to get more results.

regards
 marcel
