Hi David,

David Caruana wrote:
From a brief scan of the Jackrabbit implementation, I believe it is
possible to repeatedly call each of the above methods on a given
QueryResult instance and in each case receive a new RowIterator or
NodeIterator.  From reading the JCR spec., I would assume this
behaviour too but it is not explicit.

The spec does not explicitly say that you get a new iterator each time you call getNodes(), but I think that reading is in line with the general pattern of how iterators are created and used. E.g. the javadoc for Collection.iterator() also just says 'Returns an iterator ...' rather than 'Returns a new iterator ...'.
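To illustrate the pattern I mean, here is a small sketch using plain java.util collections (nothing JCR-specific; the class and method names are made up for this mail). It checks that two iterator() calls on the same non-empty collection hand back independent iterators, which is what the standard collection classes do even though the javadoc never says 'new':

```java
import java.util.Collection;
import java.util.Iterator;

// Hypothetical helper, only for illustration.
final class IteratorPatternDemo {

    // Returns true if two iterator() calls on a non-empty collection
    // yield distinct iterators with independent positions.
    static <T> boolean iteratorsAreIndependent(Collection<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("needs a non-empty collection");
        }
        Iterator<T> first = items.iterator();
        Iterator<T> second = items.iterator();
        while (first.hasNext()) {
            first.next(); // exhaust the first iterator only
        }
        // The second iterator must still be at the start.
        return first != second && second.hasNext();
    }
}
```

Calling IteratorPatternDemo.iteratorsAreIndependent(...) with, say, an ArrayList of a few strings should return true.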

However, from an implementation perspective, doesn't this mean that
either the complete result set has to be kept somewhere
(memory/disk..) or the query re-executed for each call to getRows or
getNodes.  I think JackRabbit holds it in memory; has this been an
issue for large result sets?

Jackrabbit keeps the UUIDs of the result nodes in memory. So far this has not posed any problem, though I have only tested it with a couple of thousand result nodes. The actual Node instances are only created when needed.
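Roughly, the idea looks like the sketch below. This is not the actual Jackrabbit code; LazyResult and the loader function are hypothetical names, and I use plain Java types instead of the javax.jcr interfaces so that it compiles on its own. Only the identifiers are held in memory, and each item is resolved when next() is called:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a query result that stores only identifiers.
final class LazyResult<T> implements Iterable<T> {

    private final List<String> ids;           // e.g. the UUIDs of the matching nodes
    private final Function<String, T> loader; // assumed lookup: identifier -> item

    LazyResult(List<String> ids, Function<String, T> loader) {
        this.ids = new ArrayList<>(ids);      // defensive copy of the identifiers
        this.loader = loader;
    }

    // Every call returns a fresh iterator over the stored identifiers.
    @Override
    public Iterator<T> iterator() {
        final Iterator<String> idIterator = ids.iterator();
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return idIterator.hasNext();
            }

            @Override
            public T next() {
                // The item is only created here, when the caller asks for it.
                return loader.apply(idIterator.next());
            }
        };
    }
}
```

The memory cost is then one identifier per result rather than one full item, and a second call to iterator() starts again from the first identifier without touching the query.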

I only ask as I'm currently implementing the QueryManager façade onto
our own repository implementation and can go either way - or is the
spec. more flexible? meaning that getRows()/Nodes() can return the
same Iterator, and Query.Execute has to be called again to get a new
Iterator.

I personally think that the spec is not flexible in that respect. I would suggest executing the query again inside getRows()/getNodes() when the result is too large to hold. At least that's what I would do in Jackrabbit if the current implementation hit its limits. If you keep a reference to the Query in the QueryResult, this should be fairly easy to do.
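A minimal sketch of what I have in mind, again with made-up names and plain Java types rather than the real javax.jcr interfaces: the Supplier stands in for "the reference to the Query", and maxCached is an assumed threshold. Small results are buffered once; larger ones are re-executed on every getNodes() call:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical result object that keeps a reference to its query.
final class QueryBackedResult<T> {

    private final Supplier<Iterator<T>> query; // stands in for executing the query
    private final List<T> cached;              // null when the result was too large

    QueryBackedResult(Supplier<Iterator<T>> query, int maxCached) {
        this.query = query;
        List<T> buffer = new ArrayList<>();
        Iterator<T> it = query.get();
        // Read at most maxCached + 1 items to find out whether the result fits.
        while (it.hasNext() && buffer.size() <= maxCached) {
            buffer.add(it.next());
        }
        this.cached = buffer.size() <= maxCached
                ? Collections.unmodifiableList(buffer)
                : null;
    }

    // Always returns a new iterator: from the buffer if the result was small,
    // otherwise by executing the query again.
    Iterator<T> getNodes() {
        return cached != null ? cached.iterator() : query.get();
    }
}
```

The trade-off is that a re-executed query may see a different repository state than the first run, so whether this is acceptable depends on how your implementation handles concurrent changes.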

regards
 marcel
