Michael McCandless (JIRA) wrote:
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654057#action_12654057 ]
Michael McCandless commented on LUCENE-831:
-------------------------------------------
> One more thing here... while random-access lookup of a field's value
> via MultiReader dispatch requires the binary search to find the right
> sub-reader, I think in most internal uses, the access could be
> switched to an iterator instead, in which case the lookup should be
> far faster.
This sounds good.
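To make the cost difference concrete, here is a minimal sketch of the two access patterns. MultiIntValues and its methods are hypothetical names made up for illustration; this is not Lucene's MultiReader API, just plain arrays standing in for per-sub-reader values.

```java
// Hypothetical stand-in for values spread across sub-readers (not a Lucene class).
class MultiIntValues {
    private final int[][] segments; // one value array per sub-reader
    private final int[] starts;     // starts[i] = first global docID of segment i

    MultiIntValues(int[][] segments) {
        this.segments = segments;
        this.starts = new int[segments.length + 1];
        for (int i = 0; i < segments.length; i++) {
            starts[i + 1] = starts[i] + segments[i].length;
        }
    }

    // Random access: every lookup pays a binary search to find the owning segment.
    int get(int docID) {
        int lo = 0, hi = segments.length - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= docID) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return segments[lo][docID - starts[lo]];
    }

    // Sequential access: walk the segments in order, no search per document.
    long sumSequential() {
        long sum = 0;
        for (int[] segment : segments) {
            for (int value : segment) {
                sum += value;
            }
        }
        return sum;
    }
}
```

With N sub-readers, get() costs O(log N) per document, while the sequential walk only pays for crossing each segment boundary once.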
> EG when sorting by field, we could pull say an IntData iterator from
> the reader, and then access the int values in docID order as we visit
> the docs.
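A rough sketch of what such an iterator could look like, assuming docs are always visited in increasing docID order. "IntData" is only a name from this thread; the interface and class below are hypothetical and do not exist in Lucene.

```java
// Hypothetical forward-only view of per-document int values.
interface IntDataIterator {
    /** Returns the value for docID; targets must be passed in non-decreasing order. */
    int valueAt(int docID);
}

// Hypothetical implementation over per-sub-reader arrays.
class SegmentedIntDataIterator implements IntDataIterator {
    private final int[][] segments;
    private int seg = 0;      // index of the current segment
    private int segStart = 0; // global docID of the current segment's first doc

    SegmentedIntDataIterator(int[][] segments) {
        this.segments = segments;
    }

    @Override
    public int valueAt(int docID) {
        // Only ever move forward: amortized O(1) per doc, no binary search.
        while (docID >= segStart + segments[seg].length) {
            segStart += segments[seg].length;
            seg++;
        }
        return segments[seg][docID - segStart];
    }
}
```

The trade-off is that the iterator cannot go backwards, so it only fits callers that see docs in docID order, as a collector does during a search.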
We still need random access after collecting/visiting, though. Do we put
what we collect into a map? What if a lot of docs match?
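One possible answer, sketched here as an assumption rather than anything settled in this thread: copy each hit's value into its queue entry at collection time, so only the top N entries are held no matter how many docs match. TopNByField and Hit are hypothetical names, and values[doc] stands in for whatever an iterator would hand back for that doc.

```java
import java.util.PriorityQueue;

// Hypothetical top-N collector that stores the sort value with each hit.
class TopNByField {
    // A collected hit carries its own sort value, so no lookup is needed later.
    static final class Hit {
        final int docID;
        final int value;

        Hit(int docID, int value) {
            this.docID = docID;
            this.value = value;
        }
    }

    /** Returns up to n hits with the smallest values, ascending by value, ties by docID. */
    static Hit[] topN(int[] matchingDocs, int[] values, int n) {
        // Reversed ordering: the queue head is the worst hit currently kept.
        PriorityQueue<Hit> queue = new PriorityQueue<>((a, b) ->
            a.value != b.value
                ? Integer.compare(b.value, a.value)
                : Integer.compare(b.docID, a.docID));
        for (int doc : matchingDocs) {
            queue.add(new Hit(doc, values[doc]));
            if (queue.size() > n) {
                queue.poll(); // evict the worst hit
            }
        }
        // Drain worst-first, filling the result from the back.
        Hit[] result = new Hit[queue.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = queue.poll();
        }
        return result;
    }
}
```

Memory stays proportional to n rather than to the number of matching docs, which avoids a map over everything collected.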
> For norms, which we should eventually switch to FieldCache +
> column-stride fields, it would be the same story.
> Accessing via iterator should go a long ways to reducing the overhead
> of "using a method" instead of accessing the full array directly.
I agree. Dropping the binary search per docID access on a MultiReader is
a must (at least for the common sort use case of the field cache).
I'm still attempting to consolidate my ideas on this. I've come at it
from a rather lazy approach: Hoss laid out an API that was 'almost'
complete, and I just completed things and pushed in a couple of
directions (the ArrayObject). I've been thinking from the inside out.
Originally I was just playing around and trying to build some interest.
I have more interest in getting this finished now though, and it sounds
like others do as well. This discussion will hopefully get things moving.
- Mark