[
https://issues.apache.org/jira/browse/LUCENE-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082898#comment-14082898
]
David Smiley commented on LUCENE-5156:
--------------------------------------
I can understand why this change was done: better not to support it than to
support an optional operation that callers expect to be fast when it isn't.

What if it were made fast, along with seekCeil(), which is also implemented
slowly right now? For example, the first time either seekCeil or an ord method
is called, build an array of term start positions indexed by ordinal, which
otherwise wouldn't be built. Then you could do a binary search for seekCeil
and a direct lookup for seekExact by ord. The lazily created array could also
be shared across repeated invocations to get Terms for the current document.
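
A minimal sketch of that idea, assuming a simplified layout in which one document's sorted terms are packed as length-prefixed bytes. This is not the actual CompressingTermVectors format, and LazyTermOrdIndex, termAt and the packed layout are hypothetical names chosen for illustration, not Lucene API. Terms are compared with String.compareTo here for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for one document's term vector (not Lucene code). */
final class LazyTermOrdIndex {
    private final byte[] packed;  // [len][bytes][len][bytes]..., terms in sorted order
    private final int numTerms;
    private int[] starts;         // built lazily: starts[ord] = offset of the length byte

    /** Terms must already be sorted in String.compareTo order. */
    LazyTermOrdIndex(String... sortedTerms) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String t : sortedTerms) {
            byte[] b = t.getBytes(StandardCharsets.UTF_8);
            if (b.length > 127) {
                throw new IllegalArgumentException("term too long for this sketch");
            }
            out.write(b.length);       // one-byte length prefix
            out.write(b, 0, b.length);
        }
        this.packed = out.toByteArray();
        this.numTerms = sortedTerms.length;
    }

    /** One O(N) pass over the packed terms, paid only on the first seek. */
    private int[] starts() {
        if (starts == null) {
            int[] s = new int[numTerms];
            int pos = 0;
            for (int ord = 0; ord < numTerms; ord++) {
                s[ord] = pos;
                pos += 1 + packed[pos];  // skip the length byte plus the term bytes
            }
            starts = s;
        }
        return starts;
    }

    /** Seek by ord: a direct array lookup once the index exists. */
    String termAt(int ord) {
        if (ord < 0 || ord >= numTerms) {
            throw new IndexOutOfBoundsException("ord " + ord);
        }
        int pos = starts()[ord];
        return new String(packed, pos + 1, packed[pos], StandardCharsets.UTF_8);
    }

    /** Returns the ord of the smallest term >= target, or -1 if there is none. */
    int seekCeil(String target) {
        int lo = 0;
        int hi = numTerms - 1;
        while (lo <= hi) {               // binary search over ordinals
            int mid = (lo + hi) >>> 1;
            int cmp = termAt(mid).compareTo(target);
            if (cmp < 0) {
                lo = mid + 1;
            } else if (cmp > 0) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return lo < numTerms ? lo : -1;
    }

    public static void main(String[] args) {
        LazyTermOrdIndex idx = new LazyTermOrdIndex("apple", "banana", "cherry");
        System.out.println(idx.seekCeil("b"));  // prints 1
        System.out.println(idx.termAt(2));      // prints cherry
    }
}
```

In this sketch the starts array lives on the per-document object, so repeated seeks against the same document reuse it, and callers that only iterate never pay for it.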

Why bother, you might ask? I'm working on a way to have the default
highlighter search directly against the Terms from term vectors instead of
re-inverting into a MemoryIndex. I'll post a separate issue for that with code,
of course. It "works", but it isn't as efficient as it could be because
seekCeil on term vectors' Terms is O(N).
> CompressingTermVectors termsEnum should probably not support seek-by-ord
> ------------------------------------------------------------------------
>
> Key: LUCENE-5156
> URL: https://issues.apache.org/jira/browse/LUCENE-5156
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Fix For: 4.5, 5.0
>
> Attachments: LUCENE-5156.patch
>
>
> Just like term vectors before it, it has an O(n) seek-by-term.
> But this one also advertises a seek-by-ord, only this is also O(n).
> This could cause e.g. checkindex to be very slow, because if termsenum
> supports ord it does a bunch of seeking tests. (Another solution would be to
> leave it, and add a boolean so checkindex never does seeking tests for term
> vectors, only real fields).
> However, I think it's also kinda a trap, in my opinion if seek-by-ord is
> supported anywhere, you kinda expect it to be faster than linear time...?