[
https://issues.apache.org/jira/browse/LUCENE-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13707419#comment-13707419
]
Paul Elschot commented on LUCENE-5109:
--------------------------------------
I don't expect to have time to continue this until mid-September, so I'd
rather leave this here; maybe someone can pick it up.
The index is a value index on the upper bit numbers.
I have called the index a value index because it would also be possible to add
an index on the Elias-Fano index of the unary upper bit numbers. Basically, a
value index indexes the zero upper bits, and an index index would index the one upper
bits. (A unary number is a series of zero bits followed by a single one bit,
and the number of zero bits determines the value.)
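To illustrate the unary representation described above, here is a minimal
sketch using java.util.BitSet. The class UnarySketch and its methods are
hypothetical names made up for this example; they are not part of the patch or
of Lucene.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical illustration only; not a class from the patch. */
final class UnarySketch {
    /** Writes each value as that many zero bits followed by a single one bit. */
    static BitSet encode(int[] values) {
        BitSet bits = new BitSet();
        int pos = 0;
        for (int v : values) {
            pos += v;        // leave v zero bits
            bits.set(pos++); // terminating one bit
        }
        return bits;
    }

    /** Reads back count values by measuring the run of zero bits before each one bit. */
    static int[] decode(BitSet bits, int count) {
        int[] values = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            int one = bits.nextSetBit(pos);
            values[i] = one - pos; // number of zero bits before this one bit
            pos = one + 1;
        }
        return values;
    }

    public static void main(String[] args) {
        BitSet bits = encode(new int[] {2, 0, 3});
        System.out.println(bits);                             // prints {2, 3, 7}
        System.out.println(Arrays.toString(decode(bits, 3))); // prints [2, 0, 3]
    }
}
```

In this picture a value index would record positions of selected zero bits, and
an index index would record positions of selected one bits.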
The patch adds ...ValueIndexed.java files that extend EliasFanoEncoder and
EliasFanoDecoder.
(EliasFanoEncoder is also changed to use longHex from ToStringUtils, just as in
LUCENE-5098.)
EliasFanoEncoderValueIndexed creates an index by value on the upper bits,
and EliasFanoDecoderValueIndexed uses this index in its advanceToValue method.
Both EliasFanoEncoder and EliasFanoDecoder have attributes changed from private
to protected, so they can be used in their subclasses.
The EliasFanoDocIdSet is changed to use the above value indexed versions.
Its testAgainstBitSet has been overridden to be empty (nocommit) because that
test currently fails.
There are no other tests yet.
The main function of the patch is overriding the method advanceToHighValue in
EliasFanoDecoderValueIndexed.
The idea is to advance to the high value just before the actual target from the
index, and then continue as usual.
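A rough sketch of this idea, under my own assumptions rather than as the code
in the patch: the upper bits hold a one bit at position (high value + element
index), every step-th zero bit is recorded in an array, and a lookup jumps to
the recorded position just before the target and scans from there. The names
ValueIndexSketch, zeroIndex, step and firstIndexWithHighAtLeast are all
hypothetical, and the input is assumed to be sorted and non-negative.

```java
import java.util.BitSet;

/** Hypothetical illustration only; not EliasFanoDecoderValueIndexed. */
final class ValueIndexSketch {
    private final BitSet upperBits = new BitSet();
    private final int[] zeroIndex; // zeroIndex[j]: position just after zero bit number (j+1)*step
    private final int step;
    private final int maxHigh;

    ValueIndexSketch(int[] sortedHighValues, int step) {
        if (sortedHighValues.length == 0 || step < 1) {
            throw new IllegalArgumentException("need at least one value and step >= 1");
        }
        this.step = step;
        int n = sortedHighValues.length;
        this.maxHigh = sortedHighValues[n - 1];
        for (int i = 0; i < n; i++) {
            upperBits.set(sortedHighValues[i] + i); // unary encoding of the high values
        }
        zeroIndex = new int[maxHigh / step];
        int zeros = 0;
        int numBits = maxHigh + n;
        for (int pos = 0; pos < numBits; pos++) {
            if (!upperBits.get(pos)) {
                zeros++;
                if (zeros % step == 0) {
                    zeroIndex[zeros / step - 1] = pos + 1;
                }
            }
        }
    }

    /** Returns the index of the first element whose high value is >= highTarget, or -1. */
    int firstIndexWithHighAtLeast(int highTarget) {
        if (highTarget > maxHigh) {
            return -1;
        }
        if (highTarget < 0) {
            highTarget = 0;
        }
        // Jump: start at the indexed position just before the target.
        int entry = highTarget / step;
        int pos = entry == 0 ? 0 : zeroIndex[entry - 1];
        int high = entry * step; // number of zero bits before pos
        // Continue as usual: skip the remaining zero bits one at a time.
        while (high < highTarget) {
            pos = upperBits.nextClearBit(pos) + 1;
            high++;
        }
        return pos - highTarget; // number of one bits before pos
    }

    public static void main(String[] args) {
        ValueIndexSketch s = new ValueIndexSketch(new int[] {0, 0, 2, 5, 5, 9}, 2);
        System.out.println(s.firstIndexWithHighAtLeast(5));  // prints 3
        System.out.println(s.firstIndexWithHighAtLeast(10)); // prints -1
    }
}
```

With the index, the scan after the jump skips fewer than step zero bits, which
is where the expected speedup would come from.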
Unfortunately, the value index does not work yet: testAgainstBitSet fails.
Some tests for the Elias-Fano index itself clearly need to be added first.
Once this index works, it is probably better to merge it into EliasFanoEncoder
and EliasFanoDecoder because of the speedup the index is expected to provide.
I prefer to start like this because, without the index, things are working
nicely now, even with the changes from private to protected.
> EliasFano value index
> ---------------------
>
> Key: LUCENE-5109
> URL: https://issues.apache.org/jira/browse/LUCENE-5109
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/other
> Reporter: Paul Elschot
> Assignee: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-5109.patch
>
>
> Index upper bits of Elias-Fano sequence.