[ 
https://issues.apache.org/jira/browse/LUCENE-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774514#comment-13774514
 ] 

Paul Elschot edited comment on LUCENE-5109 at 9/23/13 5:39 PM:
---------------------------------------------------------------

Patch of 23 September, as announced yesterday.

I tried benchmarking with index divisor 128 instead of 256. It is indeed a 
little bit faster for far advanceTo operations.

I used this code snippet in the benchmark to avoid the EliasFanoDocIdSet being 
used when it is not advisable:

{code}
    new DocIdSetFactory() {
      @Override
      public DocIdSet copyOf(FixedBitSet set) throws IOException {
        // Number of set bits, i.e. the number of doc ids to encode.
        int numValues = set.cardinality();
        // Highest set bit, used as the upper bound of the encoded values.
        int upperBound = set.prevSetBit(set.length() - 1);
        if (EliasFanoDocIdSet.sufficientlySmallerThanBitSet(numValues, upperBound)) {
          final EliasFanoDocIdSet copy = new EliasFanoDocIdSet(numValues, upperBound);
          copy.encodeFromDisi(set.iterator());
          return copy;
        } else {
          // Elias-Fano is not expected to be small enough; keep the FixedBitSet.
          return set;
        }
      }
    }
{code}

The sufficientlySmallerThanBitSet method currently checks for 
upperBound / 7 > numValues.
That used to be a division by 6; I added 1 because the index was added.
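
To make the threshold concrete, here is a minimal sketch of that comparison on its own. It assumes the check is nothing more than the integer comparison stated above; the class name, method name and DIVISOR constant are hypothetical, and the method in the patch may well contain further conditions.

```java
/** Hypothetical stand-alone illustration; this is not the code from the patch. */
public class EliasFanoSizeCheckSketch {
  /** Assumed divisor: 6 before the index was added, plus 1 for the index. */
  static final long DIVISOR = 7;

  /** Returns true when numValues is below upperBound / DIVISOR (integer division). */
  public static boolean sufficientlySmaller(long numValues, long upperBound) {
    return upperBound / DIVISOR > numValues;
  }

  public static void main(String[] args) {
    long upperBound = 1_000_000L; // 1_000_000 / 7 == 142857 with integer division
    System.out.println(sufficientlySmaller(100_000L, upperBound)); // prints "true"
    System.out.println(sufficientlySmaller(200_000L, upperBound)); // prints "false"
  }
}
```

Under this assumption, a set with upper bound 1,000,000 would be copied to an EliasFanoDocIdSet only while it holds fewer than 142,857 values; denser sets would stay as a FixedBitSet.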

Anyway, "advisable" will depend on better benchmarking than I can do...
                
> EliasFano value index
> ---------------------
>
>                 Key: LUCENE-5109
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5109
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/other
>            Reporter: Paul Elschot
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-5109.patch, LUCENE-5109.patch, LUCENE-5109.patch
>
>
> Index upper bits of Elias-Fano sequence.
