Michael McCandless created LUCENE-7457:

             Summary: Default doc values format should optimize for iterator 
                 Key: LUCENE-7457
                 URL: https://issues.apache.org/jira/browse/LUCENE-7457
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Michael McCandless
            Priority: Blocker
             Fix For: master (7.0)

In LUCENE-7407 we switched doc values consumption from a random-access API to 
an iterator API, but nothing was done there to improve the codec.  We should do 
that here.

At a bare minimum we should fix the existing very-sparse case to be a true 
iterator, and not wrapped with the silly legacy wrappers.
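
To make "a true iterator" concrete, here is a minimal sketch of a sparse doc 
values iterator that only ever visits documents that have a value. The class 
SparseLongIterator and its in-memory array layout are hypothetical and are not 
Lucene's codec classes; only the method names (docID, nextDoc, advance, 
longValue, NO_MORE_DOCS) are modeled loosely on Lucene's iterator-style API.

```java
// Hypothetical illustration only: not Lucene's actual codec implementation.
// Holds just the documents that have a value, so iteration cost is
// proportional to the number of values, not to the number of documents.
final class SparseLongIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docIds;   // strictly increasing doc IDs that have a value
    private final long[] values;  // values[i] is the value for docIds[i]
    private int index = -1;       // -1 means iteration has not started yet

    SparseLongIterator(int[] docIds, long[] values) {
        if (docIds.length != values.length) {
            throw new IllegalArgumentException("docIds and values must have equal length");
        }
        this.docIds = docIds;
        this.values = values;
    }

    /** Current doc ID, -1 before the first call, NO_MORE_DOCS when exhausted. */
    int docID() {
        if (index < 0) {
            return -1;
        }
        return index < docIds.length ? docIds[index] : NO_MORE_DOCS;
    }

    /** Moves to the next document that has a value. */
    int nextDoc() {
        if (index < docIds.length) {
            index++;
        }
        return docID();
    }

    /** Moves to the first document after the current one whose ID is >= target. */
    int advance(int target) {
        int lo = Math.min(index + 1, docIds.length);
        int hi = docIds.length;
        // Binary search over the remaining doc IDs.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (docIds[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        index = lo;
        return docID();
    }

    /** Value of the current document; only valid while positioned on a document. */
    long longValue() {
        return values[index];
    }
}
```

A consumer would loop with nextDoc() until it returns NO_MORE_DOCS, reading 
longValue() at each stop, and never touch documents that have no value.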

I think we should also increase the threshold (currently 1%?) at which we 
switch from dense to sparse encoding.  This should fix LUCENE-7253, making 
merging of sparse doc values efficient ("pay for what you use").
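
As a rough sketch of what the threshold controls, the encoding choice can be 
thought of as a density check at write time. The class 
DocValuesEncodingChooser, its enum, and the sparseThreshold parameter are all 
hypothetical names for illustration; the 1% figure is only the tentative value 
mentioned above, so the threshold is passed in rather than hard-coded.

```java
// Hypothetical illustration only: not how any Lucene doc values format is written.
final class DocValuesEncodingChooser {
    enum Encoding { DENSE, SPARSE }

    /**
     * Picks an encoding from the fraction of documents that have a value.
     *
     * @param docsWithValue   number of documents in the segment that have a value
     * @param maxDoc          total number of documents in the segment
     * @param sparseThreshold density below which the sparse encoding is used
     *                        (assumed parameter, e.g. 0.01 for 1%)
     */
    static Encoding choose(int docsWithValue, int maxDoc, double sparseThreshold) {
        if (maxDoc <= 0 || docsWithValue < 0 || docsWithValue > maxDoc) {
            throw new IllegalArgumentException("invalid document counts");
        }
        double density = (double) docsWithValue / maxDoc;
        return density < sparseThreshold ? Encoding.SPARSE : Encoding.DENSE;
    }
}
```

With a threshold of 0.01, a field present in 50 of 1000 documents would still 
be written densely; raising the threshold to 0.10 would move it to the sparse 
encoding, which is the kind of shift proposed here.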

I'm sure there are many other things to explore to let codecs "take advantage" 
of the fact that they no longer need to offer random access to doc values.
