[ 
https://issues.apache.org/jira/browse/LUCENE-7457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-7457:
---------------------------------
    Attachment: LUCENE-7457.patch

Here is a patch implementing what Mike describes above as the bare minimum. I'm 
not sure it is worth spending too much time on this, since we will probably want 
to build a new DV format that takes better advantage of the iterator-style API 
before 7.0 is released?

> Default doc values format should optimize for iterator access
> -------------------------------------------------------------
>
>                 Key: LUCENE-7457
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7457
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Adrien Grand
>            Priority: Blocker
>             Fix For: master (7.0)
>
>         Attachments: LUCENE-7457.patch
>
>
> In LUCENE-7407 we switched doc values consumption from a random access API to 
> an iterator API, but nothing was done there to improve the codec.  We should 
> do that here.
> At a bare minimum we should fix the existing very-sparse case to be a true 
> iterator, and not wrapped with the silly legacy wrappers.
> I think we should also increase the threshold (currently 1%?) at which we 
> switch from dense to sparse encoding.  This should fix LUCENE-7253, making 
> merging of sparse doc values efficient ("pay for what you use").
> I'm sure there are many other things to explore to let codecs "take 
> advantage" of the fact that they no longer need to offer random access to doc 
> values.
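
To illustrate what a "true iterator" over a very sparse field could look like, here is a minimal, self-contained sketch. It is not the code from the attached patch and does not use Lucene classes: SparseDocValuesSketch and SparseNumericIterator are hypothetical names, and the method names (nextDoc, advance, docID, longValue) and the NO_MORE_DOCS sentinel only mirror the conventions of Lucene's iterator API. The assumption is that only documents that have a value are stored, as a sorted array of doc IDs with a parallel array of values.

```java
import java.util.Arrays;

public class SparseDocValuesSketch {
    /** Sentinel returned once the iterator is exhausted (same convention as Lucene's iterators). */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical sparse iterator: stores only the documents that actually have a value. */
    static final class SparseNumericIterator {
        private final int[] docs;    // doc IDs with a value, sorted ascending
        private final long[] values; // values[i] belongs to docs[i]
        private int index = -1;      // -1 means "not positioned yet"

        SparseNumericIterator(int[] docs, long[] values) {
            if (docs.length != values.length) {
                throw new IllegalArgumentException("docs and values must have the same length");
            }
            this.docs = docs;
            this.values = values;
        }

        /** Current doc ID, -1 before the first call, NO_MORE_DOCS once exhausted. */
        int docID() {
            if (index < 0) {
                return -1;
            }
            return index < docs.length ? docs[index] : NO_MORE_DOCS;
        }

        /** Moves to the next document that has a value. */
        int nextDoc() {
            if (index < docs.length) {
                index++;
            }
            return docID();
        }

        /** Moves to the first document after the current one whose ID is >= target. */
        int advance(int target) {
            int from = Math.min(index + 1, docs.length);
            int pos = Arrays.binarySearch(docs, from, docs.length, target);
            index = pos >= 0 ? pos : -pos - 1; // insertion point when target is absent
            return docID();
        }

        /** Value of the current document; only valid while positioned on a document. */
        long longValue() {
            return values[index];
        }
    }

    /** Sums all values; the cost depends on the number of docs with a value, not on maxDoc. */
    static long sum(SparseNumericIterator it) {
        long total = 0;
        for (int doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
            total += it.longValue();
        }
        return total;
    }

    public static void main(String[] args) {
        SparseNumericIterator it =
            new SparseNumericIterator(new int[] {3, 10, 42}, new long[] {7L, 8L, 9L});
        System.out.println(sum(it));
    }
}
```

A consumer such as a merge only visits the three documents that have a value, however large the segment is, which is the "pay for what you use" behaviour the description asks for.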



