Adrien Grand updated LUCENE-7489:
    Attachment: LUCENE-7489.patch

Here is a new patch. I think it's ready.

Compared to the current Lucene54DocValuesFormat, the only missing feature is 
table compression for the multi-valued case. I did not implement it because I 
think we may want to spend our complexity budget on other types of compression 
now that we have an iterator API. If we decide otherwise, nothing prevents us 
from adding it back in the future.
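
For readers unfamiliar with the term, here is a minimal sketch of the idea 
behind table compression for multi-valued fields: when documents share few 
distinct sets of values, each distinct set is stored once in a lookup table and 
every document only stores a small set id. The class and method names below 
(SetTableSketch, values, bitsPerDoc) are hypothetical and chosen for 
illustration. This is not the Lucene54DocValuesFormat code, and it ignores the 
on-disk layout entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of set-table compression; not Lucene code. */
final class SetTableSketch {
  /** Unique (sorted) value sets, indexed by set id. */
  final List<long[]> table = new ArrayList<>();
  /** Set id of each document, indexed by doc id. */
  final int[] docToSetId;

  SetTableSketch(long[][] valuesPerDoc) {
    Map<List<Long>, Integer> ids = new HashMap<>();
    docToSetId = new int[valuesPerDoc.length];
    for (int doc = 0; doc < valuesPerDoc.length; doc++) {
      // Sort a copy so that {1, 2} and {2, 1} map to the same table entry.
      long[] sorted = valuesPerDoc[doc].clone();
      Arrays.sort(sorted);
      List<Long> key = new ArrayList<>(sorted.length);
      for (long v : sorted) {
        key.add(v);
      }
      Integer id = ids.get(key);
      if (id == null) {
        // First time this set of values is seen: append it to the table.
        id = table.size();
        ids.put(key, id);
        table.add(sorted);
      }
      docToSetId[doc] = id;
    }
  }

  /** Values of a document, resolved through the table. */
  long[] values(int doc) {
    return table.get(docToSetId[doc]);
  }

  /** Bits needed per document to address any entry of the table. */
  int bitsPerDoc() {
    int size = table.size();
    return size <= 1 ? 0 : 32 - Integer.numberOfLeadingZeros(size - 1);
  }
}
```

The benefit depends on the number of distinct sets staying small. With many 
distinct sets the table itself dominates, which is one reason this is extra 
complexity that may or may not pay off.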

Other differences: nothing uses the old packedints APIs anymore 
(PackedInts.Reader, MonotonicBlockPackedReader, etc.), the reverse terms index 
is off heap, and the terms dictionary always uses the compressed 
implementation rather than trying to figure out whether storing fixed-length 
terms would be more efficient.

What is still left to do is to see whether we can improve things by making 
the packedints API return iterators when only sequential access is needed. 
That belongs in a separate issue IMO, since this change is already huge.
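
To make the sequential-access idea concrete, here is a rough sketch of a 
forward-only cursor over bit-packed values. Everything in it (PackedSketch, 
pack, Cursor) is a hypothetical name made up for this example and is not part 
of the Lucene PackedInts API. It assumes 1 to 63 bits per value and that every 
value fits in that many bits.

```java
import java.util.NoSuchElementException;

/** Hypothetical illustration of sequential decoding; not the Lucene PackedInts API. */
final class PackedSketch {

  /** Packs values using bitsPerValue bits each (1..63), low bits first. */
  static long[] pack(long[] values, int bitsPerValue) {
    long totalBits = (long) values.length * bitsPerValue;
    long[] blocks = new long[(int) ((totalBits + 63) / 64)];
    long bitPos = 0;
    for (long v : values) {
      int block = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      blocks[block] |= v << shift;
      if (shift + bitsPerValue > 64) {
        // The value straddles two blocks: spill the high bits into the next one.
        blocks[block + 1] |= v >>> (64 - shift);
      }
      bitPos += bitsPerValue;
    }
    return blocks;
  }

  /** Forward-only cursor: decodes one value per call and offers no random access. */
  static final class Cursor {
    private final long[] blocks;
    private final int bitsPerValue;
    private final int count;
    private final long mask;
    private int index = 0;
    private long bitPos = 0;

    Cursor(long[] blocks, int bitsPerValue, int count) {
      this.blocks = blocks;
      this.bitsPerValue = bitsPerValue;
      this.count = count;
      this.mask = (1L << bitsPerValue) - 1;
    }

    boolean hasNext() {
      return index < count;
    }

    long next() {
      if (index >= count) {
        throw new NoSuchElementException();
      }
      int block = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      long v = blocks[block] >>> shift;
      if (shift + bitsPerValue > 64) {
        // Pull in the high bits stored at the start of the next block.
        v |= blocks[block + 1] << (64 - shift);
      }
      index++;
      bitPos += bitsPerValue;
      return v & mask;
    }
  }
}
```

Because the cursor only moves forward, it can keep its position in local state 
instead of recomputing the block and shift from an index on every lookup. A 
real implementation could go further and decode whole blocks at a time.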

> Improve sparsity support of Lucene70DocValuesFormat
> ---------------------------------------------------
>                 Key: LUCENE-7489
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7489
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7489.patch, LUCENE-7489.patch
> Like Lucene70NormsFormat, it should be able to only encode actual values.
