[ https://issues.apache.org/jira/browse/LUCENE-5542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-5542.
----------------------------------------
    Resolution: Duplicate

Duplicate of LUCENE-7407.  We now pass a {{DocValuesProducer}} to all of the 
{{addXYZField}} methods when writing doc values.
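
To illustrate the difference discussed in this issue, here is a minimal, self-contained sketch that contrasts a dense contract (one entry per document, {{null}} for "no value") with a sparse, iterator-style contract that visits only the documents that have a value. It is not Lucene code: the class {{SparseDocValuesSketch}}, the nested {{SparseNumericValues}} type, and the {{sumDense}}/{{sumSparse}} helpers are hypothetical names made up for this example, and the real consumer and producer classes have a much larger surface.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only; every name in this file is hypothetical. */
public class SparseDocValuesSketch {

    /** Sentinel returned once the iterator is exhausted. */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Sparse view: holds only the documents that actually have a value. */
    static final class SparseNumericValues {
        private final int[] docs;    // ascending doc IDs that have a value
        private final long[] values; // values[i] belongs to docs[i]
        private int idx = -1;

        SparseNumericValues(int[] docs, long[] values) {
            if (docs.length != values.length) {
                throw new IllegalArgumentException("docs and values must have the same length");
            }
            this.docs = docs;
            this.values = values;
        }

        /** Advances to the next document with a value, or returns NO_MORE_DOCS. */
        int nextDoc() {
            idx++;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        /** Value of the current document; only valid after a successful nextDoc(). */
        long longValue() {
            return values[idx];
        }
    }

    /** Dense contract: the consumer sees every document and must skip the nulls. */
    static long sumDense(List<Long> perDocValues) {
        long sum = 0;
        for (Long v : perDocValues) {
            if (v != null) {
                sum += v;
            }
        }
        return sum;
    }

    /** Sparse contract: the consumer sees only <doc, value> pairs. */
    static long sumSparse(SparseNumericValues it) {
        long sum = 0;
        for (int doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
            sum += it.longValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        // Five documents; only docs 1 and 4 have a value.
        List<Long> dense = Arrays.asList(null, 7L, null, null, 35L);
        SparseNumericValues sparse =
            new SparseNumericValues(new int[] {1, 4}, new long[] {7L, 35L});
        System.out.println("dense sum  = " + sumDense(dense));
        System.out.println("sparse sum = " + sumSparse(sparse));
    }
}
```

With the sparse form the caller never has to "fill the gaps"; whether missing documents are encoded as 0s or skipped entirely is left to whoever consumes the iterator.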

> Explore making DVConsumer sparse-aware
> --------------------------------------
>
>                 Key: LUCENE-5542
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5542
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>            Reporter: Shai Erera
>
> Today DVConsumer API requires the caller to pass a value for every document, 
> where {{null}} means "this doc has no value". The Codec can then choose how 
> to encode the values, i.e. whether it encodes a 0 for a numeric field, or 
> encodes the sparse docs. In practice, from what I see, we choose to encode 
> the 0s.
> I wonder if we e.g. added an {{Iterable<Number>}} to 
> DVConsumer.addXYZField(), if that would make a better API. The caller only 
> passes <doc,value> pairs and it's up to the Codec to decide how it wants to 
> encode the missing values. Like, if a user's app truly has a sparse NDV, 
> IndexWriter doesn't need to "fill the gaps" artificially. It's the job of the 
> Codec.
> To be clear, I don't propose to change any Codec implementation in this issue 
> (w.r.t. sparse encoding - yes/no), only change the API to reflect that 
> sparseness. I think that if we'll ever want to encode sparse values, it will 
> be a more convenient API.
> Thoughts? I volunteer to do this work, but want to get others' opinion before 
> I start.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

