[jira] [Commented] (LUCENE-6486) DocumentDictionary entry iterator skips items with optional null payload field

Marius Grama (JIRA) Mon, 18 May 2015 13:02:46 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549095#comment-14549095
 ]


Marius Grama commented on LUCENE-6486:
--------------------------------------

bq. Also, I'm nervous about pretending we saw a 0-byte payload when the payload 
was in fact missing, and also the empty set when no contexts were specified: we 
lose information by doing this, e.g. we can no longer distinguish if the 
document did in fact have a 0-byte payload.

NOTE: As in the class FileDictionary, when the dictionary is configured with 
hasPayloads set to true, a document whose payload field is null yields an 
empty payload value instead of null.

[~mikemccand] I would gladly change the current behaviour, which returns an 
empty BytesRef instead of null, but I find this rather difficult because 
several classes base their logic on the InputIterator#hasPayloads() and 
InputIterator#hasContexts() methods (SortedInputIterator, 
BufferedInputIterator and AnalyzingSuggester, to name a few).
I guess one solution would be to remove those two InputIterator methods, but 
that would mean rewriting every class that uses them and also finding another 
way to encode null values for payloads.
For example:
SortedInputIterator (lines 88-90)
{code:language=java}
        if (hasPayloads) {
          payload = decodePayload(bytes, input);
        }
{code}

SortedInputIterator (lines 240-243)
{code:language=java}
    if (hasPayloads) {
      output.writeBytes(payload.bytes, payload.offset, payload.length);
      output.writeShort((short) payload.length);
    }
{code}


Any ideas on how to encode null values for missing payload fields so that, 
when reading back from persistence, we can tell a null payload apart from an 
empty BytesRef instance? I am thinking of writing -1 for the length of the 
payload, but I guess there are better solutions than this one.
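
A minimal sketch of the -1 idea, for discussion only. It assumes the same 
layout as the SortedInputIterator snippet above (payload bytes followed by a 
2-byte length) and uses plain byte[] instead of BytesRef so that it runs 
without Lucene on the classpath. The class NullablePayloadCodec and the 
constant NULL_LENGTH are hypothetical names, not existing Lucene API. If the 
real reader treats the stored short as unsigned, -1 would collide with a 
65535-byte payload; the sketch sidesteps that by rejecting payloads longer 
than Short.MAX_VALUE.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical helper (not part of Lucene): marks a null payload with a -1 length. */
final class NullablePayloadCodec {

  /** Sentinel stored in the trailing length field when the payload is null. */
  static final short NULL_LENGTH = -1;

  private NullablePayloadCodec() {}

  /** Encodes as [payload bytes][2-byte length]; null becomes length -1 with no bytes. */
  public static byte[] encode(byte[] payload) {
    if (payload == null) {
      return ByteBuffer.allocate(2).putShort(NULL_LENGTH).array();
    }
    if (payload.length > Short.MAX_VALUE) {
      // Keeps the sentinel unambiguous: a real length is never negative as a short.
      throw new IllegalArgumentException("payload too long: " + payload.length);
    }
    return ByteBuffer.allocate(payload.length + 2)
        .put(payload)
        .putShort((short) payload.length)
        .array();
  }

  /** Reads the trailing length first, then returns null, an empty array, or the payload. */
  public static byte[] decode(byte[] encoded) {
    int lengthPos = encoded.length - 2;
    short length = ByteBuffer.wrap(encoded).getShort(lengthPos);
    if (length == NULL_LENGTH) {
      return null; // the payload was missing in the document
    }
    return Arrays.copyOfRange(encoded, lengthPos - length, lengthPos);
  }
}
```

With this encoding, decode(encode(null)) gives back null while 
decode(encode(new byte[0])) gives back an empty array, so the two cases stay 
distinguishable after a round trip.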




> DocumentDictionary entry iterator skips items with optional null payload field
> ------------------------------------------------------------------------------
>
>                 Key: LUCENE-6486
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6486
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 4.10.3
>            Reporter: Marius Grama
>             Fix For: Trunk, 5.2
>
>         Attachments: LUCENE-6486.patch, LUCENE-6486.patch, LUCENE-6486.patch
>
>
> As denoted in the ticket SOLR-7086 the DocumentDictionary entry iterator 
> shouldn't skip entries from the dictionary having null value for the payload 
> field due to the fact that this field is optional.
> This behaviour causes inconsistencies in the Solr suggester which simply 
> skips valid documents due to the fact that they don't have values for the 
> payload field.
> As agreed with [~mikemccand] I am attaching a patch to this Lucene issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
