[ https://issues.apache.org/jira/browse/LUCENE-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549095#comment-14549095 ]
Marius Grama commented on LUCENE-6486:
--------------------------------------
bq. Also, I'm nervous about pretending we saw a 0-byte payload when the payload
was in fact missing, and also the empty set when no contexts were specified: we
lose information by doing this, e.g. we can no longer distinguish if the
document did in fact have a 0-byte payload.
NOTE: As in the FileDictionary class, when the dictionary is configured with
hasPayloads set to true, the payload will be an empty value instead of null
whenever the payload field in the document is null.
[~mikemccand] I would gladly change the behaviour of returning an empty
BytesRef instead of null, but I find this task rather difficult because the
logic of several classes depends on the InputIterator#hasPayloads() and
InputIterator#hasContexts() methods (SortedInputIterator,
BufferedInputIterator and AnalyzingSuggester, to name a few).
I guess that one solution would be to remove the InputIterator methods
mentioned above, but this would mean rewriting all the classes where these
methods are used and also finding other ways to encode null values for
payloads.
e.g.:
SortedInputIterator (lines 88-90)
{code:language=java}
if (hasPayloads) {
  payload = decodePayload(bytes, input);
}
{code}
SortedInputIterator (lines 240-243)
{code:language=java}
if (hasPayloads) {
  output.writeBytes(payload.bytes, payload.offset, payload.length);
  output.writeShort((short) payload.length);
}
{code}
Any ideas on how to encode null values for the non-existent payload fields, so
that when reading from persistence we can tell whether we're dealing with null
rather than with an empty BytesRef instance? I am thinking of writing -1 for
the length of the payload, but I guess there are better solutions at hand
than this one.
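To make the -1 idea concrete, here is a minimal sketch using plain JDK classes rather than Lucene's own data output classes. The class name NullablePayloadCodec, its methods and the NULL_LENGTH constant are hypothetical and are not part of Lucene or of the attached patch. It assumes the same layout as the snippet above: payload bytes followed by a trailing 2-byte length.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: illustrates the "-1 length" sentinel.
final class NullablePayloadCodec {
    // Assumption: real payload lengths are never negative, so -1 can mean "no payload".
    static final short NULL_LENGTH = -1;

    /** Returns key bytes + payload bytes (if any) + a trailing 2-byte payload length. */
    static byte[] encode(byte[] key, byte[] payload) {
        int payloadLength = (payload == null) ? 0 : payload.length;
        if (payloadLength > Short.MAX_VALUE) {
            throw new IllegalArgumentException("payload too long for a 2-byte length");
        }
        ByteBuffer buf = ByteBuffer.allocate(key.length + payloadLength + Short.BYTES);
        buf.put(key);
        if (payload != null) {
            buf.put(payload);
        }
        // A null payload is recorded as -1, an empty payload as 0.
        buf.putShort(payload == null ? NULL_LENGTH : (short) payloadLength);
        return buf.array();
    }

    /** Returns null for a missing payload, otherwise the (possibly empty) payload bytes. */
    static byte[] decodePayload(byte[] record) {
        // The length is stored in the last two bytes of the record.
        short length = ByteBuffer.wrap(record).getShort(record.length - Short.BYTES);
        if (length == NULL_LENGTH) {
            return null;
        }
        int start = record.length - Short.BYTES - length;
        return Arrays.copyOfRange(record, start, start + length);
    }
}
```

With this layout a null payload and an empty payload produce records of the same size that differ only in the trailing length (-1 versus 0), so the reader can tell them apart. The sketch does not cover contexts or weights, and any code that subtracts the stored length from the record size would need to check for the sentinel first.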
> DocumentDictionary entry iterator skips items with optional null payload field
> ------------------------------------------------------------------------------
>
> Key: LUCENE-6486
> URL: https://issues.apache.org/jira/browse/LUCENE-6486
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 4.10.3
> Reporter: Marius Grama
> Fix For: Trunk, 5.2
>
> Attachments: LUCENE-6486.patch, LUCENE-6486.patch, LUCENE-6486.patch
>
>
> As denoted in the ticket SOLR-7086 the DocumentDictionary entry iterator
> shouldn't skip entries from the dictionary having null value for the payload
> field due to the fact that this field is optional.
> This behaviour causes inconsistencies in the Solr suggester which simply
> skips valid documents due to the fact that they don't have values for the
> payload field.
> As agreed with [~mikemccand] I am attaching a patch to this Lucene issue.