[
https://issues.apache.org/jira/browse/SOLR-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Rowe reopened SOLR-4891:
------------------------------
At Hoss's suggestion on #solr IRC last night, I tested whether {{JsonLoader}}
behavior has changed around {{BigInteger}} and {{BigDecimal}} values as a
result of the changes committed under this issue.
I'm reopening to address an issue with adding JSON {{BIGNUMBER}}-s (returned by
the Noggit parser when a number won't fit in either a long or a double) to trie
integer or long fields: a {{NumberFormatException}} is no longer triggered, and
the values are silently corrupted.
Before committing the patch on this issue, {{BigInteger}}-typed values were not
created for {{BIGNUMBER}}-s in {{SolrInputDocument}}; instead, they (along with
every other JSON value) were converted to {{String}}-s, and then adding such a
value to an integer or long field would cause a {{NumberFormatException}} to be
thrown from {{Integer.parseInt()}} or {{Long.parseLong()}}. This was proper
and good.
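A minimal sketch of that earlier failure mode, using only the JDK: the class and helper names below are hypothetical and are not Solr code; the helper merely stands in for handing a {{String}}-typed value to a long field.

```java
public class StringTypedBigNumber {
    // Hypothetical stand-in for adding a String-typed value to a long field:
    // reports whether Long.parseLong() accepts the text.
    static boolean parsesAsLong(String value) {
        try {
            Long.parseLong(value);
            return true;
        } catch (NumberFormatException e) {
            // Out-of-range input is rejected loudly instead of being stored.
            return false;
        }
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE is 9223372036854775807; one more does not fit in 64 bits.
        System.out.println(parsesAsLong("9223372036854775807")); // true
        System.out.println(parsesAsLong("9223372036854775808")); // false
    }
}
```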
But now, {{BigInteger}}-typed values are converted (in
{{TrieField.createField()}}) to int/long using {{BigInteger}}'s {{intValue()}}
and {{longValue()}} methods, which return only the low-order 32 and 64 bits,
respectively. These values are always corrupted: the truncated high-order bits
are guaranteed to be non-zero, since {{BigInteger}} typing only happens when
values won't fit into 64 bits.
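The truncation can be reproduced with the JDK alone. This is only an illustrative sketch: the class and helper names are hypothetical, and the helpers simply wrap the two {{BigInteger}} methods named above rather than reproducing {{TrieField}} code.

```java
import java.math.BigInteger;

public class BigIntegerTruncation {
    // Thin wrappers around the narrowing conversions discussed above.
    static long lowOrder64(BigInteger v) { return v.longValue(); }
    static int lowOrder32(BigInteger v) { return v.intValue(); }

    public static void main(String[] args) {
        // 2^63: the smallest positive integer that no longer fits in a long.
        BigInteger tooBig = BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE);

        System.out.println(tooBig);             // 9223372036854775808
        System.out.println(lowOrder64(tooBig)); // -9223372036854775808
        System.out.println(lowOrder32(tooBig)); // 0
    }
}
```

No exception is raised in either case; the positive input comes back as {{Long.MIN_VALUE}} or as zero.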
Reverting back to {{String}}-typed {{BIGNUMBER}} values fixes the problem.
By contrast, {{BigDecimal}}'s {{doubleValue()}} and {{floatValue()}} methods
discard the low-order digits, resulting in loss of precision rather than
corruption. This is the same behavior as {{Double.parseDouble()}} and
{{Float.parseFloat()}}, so reverting to {{String}}-typing for decimal
{{BIGNUMBER}}-s in addition to integral {{BIGNUMBER}}-s won't be a problem.
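A small sketch of the decimal case, again JDK-only and with hypothetical class and helper names: it compares the {{BigDecimal}} route with the {{String}} route for one sample value that has more digits than a double can hold.

```java
import java.math.BigDecimal;

public class BigDecimalPrecision {
    // Route taken when the value is typed as BigDecimal.
    static double viaBigDecimal(String text) {
        return new BigDecimal(text).doubleValue();
    }

    // Route taken when the value stays a String and is parsed later.
    static double viaParse(String text) {
        return Double.parseDouble(text);
    }

    // True when converting to double changed the numeric value.
    static boolean losesPrecision(String text) {
        BigDecimal exact = new BigDecimal(text);
        BigDecimal roundTripped = new BigDecimal(exact.doubleValue());
        return exact.compareTo(roundTripped) != 0;
    }

    public static void main(String[] args) {
        // Sample value chosen for illustration: 40 fractional digits.
        String text = "0.1234567890123456789012345678901234567890";

        System.out.println(viaBigDecimal(text) == viaParse(text)); // both routes agree
        System.out.println(losesPrecision(text));                  // digits were dropped
    }
}
```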
Patch forthcoming.
> JsonLoader should preserve field value types from the JSON content stream
> -------------------------------------------------------------------------
>
> Key: SOLR-4891
> URL: https://issues.apache.org/jira/browse/SOLR-4891
> Project: Solr
> Issue Type: Bug
> Components: update
> Reporter: Steve Rowe
> Assignee: Steve Rowe
> Priority: Minor
> Fix For: 4.4
>
> Attachments: SOLR-4891.patch
>
>
> JSON content streams carry some basic type information for their field
> values, as parsed by Noggit: LONG, NUMBER, BIGNUMBER, and BOOLEAN.
> {{JsonLoader}} should set field value object types in the
> {{SolrInputDocument}} according to the content stream's data types.
> Currently {{JsonLoader}} converts all non-{{String}}-typed field values to
> {{String}}-s.
> There is a comment in {{JsonLoader.parseSingleFieldValue()}}, where the
> convert-everything-to-string logic happens, that says "for legacy reasons,
> single values s are expected to be strings", but other content streams' type
> information is not flattened like this, e.g. {{JavabinLoader}}.