[ 
https://issues.apache.org/jira/browse/SOLR-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe reopened SOLR-4891:
------------------------------


At Hoss's suggestion on #solr IRC last night, I tested whether {{JsonLoader}} 
behavior has changed around {{BigInteger}} and {{BigDecimal}} values as a 
result of the changes committed under this issue.

I'm reopening to address an issue with adding JSON {{BIGNUMBER}}-s (returned by 
the Noggit parser when a number won't fit in either a long or a double) to trie 
integer or long fields: a {{NumberFormatException}} is no longer triggered, and 
the values are silently corrupted.

Before committing the patch on this issue, {{BigInteger}}-typed values were not 
created for {{BIGNUMBER}}-s in {{SolrInputDocument}}; instead, they (along with 
every other JSON value) were converted to {{String}}-s, and then adding such a 
value to an integer or long field would cause a {{NumberFormatException}} to be 
thrown from {{Integer.parseInt()}} or {{Long.parseLong()}}.  This was proper 
and good.
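
The old failure mode can be reproduced outside Solr. This is a minimal sketch, not Solr code: the class and helper names are hypothetical, and the only assumption is that the out-of-range value reaches {{Long.parseLong()}} as a {{String}}, as described above.

```java
class BigNumberAsString {
    // Hypothetical helper: returns true if parsing the text as a long
    // throws NumberFormatException.
    static boolean failsToParseAsLong(String text) {
        try {
            Long.parseLong(text);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE + 1: one past the largest value a long can hold.
        System.out.println(failsToParseAsLong("9223372036854775808")); // prints true
        // Long.MAX_VALUE itself parses without error.
        System.out.println(failsToParseAsLong("9223372036854775807")); // prints false
    }
}
```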

But now, {{BigInteger}}-typed values are converted (in 
{{TrieField.createField()}}) to int/long using {{BigInteger}}'s {{intValue()}} 
and {{longValue()}} methods, which return only the low-order 32 and 64 bits, 
respectively.  These values are always corrupted: the truncated high-order bits 
are guaranteed to be non-zero, since {{BigInteger}} typing only happens when 
values won't fit into 64 bits.
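
The narrowing behavior can be seen with plain JDK calls. This is a sketch only: the class name and the {{fitsInLong()}} guard are hypothetical and are not part of any patch on this issue; the guard relies on {{BigInteger.bitLength()}} excluding the sign bit.

```java
import java.math.BigInteger;

class BigIntegerTruncation {
    // Hypothetical guard: a value fits in a signed 64-bit long exactly when
    // its minimal two's-complement form needs at most 63 bits plus a sign bit.
    static boolean fitsInLong(BigInteger value) {
        return value.bitLength() <= 63;
    }

    public static void main(String[] args) {
        // 2^64 + 5: needs more than 64 bits.
        BigInteger big = BigInteger.ONE.shiftLeft(64).add(BigInteger.valueOf(5));
        System.out.println(big.longValue()); // prints 5 (only the low-order 64 bits survive)
        System.out.println(big.intValue());  // prints 5 (only the low-order 32 bits survive)

        // Long.MAX_VALUE + 1 wraps around to Long.MIN_VALUE.
        BigInteger maxPlusOne = BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE);
        System.out.println(maxPlusOne.longValue() == Long.MIN_VALUE); // prints true

        System.out.println(fitsInLong(big));                             // prints false
        System.out.println(fitsInLong(BigInteger.valueOf(Long.MAX_VALUE))); // prints true
    }
}
```

No exception is thrown in either narrowing call, which is why the corruption is silent.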

Reverting to {{String}}-typed {{BIGNUMBER}} values fixes the problem.

By contrast, {{BigDecimal}}'s {{doubleValue()}} and {{floatValue()}} methods 
truncate the low-order bits, resulting in loss of precision rather than 
corruption.  This is the same behavior used by {{Double.parseDouble()}} and 
{{Float.parseFloat()}}.  Reverting back to {{String}}-typing for decimal 
{{BIGNUMBER}}-s in addition to integral {{BIGNUMBER}}-s won't be a problem.
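
A small illustration of the decimal case, again using only JDK calls. The class and method names are hypothetical, and the sample literal is just a value with more digits than a double can carry; whether Noggit would report it as a {{BIGNUMBER}} is an assumption.

```java
import java.math.BigDecimal;

class BigDecimalPrecision {
    // Path taken when the value is kept as a BigDecimal.
    static double viaBigDecimal(String text) {
        return new BigDecimal(text).doubleValue();
    }

    // Path taken when the value is kept as a String.
    static double viaParse(String text) {
        return Double.parseDouble(text);
    }

    public static void main(String[] args) {
        String text = "1.00000000000000000000000001";

        // The exact decimal value is not equal to 1.
        System.out.println(new BigDecimal(text).compareTo(BigDecimal.ONE) != 0); // prints true

        // Both conversions drop the trailing digits and land on the nearest double.
        System.out.println(viaBigDecimal(text)); // prints 1.0
        System.out.println(viaParse(text));      // prints 1.0
    }
}
```

Either way the result is a close approximation of the input rather than an unrelated number.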

Patch forthcoming.
                
> JsonLoader should preserve field value types from the JSON content stream
> -------------------------------------------------------------------------
>
>                 Key: SOLR-4891
>                 URL: https://issues.apache.org/jira/browse/SOLR-4891
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Minor
>             Fix For: 4.4
>
>         Attachments: SOLR-4891.patch
>
>
> JSON content streams carry some basic type information for their field 
> values, as parsed by Noggit: LONG, NUMBER, BIGNUMBER, and BOOLEAN.  
> {{JsonLoader}} should set field value object types in the 
> {{SolrInputDocument}} according to the content stream's data types. 
> Currently {{JsonLoader}} converts all non-{{String}}-typed field values to 
> {{String}}-s.
> There is a comment in {{JsonLoader.parseSingleFieldValue()}}, where the 
> convert-everything-to-string logic happens, that says "for legacy reasons, 
> single values s are expected to be strings", but other content streams' type 
> information is not flattened like this, e.g. {{JavabinLoader}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
