[ 
https://issues.apache.org/jira/browse/SOLR-13580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-13580.
-----------------------------
    Resolution: Not A Bug

> Java 13 changes to locale-specific numeric parsing rules affect ParseNumeric 
> UpdateProcessors when using 'locale' config option - notably affects French
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13580
>                 URL: https://issues.apache.org/jira/browse/SOLR-13580
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>            Priority: Major
>              Labels: Java13
>         Attachments: SOLR-13580.patch
>
>
> Per [JDK-8221432|https://bugs.openjdk.java.net/browse/JDK-8221432], Java 13 has 
> updated to [CLDR 35.1|http://cldr.unicode.org/] – which controls the 
> definition of language- and locale-specific formatting characters – in a 
> non-backwards-compatible way, due to the "French" changes in [CLDR 
> 34|http://cldr.unicode.org/index/downloads/cldr-34#TOC-Detailed-Data-Changes].
> This impacts people who use any of the "ParseNumeric" UpdateProcessors in 
> conjunction with the "locale=fr" or "locale=fr_FR" init param and expect the 
> existing (pre-Java 13) behavior of treating U+00A0 (NO-BREAK SPACE) as a 
> "grouping" character (i.e., between thousands and millions, between millions 
> and billions, etc.). Starting with Java 13, the JVM expects U+202F (NARROW 
> NO-BREAK SPACE) in its place.
> Notably: upgrading to jdk13-ea+26 caused failures in Solr's 
> ParsingFieldUpdateProcessorsTest, which initially had hardcoded test data 
> that used U+00A0. ParsingFieldUpdateProcessorsTest has since been updated to 
> account for this discrepancy by determining the "expected" grouping character 
> for the current JVM and building the test data from it, but there is nothing 
> Solr or the ParseNumeric UpdateProcessors can do to help mitigate this change 
> in behavior for end users who upgrade to Java 13.
> Affected users with U+00A0 characters in their incoming SolrInputDocuments 
> will see the ParseNumeric UpdateProcessors (configured with locale=fr...) 
> "skip" these values as unparsable, most likely resulting in a failure to 
> index into a numeric field since the original "String" value will be left as 
> is.
> Affected users may want to consider updating their configs to include a 
> {{RegexReplaceProcessorFactory}} configured to strip out all whitespace 
> characters, placed before any ParseNumeric update processors that are 
> configured to expect French-language numbers.
>   
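
The behavior described above can be reproduced outside Solr with plain JDK classes. The following is a minimal sketch, not Solr code: the class FrenchNumberSketch and its two helper methods are hypothetical names introduced only for illustration, and the "whole input must be consumed" check is an assumption meant to approximate how a strict numeric parser would reject a value. It asks the running JVM which grouping separator it expects for French, then shows that stripping whitespace-like characters (including U+00A0 and U+202F) before parsing gives the same result on either side of the Java 13 change.

```java
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper class for illustration only; it is not part of Solr.
class FrenchNumberSketch {

    // Strict parse: returns null unless the entire input was consumed.
    // (Assumption: this approximates a parser that "skips" partially
    // parsable values and leaves the original String in place.)
    static Number parseStrict(String value, Locale locale) {
        NumberFormat format = NumberFormat.getInstance(locale);
        ParsePosition pos = new ParsePosition(0);
        Number parsed = format.parse(value, pos);
        return (parsed != null && pos.getIndex() == value.length()) ? parsed : null;
    }

    // Workaround: remove ASCII whitespace plus U+00A0 and U+202F,
    // then parse strictly. No grouping separator remains to disagree on.
    static Number parseAfterStripping(String value, Locale locale) {
        String stripped = value.replaceAll("[\\s\\u00A0\\u202F]+", "");
        return parseStrict(stripped, locale);
    }

    public static void main(String[] args) {
        Locale locale = Locale.FRANCE;

        // Ask the current JVM which grouping separator it expects for French.
        char grouping = DecimalFormatSymbols.getInstance(locale).getGroupingSeparator();
        System.out.printf("JVM %s expects grouping separator U+%04X%n",
                System.getProperty("java.version"), (int) grouping);

        String withNbsp = "1\u00A0234\u00A0567";       // NO-BREAK SPACE
        String withNarrowNbsp = "1\u202F234\u202F567"; // NARROW NO-BREAK SPACE

        // Which of these two strict parses yields null depends on the JVM.
        System.out.println("strict,   U+00A0: " + parseStrict(withNbsp, locale));
        System.out.println("strict,   U+202F: " + parseStrict(withNarrowNbsp, locale));

        // After stripping, both inputs parse regardless of the JVM's CLDR data.
        System.out.println("stripped, U+00A0: " + parseAfterStripping(withNbsp, locale));
        System.out.println("stripped, U+202F: " + parseAfterStripping(withNarrowNbsp, locale));
    }
}
```

The explicit character class is used because Java's default \s does not match U+00A0 or U+202F. Note that stripping whitespace this aggressively would also join digit groups that were separate tokens (for example "12 34" becomes "1234"), so it is only appropriate for fields known to hold a single number.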



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
