[
https://issues.apache.org/jira/browse/SOLR-13580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man resolved SOLR-13580.
-----------------------------
Resolution: Not A Bug
> java 13 changes to locale specific Numeric parsing rules affect ParseNumeric
> UpdateProcessors when using 'locale' config option - notably affects French
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-13580
> URL: https://issues.apache.org/jira/browse/SOLR-13580
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Hoss Man
> Assignee: Hoss Man
> Priority: Major
> Labels: Java13
> Attachments: SOLR-13580.patch
>
>
> Per [JDK-8221432|https://bugs.openjdk.java.net/browse/JDK-8221432] Java13 has
> updated to [CLDR 35.1|http://cldr.unicode.org/] – which controls the
> definition of language & locale specific formatting characters – in a
> non-backwards compatible way due to "French" changes in [CLDR
> 34|http://cldr.unicode.org/index/downloads/cldr-34#TOC-Detailed-Data-Changes]
> This impacts people who use any of the "ParseNumeric" UpdateProcessors in
> conjunction with the "locale=fr" or "locale=fr_FR" init param and expect the
> (pre java13) existing behavior of treating U+00A0 (NO-BREAK SPACE) as a
> "grouping" character (i.e. between thousands and millions, between millions
> and billions, etc.). Starting with java13 the JVM expects U+202F (NARROW
> NO-BREAK SPACE) in its place.
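
A minimal sketch of how one might check which grouping character the running JVM expects for fr_FR, using only java.text. The class and method names (FrenchGroupingProbe, groupingSeparator, consumedLength) are hypothetical and are not part of Solr. The printed values depend on the JVM's locale data, so no expected output is given.

```java
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper for illustration only; not Solr code.
final class FrenchGroupingProbe {

    /** Grouping separator that the running JVM's locale data defines for the locale. */
    static char groupingSeparator(Locale locale) {
        return DecimalFormatSymbols.getInstance(locale).getGroupingSeparator();
    }

    /** Number of leading characters of text that the locale's number parser consumes. */
    static int consumedLength(String text, Locale locale) {
        ParsePosition pos = new ParsePosition(0);
        NumberFormat.getInstance(locale).parse(text, pos);
        return pos.getIndex();
    }

    public static void main(String[] args) {
        char sep = groupingSeparator(Locale.FRANCE);
        System.out.printf("JVM %s groups fr_FR numbers with U+%04X%n",
                System.getProperty("java.version"), (int) sep);

        // Try both candidate characters and report how much of the input parses.
        for (char candidate : new char[] {'\u00A0', '\u202F'}) {
            String text = "1" + candidate + "234" + candidate + "567";
            System.out.printf("U+%04X: consumed %d of %d chars%n",
                    (int) candidate, consumedLength(text, Locale.FRANCE), text.length());
        }
    }
}
```

Running this on the JVM before and after an upgrade shows whether the grouping character for the configured locale has changed.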
> Notably: upgrading to jdk13-ea+26 caused failures in Solr's
> ParsingFieldUpdateProcessorsTest, which initially had hardcoded test data
> that used U+00A0. ParsingFieldUpdateProcessorsTest has since been updated to
> account for this discrepancy by modifying the test data to use the
> "expected" character for the current JVM, but there is nothing Solr or the
> ParseNumeric UpdateProcessors can do to help mitigate this change in behavior
> for end users who upgrade to java13.
> Affected users with U+00A0 characters in their incoming SolrInputDocuments
> will see the ParseNumeric UpdateProcessors (configured with locale=fr...)
> "skip" these values as unparsable, most likely resulting in a failure to
> index into a numeric field since the original "String" value will be left as
> is.
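
The "skip" behavior described above can be approximated as follows. This is a sketch under the assumption that a value is accepted only when the locale's parser consumes the entire string; it is not Solr's actual implementation, and the names ParseOrKeep and parseOrKeep are hypothetical.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical illustration of "parse fully or leave the String as is"; not Solr code.
final class ParseOrKeep {

    /**
     * Returns a Number only when the whole string parses under the locale;
     * otherwise returns the original String unchanged.
     */
    static Object parseOrKeep(String value, Locale locale) {
        ParsePosition pos = new ParsePosition(0);
        Number parsed = NumberFormat.getInstance(locale).parse(value, pos);
        if (parsed == null || pos.getIndex() != value.length()) {
            return value; // unparsable, or trailing characters remain: leave as is
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Whether this prints a Number or the original String depends on the JVM's locale data.
        Object result = parseOrKeep("1\u00A0234", Locale.FRANCE);
        System.out.println(result.getClass().getSimpleName());
    }
}
```

A String that survives this step is what would then fail to index into a numeric field.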
> Affected users may want to consider updating their configs to include a
> {{RegexReplaceProcessorFactory}} configured to strip out all whitespace
> characters, prior to any ParseNumeric update processors configured to expect
> French language numbers.
>
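
One detail worth checking when writing such a pattern, assuming it is evaluated by java.util.regex: the plain \s class does not match U+00A0 or U+202F by default, while \p{Z} (Unicode separators) does. The sketch below shows the strip-then-parse idea in plain Java; the class StripSpacesThenParse and its methods are hypothetical and only illustrate the regex, not a Solr configuration.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical illustration of the suggested workaround; not Solr code.
final class StripSpacesThenParse {

    // \s covers ASCII whitespace; \p{Z} adds Unicode separators such as U+00A0 and U+202F.
    private static final Pattern SPACES = Pattern.compile("[\\s\\p{Z}]+");

    /** Removes every whitespace or separator character from the value. */
    static String stripSpaces(String value) {
        return SPACES.matcher(value).replaceAll("");
    }

    /** Strips spaces, then parses as a French-formatted number; rejects partial parses. */
    static Number parseFrench(String value) {
        String cleaned = stripSpaces(value);
        ParsePosition pos = new ParsePosition(0);
        Number parsed = NumberFormat.getInstance(Locale.FRANCE).parse(cleaned, pos);
        if (parsed == null || pos.getIndex() != cleaned.length()) {
            throw new IllegalArgumentException("Not a French-formatted number: " + value);
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Both grouping characters yield the same number once stripped.
        System.out.println(parseFrench("1\u00A0234\u00A0567"));
        System.out.println(parseFrench("1\u202F234\u202F567"));
    }
}
```

Because the grouping characters are removed before parsing, the result no longer depends on which one the JVM's locale data expects.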