[NEW BUG] NumberFormat.parse fails in some scenarios

Christoph Dreis Wed, 26 Aug 2020 07:21:56 -0700

Hi,

A colleague of mine ([email protected]) told me today that his code for
converting a currency String into cents had stopped working.
Apparently, the code worked with Java 8 but not with 11+.


import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class Main {

        public static void main(String[] args) {
                // Uncomment to switch back to the JRE locale data:
                // System.setProperty("java.locale.providers", "JRE");
                System.out.println(getPriceInCents(Locale.GERMANY, "9,99 €"));
        }

        static int getPriceInCents(Locale locale, String price) {
                try {
                        DecimalFormat format = (DecimalFormat) NumberFormat.getCurrencyInstance(locale);
                        Number number = format.parse(price);
                        return (int) (number.doubleValue() * 100);
                } catch (ParseException e) {
                        // This should be thrown on JDK 9+
                        System.out.println(e);
                }
                return 0;
        }

}

After some digging I think this is caused by the changes done for 
JDK-8008577[1].
When I change the java.locale.providers property to "JRE" for example, it works 
again.

My investigations so far revealed that apparently the CLDR number pattern for 
the currency slightly differs.

I created breakpoints in 
sun.util.locale.provider.NumberFormatProviderImpl::getInstance() to display 
some things:

        LocaleProviderAdapter adapter = LocaleProviderAdapter.forType(type);
        String[] numberPatterns = adapter.getLocaleResources(override).getNumberPatterns();
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(override);
        int entry = (choice == INTEGERSTYLE) ? NUMBERSTYLE : choice;
        DecimalFormat format = new DecimalFormat(numberPatterns[entry], symbols);

        // CLDR (type) 
        // #,##0.00 ¤ (numberPatterns[entry])
        // [35,44,35,35,48,46,48,48,-62,-96,-62,-92] (numberPatterns[entry] in bytes)

        //
        // JRE type
        // #,##0.00 ¤;-#,##0.00 ¤ (numberPatterns[entry])
        // [35,44,35,35,48,46,48,48,32,-62,-92,59,45,35,44,35,35,48,46,48,48,32,-62,-92]
        // (numberPatterns[entry] in bytes)

The JRE one includes the negative pattern, but the more interesting bit is that
the spacing differs.
For JRE it is a normal space (the 32), but for CLDR it is [-62, -96], i.e. the
UTF-8 bytes 0xC2 0xA0 of a non-breaking space (U+00A0, aka nbsp).
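
For anyone who wants to check this without a debugger, here is a minimal sketch.
The class and method names (SuffixInspector, codePoints, utf8Bytes) are made up
for illustration. What the last line prints depends on the JDK version and on
the active locale provider, so I am not stating an expected value for it.

```java
import java.nio.charset.StandardCharsets;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.Arrays;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the JDK.
class SuffixInspector {

    // Renders every code point as U+XXXX so that invisible differences show up.
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    // Renders the UTF-8 encoding as signed bytes, like the arrays shown above.
    static String utf8Bytes(String s) {
        return Arrays.toString(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(utf8Bytes(" "));      // [32]
        System.out.println(utf8Bytes("\u00A0")); // [-62, -96]

        DecimalFormat format =
                (DecimalFormat) NumberFormat.getCurrencyInstance(Locale.GERMANY);
        // Provider-dependent: shows which kind of space precedes the currency sign.
        System.out.println(codePoints(format.getPositiveSuffix()));
    }
}
```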

Ultimately this leads to a check failing in DecimalFormat when parsing the 
string "9,99 €" that obviously includes a normal space.

            if (gotPositive) {
                // the regionMatches will return false because nbsp != space
                gotPositive = text.regionMatches(position,positiveSuffix,0,
                                                 positiveSuffix.length());
            }

Which itself leads to the following in our case:

        // fail if neither or both
        if (gotPositive == gotNegative) {
            parsePosition.errorIndex = position;
            // We hit this part here which causes the parsing to fail
            return false;
        }
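
The failing comparison can be reproduced in isolation. This is only a sketch of
the suffix check quoted above, not the actual DecimalFormat code;
SuffixMatchDemo and suffixMatches are hypothetical names, and the position 4 is
simply the index right after "9,99" in our input.

```java
// Hypothetical demo class, not part of the JDK.
class SuffixMatchDemo {

    // Same call shape as the check in DecimalFormat: is `suffix` found in
    // `text` starting at `position`?
    static boolean suffixMatches(String text, int position, String suffix) {
        return text.regionMatches(position, suffix, 0, suffix.length());
    }

    public static void main(String[] args) {
        String text = "9,99 \u20AC"; // input with an ordinary space (U+0020)
        int position = 4;            // index right after the digits "9,99"

        // Suffix with a non-breaking space, as seen with CLDR: no match.
        System.out.println(suffixMatches(text, position, "\u00A0\u20AC")); // false
        // Suffix with an ordinary space, as seen with JRE: match.
        System.out.println(suffixMatches(text, position, " \u20AC"));      // true
    }
}
```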


There are workarounds - e.g. by setting java.locale.providers as already 
mentioned or setting format.setPositiveSuffix(" €"); to fix this particular 
case.
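
A slightly more general variant of the second workaround might look like the
sketch below. It is an assumption on my side, not a recommended fix: it maps
non-breaking spaces to ordinary ones in both the format's affixes and the
input, so the two agree regardless of the provider. TolerantPriceParser and
normalizeSpaces are hypothetical names. Note that this would not be safe for
locales that use a non-breaking space as grouping separator.

```java
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical workaround class, not part of the JDK.
class TolerantPriceParser {

    // Maps NBSP (U+00A0) and narrow NBSP (U+202F) to an ordinary space.
    static String normalizeSpaces(String s) {
        return s.replace('\u00A0', ' ').replace('\u202F', ' ');
    }

    static int getPriceInCents(Locale locale, String price) {
        DecimalFormat format =
                (DecimalFormat) NumberFormat.getCurrencyInstance(locale);
        // Make all affixes use ordinary spaces, whatever the provider delivered.
        format.setPositivePrefix(normalizeSpaces(format.getPositivePrefix()));
        format.setPositiveSuffix(normalizeSpaces(format.getPositiveSuffix()));
        format.setNegativePrefix(normalizeSpaces(format.getNegativePrefix()));
        format.setNegativeSuffix(normalizeSpaces(format.getNegativeSuffix()));
        try {
            // Apply the same normalization to the input before parsing.
            Number number = format.parse(normalizeSpaces(price));
            // Round instead of truncating to guard against binary
            // floating-point error in the multiplication.
            return (int) Math.round(number.doubleValue() * 100);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse price: " + price, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(getPriceInCents(Locale.GERMANY, "9,99 \u20AC"));
    }
}
```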

Is this a bug or a feature or are we missing something?

In case this is an actual bug, we would appreciate a "reported-by" mention in
an eventual fix.

Thanks in advance. I do hope you can follow my thoughts in this email.

[1] https://bugs.openjdk.java.net/browse/JDK-8008577

Cheers,
Christoph
