Unexpected behaviour with larger Strings

Adam Retter Mon, 20 Apr 2020 10:13:26 -0700

Hi there,

I am not sure whether the following is expected behaviour or whether it
indicates one or more bugs. Regardless, the behaviour surprised me, as
it seems to vary between JVM versions and vendors.


The Java code is simply:

import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import static java.nio.charset.StandardCharsets.*;

public class GetBytesTest {

    public static void main(final String args[]) throws UnsupportedEncodingException {

        for (final Charset charset : new Charset[]{ISO_8859_1, US_ASCII, UTF_8}) {

            System.out.println("Attempting: " + charset.name());

            final String str = new String(new char[1073741824]);
            str.getBytes(charset);

            System.out.println("OK");
        }
    }
}


I have run this against multiple JDKs and versions by running:

$ javac GetBytesTest.java
$ java -Xmx10g -cp . GetBytesTest
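
Because the results depend on the vendor and the version, it may help to print which runtime is actually in use alongside each run. The following is a minimal sketch and was not part of the original test; `RuntimeInfo` and `describe` are hypothetical names, and only standard system properties are read.

```java
// Hypothetical helper (not part of GetBytesTest): reports which JVM is running.
final class RuntimeInfo {

    // Builds a one-line description from standard system properties
    // and the maximum heap size the JVM will attempt to use.
    static String describe() {
        return System.getProperty("java.vm.vendor") + " / "
                + System.getProperty("java.vm.name") + " / "
                + System.getProperty("java.version") + " / max heap "
                + (Runtime.getRuntime().maxMemory() >> 20) + " MiB";
    }

    public static void main(final String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same `-Xmx10g` flag as the test should show whether the heap setting was honoured by that JVM.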


My results on macOS 10.15.2 are:

1. Azul Zulu 7.34.0.5-CA-macosx build 24.242-b7
   Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
       at java.lang.StringCoding.encode(StringCoding.java:350)
       at java.lang.String.getBytes(String.java:939)
       at GetBytesTest.main(GetBytesTest.java:15)

2. Oracle JDK 1.8.0_221-b11
   Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
       at java.lang.StringCoding.encode(StringCoding.java:350)
       at java.lang.String.getBytes(String.java:941)
       at GetBytesTest.main(GetBytesTest.java:15)

3. AdoptOpenJDK 1.8.0_232-b09 with openj9-0.17.0
   ALL PASSES :-)

4. AdoptOpenJDK 1.8.0_252-b09
   Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
       at java.lang.StringCoding.encode(StringCoding.java:350)
       at java.lang.String.getBytes(String.java:941)
       at GetBytesTest.main(GetBytesTest.java:15)

5. Azul Zulu 8.40.0.25-CA-macosx build 1.8.0_222-b10
   Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
       at java.lang.StringCoding.encode(StringCoding.java:350)
       at java.lang.String.getBytes(String.java:941)
       at GetBytesTest.main(GetBytesTest.java:15)

6. Azul Zulu build 9.0.7.1+1
   Exception in thread "main" java.lang.NegativeArraySizeException
       at java.base/java.lang.StringCoding.encodeUTF8(StringCoding.java:505)
       at java.base/java.lang.StringCoding.encode(StringCoding.java:593)
       at java.base/java.lang.String.getBytes(String.java:975)
       at GetBytesTest.main(GetBytesTest.java:15)

7. Azul Zulu10.3+5 build 10.0.2+13
   ALL PASSES :-)

8. AdoptOpenJDK 11.0.7+10 with openj9-0.20.0
   FAILS at the ISO-8859-1 stage... and also fails for any of the character sets :-(

   Exception in thread "main" java.lang.OutOfMemoryError: UTF16 String size is 1073741824, should be less than 1073741823
       at java.base/java.lang.StringUTF16.newBytesFor(StringUTF16.java:49)
       at java.base/java.lang.String.<init>(String.java:746)
       at java.base/java.lang.String.<init>(String.java:695)
       at GetBytesTest.main(GetBytesTest.java:15)

9. Azul Zulu11.37+17-CA build 11.0.6+10-LTS
   ALL PASSES :-)

10. Azul Zulu13.28+11-CA build 13.0.1+10-MTS
   ALL PASSES :-)

11. AdoptOpenJDK 13.0.2+8 with openj9-0.18.0
    Exception in thread "main" java.lang.NegativeArraySizeException
        at java.base/java.lang.String.<init>(String.java:750)
        at java.base/java.lang.String.<init>(String.java:699)
        at GetBytesTest.main(GetBytesTest.java:14)


Regarding the code and the results - if you are concerned about
overall memory you can comment out all the character sets apart from
UTF-8 and you still get the same errors!
I was surprised by my findings that:

1. On JDK 7 and 8 with HotSpot - encoding a String to UTF-8 where all
chars are '\0' (the default for a new char[]) wants to allocate an array
larger than the VM limit, whereas the same operation with ASCII and
ISO-8859-1 does not. If I am not mistaken, the char '\0' takes up the
same number of bytes (i.e. 1 byte) in ASCII, ISO-8859-1, and UTF-8.

2. JDK 8 with J9 (NOT HotSpot) - seems to work just fine, whereas
HotSpot fails, see (1).

3. JDK 9 with HotSpot reports a different error from the versions
before it (NegativeArraySizeException rather than OutOfMemoryError).

4. JDK 11 with J9 (NOT HotSpot) will not let me allocate any String of
length 1073741824, regardless of the character set. This seems to be a
change in behaviour from JDK 8 with J9. Could this be considered a
regression?

5. This seems to be fixed on JDK 10 through 13 with HotSpot.

6. JDK 13 with J9 (NOT HotSpot), like JDK 11 with J9, will not let me
allocate the String, although the error is different. Likewise, is
this a regression from JDK 8 with J9?

7. Why is the version number for J9 in JDK 13 lower than that in JDK 11?
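
Some of the differences in findings (1) and (3) may come down to integer arithmetic on the array size, since a Java array can hold at most Integer.MAX_VALUE elements and the test length is 2^30. The sketch below is only an illustration of that arithmetic, assuming an encoder sizes its output buffer from the worst case per char; it does not claim to reproduce what any particular JDK release does internally. `SizeArithmetic` and `worstCaseBytes` are hypothetical names; `Charset.newEncoder()` and `CharsetEncoder.maxBytesPerChar()` are the standard API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: illustrates worst-case output sizes for the test length.
final class SizeArithmetic {

    // Worst-case encoded size, computed in 64-bit arithmetic so it cannot overflow.
    static long worstCaseBytes(final int chars, final Charset charset) {
        final float maxBytesPerChar = charset.newEncoder().maxBytesPerChar();
        return (long) Math.ceil(chars * (double) maxBytesPerChar);
    }

    public static void main(final String[] args) {
        final int length = 1073741824; // 2^30, the length used in the test

        final Charset[] charsets = {
            StandardCharsets.ISO_8859_1,
            StandardCharsets.US_ASCII,
            StandardCharsets.UTF_8
        };
        for (final Charset cs : charsets) {
            final long worst = worstCaseBytes(length, cs);
            System.out.println(cs.name() + ": worst case " + worst
                    + " bytes, fits in one array: " + (worst <= Integer.MAX_VALUE));
        }

        // Doubling 2^30 in 32-bit int arithmetic wraps around to a negative value,
        // which is the kind of size that triggers NegativeArraySizeException.
        System.out.println("length << 1 = " + (length << 1));
    }
}
```

The same arithmetic is consistent with the J9 message in result 8: at 2 bytes per char, a UTF-16 backing array for 1073741824 chars would need 2^31 bytes, one more than the largest possible array.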


Now obviously I am happy to see that this passes on JDK 10+ with
HotSpot :-) But shouldn't it also pass for J9? Also, why is there such
a variation in the errors? I would have hoped that such simple Java
code was "portable".
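
One way to avoid depending on how each JVM sizes the array inside getBytes might be to encode in fixed-size chunks with a CharsetEncoder. This is a minimal sketch, not tested against the JDKs listed above, and it obviously cannot help where the String itself cannot be constructed. `ChunkedEncode` and `encodedLength` are hypothetical names; the sketch only counts the encoded bytes, on the assumption that a real caller would write each chunk to a stream or file instead.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper: encodes a CharSequence through a small reusable buffer.
final class ChunkedEncode {

    // Returns the number of bytes `chars` occupies when encoded with `charset`,
    // without ever allocating an output array proportional to the input length.
    static long encodedLength(final CharSequence chars, final Charset charset) {
        // REPLACE mirrors String.getBytes, which substitutes unencodable input.
        final CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        final CharBuffer in = CharBuffer.wrap(chars); // wraps, does not copy
        final ByteBuffer out = ByteBuffer.allocate(8192);
        long total = 0;

        // Encode until the input is exhausted, draining the buffer on each pass.
        CoderResult result;
        do {
            result = encoder.encode(in, out, true);
            total += out.position(); // a real caller would write these bytes out here
            out.clear();
        } while (result.isOverflow());

        // Flush any bytes the encoder is still holding internally.
        do {
            result = encoder.flush(out);
            total += out.position();
            out.clear();
        } while (result.isOverflow());

        return total;
    }

    public static void main(final String[] args) {
        final String str = new String(new char[100000]);
        System.out.println(encodedLength(str, java.nio.charset.StandardCharsets.UTF_8));
    }
}
```

The 8192-byte buffer size is arbitrary; the point is only that the output buffer stays a constant size however long the String is.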


Thanks :-)

-- 
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk
