[ https://issues.apache.org/jira/browse/GEODE-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809206#comment-16809206 ]
Darrel Schneider commented on GEODE-6579:
-----------------------------------------
As of Java 9, using reflection to directly set the char[] is a non-starter. JDK
9 changed the "value" field on String from a char[] to a byte[]. By default, if
every character in a String fits in the Latin-1 character set, then each
character is stored as a single byte.
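The field type change can be checked directly. The following is a minimal
sketch, not Geode code; the class name StringValueFieldType is hypothetical. It
only reads the field's metadata and never calls setAccessible, so it should run
without extra JVM flags.

```java
import java.lang.reflect.Field;

final class StringValueFieldType {
  /** Returns the declared type of String's private "value" field. */
  static Class<?> valueFieldType() throws NoSuchFieldException {
    // Looking up the Field object does not require the field to be accessible.
    Field value = String.class.getDeclaredField("value");
    return value.getType();
  }

  public static void main(String[] args) throws NoSuchFieldException {
    // Expected to print "char[]" on JDK 8 and "byte[]" on JDK 9 and later.
    System.out.println(valueFieldType().getSimpleName());
  }
}
```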
We also discussed using the package-private String(char[], boolean)
constructor, which is called by newStringUnsafe(char[]). In fact, using
"newStringUnsafe" would have been the way to do this optimization on JDK 8.
But in JDK 9 that method ends up copying the char[] into a byte[].
Our old code, which uses the deprecated String(byte[], int) constructor, is
probably the fastest way to create a String instance in JDK 9, since it
produces half as much garbage as code that calls String(char[], boolean).
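The byte-array approach described above can be sketched as follows. This is an
illustration only: the class and method names are hypothetical, and it assumes
the deprecated constructor in question is String(byte[] ascii, int hibyte)
called with a hibyte of 0.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;

final class AsciiStringReader {
  /** Reads len single-byte characters from in and returns them as a String. */
  @SuppressWarnings("deprecation")
  static String readAsciiString(DataInput in, int len) throws IOException {
    byte[] buf = new byte[len]; // temporary buffer; garbage once this returns
    in.readFully(buf);
    // Deprecated String(byte[] ascii, int hibyte): each byte supplies the low
    // 8 bits of a char and hibyte (0 here) supplies the high 8 bits.
    return new String(buf, 0);
  }

  public static void main(String[] args) throws IOException {
    byte[] data = {'G', 'e', 'o', 'd', 'e'};
    DataInput in = new DataInputStream(new ByteArrayInputStream(data));
    System.out.println(readAsciiString(in, data.length)); // prints "Geode"
  }
}
```

For single-byte data, new String(buf, StandardCharsets.ISO_8859_1) should give
the same result without the deprecation warning.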
Given how much the internal implementation of String changed from JDK 8 to
JDK 9, I think the suggested optimization is a bad idea.
We have also considered trying to avoid garbage creation by instead having a
long-lived byte[] that we reuse each time. Given that the temporary byte[] has
a very short life, and will always be less than 65k in size, I'm not convinced
that we should try to avoid this garbage creation.
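For reference, the reusable-buffer idea could look roughly like the sketch
below. It is not a proposal and not Geode code: the class name, the per-thread
ThreadLocal buffer, and the MAX_LEN bound of 65535 bytes are all assumptions
made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class ReusableBufferReader {
  // Assumed upper bound on the encoded length ("less than 65k").
  private static final int MAX_LEN = 65535;

  // One long-lived buffer per thread, so concurrent readers never share it.
  private static final ThreadLocal<byte[]> BUFFER =
      ThreadLocal.withInitial(() -> new byte[MAX_LEN]);

  static String readAsciiString(DataInput in, int len) throws IOException {
    if (len < 0 || len > MAX_LEN) {
      throw new IllegalArgumentException("length out of range: " + len);
    }
    byte[] buf = BUFFER.get();
    in.readFully(buf, 0, len); // fill only the first len bytes
    // The String constructor copies what it needs, so buf can be reused.
    return new String(buf, 0, len, StandardCharsets.ISO_8859_1);
  }

  public static void main(String[] args) throws IOException {
    byte[] data = {'a', 'b', 'c'};
    DataInput in = new DataInputStream(new ByteArrayInputStream(data));
    System.out.println(readAsciiString(in, data.length)); // prints "abc"
  }
}
```

The trade-off is that each thread keeps its buffer alive for as long as the
thread lives, in exchange for avoiding a short-lived allocation per String.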
> Creating a String during deserialization could be optimized
> -----------------------------------------------------------
>
> Key: GEODE-6579
> URL: https://issues.apache.org/jira/browse/GEODE-6579
> Project: Geode
> Issue Type: Improvement
> Components: serialization
> Reporter: Darrel Schneider
> Assignee: Darrel Schneider
> Priority: Major
> Labels: optimization
> Time Spent: 50m
> Remaining Estimate: 0h
>
> When creating a string during deserialization from data that we know is in
> the ASCII character set (each character can be represented by one byte) we
> currently read all the bytes into a temporary byte array and then create a
> String instance by giving it that byte array. The String constructor has to
> create its own char array and then copy all the bytes into it. After that the
> byte array is garbage.
> We could instead directly create a char array, fill it by reading each byte
> from the DataInput into it, and then use reflection to directly set this
> char array as the value field of the String instance we just created (as an
> empty String). This prevents an extra copy of the data and reduces garbage
> creation.