It was suggested to me off-list that the implementation should choose a
reasonable initial capacity to size the StringBuilder, rather than trusting
the value read from the stream (in case of bad or corrupt data). So the
proposed changes are:
diff --git a/src/java.base/share/classes/java/io/ObjectInputStream.java
b/src/java.base/share/classes/java/io/ObjectInputStream.java
--- a/src/java.base/share/classes/java/io/ObjectInputStream.java
+++ b/src/java.base/share/classes/java/io/ObjectInputStream.java
@@ -3144,7 +3144,9 @@
* utflen bytes.
*/
private String readUTFBody(long utflen) throws IOException {
- StringBuilder sbuf = new StringBuilder();
+ // a reasonable initial capacity based on the UTF length,
+ // capped so that a bad or corrupt length cannot force a huge allocation
+ int initialCapacity = (int)Math.min(utflen, 10_000);
+ StringBuilder sbuf = new StringBuilder(initialCapacity);
if (!blkmode) {
end = pos = 0;
}
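
For illustration, the capping idea can be sketched standalone. This is only a
sketch under stated assumptions: the class name CapacitySketch, the helper
initialCapacity and the constant MAX_INITIAL_CAPACITY are made up for this
example and are not part of ObjectInputStream. Falling back to 16 for
non-positive lengths is also an assumption (16 is StringBuilder's default
capacity).

```java
final class CapacitySketch {
    // Hypothetical cap, reusing the 10_000 constant from the patch above.
    static final int MAX_INITIAL_CAPACITY = 10_000;

    // Derives a StringBuilder capacity from an untrusted length field.
    // The length is clamped while still a long, so the narrowing cast
    // can never wrap around to a negative int.
    static int initialCapacity(long utflen) {
        if (utflen <= 0) {
            return 16; // assumption: use StringBuilder's default capacity
        }
        return (int) Math.min(utflen, MAX_INITIAL_CAPACITY);
    }

    public static void main(String[] args) {
        System.out.println(initialCapacity(42));             // small, honest length
        System.out.println(initialCapacity(Long.MAX_VALUE)); // corrupt, huge length
        System.out.println(initialCapacity(-1));             // corrupt, negative length

        // The builder still grows on demand if the real data exceeds the cap.
        StringBuilder sbuf = new StringBuilder(initialCapacity(1L << 40));
        System.out.println(sbuf.capacity());
    }
}
```

With a cap like this, short strings still get an exactly sized buffer, while a
bogus length costs at most one modest allocation up front.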
-Chris.
On 8 Feb 2016, at 11:15, Chris Hegarty wrote:
> Low hanging fruit to avoid unnecessary allocations when deserializing.
> readUTF knows the string size ahead of the read and can avoid
> expandCapacity() by constructing the StringBuilder with the expected size.
>
> It is an implementation detail, but if the size is larger than
> Integer.MAX_VALUE, then OOM can be thrown, as is the case in the
> implementation today.
>
> diff --git a/src/java.base/share/classes/java/io/ObjectInputStream.java
> b/src/java.base/share/classes/java/io/ObjectInputStream.java
> --- a/src/java.base/share/classes/java/io/ObjectInputStream.java
> +++ b/src/java.base/share/classes/java/io/ObjectInputStream.java
> @@ -3144,7 +3144,9 @@
> * utflen bytes.
> */
> private String readUTFBody(long utflen) throws IOException {
> - StringBuilder sbuf = new StringBuilder();
> + if (utflen > Integer.MAX_VALUE)
> + throw new OutOfMemoryError("UTF length, " + utflen + ", too big.");
> + StringBuilder sbuf = new StringBuilder((int)utflen);
> if (!blkmode) {
> end = pos = 0;
> }
>
> -Chris.