Re: More memory-efficient internal representation for Strings: call for more data

Douglas Surber Tue, 02 Dec 2014 15:25:23 -0800

String construction is a big performance issue for JDBC drivers. Most queries return some number of Strings. The overwhelming majority of those Strings will be short-lived. The cost of constructing these Strings from network bytes is a large fraction of total execution time. Any increase in the cost of constructing a String will far outweigh any reduction in memory use, at least for query results.

All of the proposed compression methods require an additional scan of the entire string. That's exactly the wrong direction. Something like the following pseudo-code is common inside a driver.


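  {
    // Decode n characters from the network source into a temporary array.
    char[] c = new char[n];
    for (int i = 0; i < n; i++) c[i] = charSource.next();
    // The String constructor copies c into its own backing array.
    return new String(c);
  }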

The array copy inside the String constructor is a significant fraction of JDBC driver execution time. Adding an additional scan on top of it makes things worse regardless of the transient benefit of more compact storage. In the case of a query result the String will likely never be promoted out of new space; the benefit of compression would be minimal.
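To make the extra pass concrete, here is a minimal sketch of what a Latin-1 style compression check could look like. It assumes a scheme that stores one byte per character when every character fits in 8 bits; the class and method names are hypothetical and this is not the JDK's actual implementation.

```java
// Hypothetical sketch only; names and scheme are assumptions for illustration.
final class Latin1ScanSketch {

    // The additional pass: every char is inspected once before any copy happens.
    static boolean fitsLatin1(char[] chars) {
        for (char ch : chars) {
            if (ch > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Narrow to one byte per char; only valid when fitsLatin1(chars) is true.
    static byte[] narrow(char[] chars) {
        byte[] out = new byte[chars.length];
        for (int i = 0; i < chars.length; i++) {
            out[i] = (byte) chars[i];
        }
        return out;
    }

    public static void main(String[] args) {
        char[] ascii = "SELECT".toCharArray();
        char[] nonLatin1 = "\u65e5\u672c".toCharArray();
        System.out.println(fitsLatin1(ascii));
        System.out.println(fitsLatin1(nonLatin1));
        System.out.println(narrow(ascii).length);
    }
}
```

In this sketch a compressible String is read twice (once in fitsLatin1, once in narrow), which is the additional scan in question, on top of the driver's own decode loop.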

I don't dispute that Strings occupy a significant fraction of the heap or that a lot of those bytes are zero. And I certainly agree that reducing memory footprint is valuable, but any worsening of String construction time will likely be a problem.


Douglas

At 02:13 PM 12/2/2014, core-libs-dev-requ...@openjdk.java.net wrote:

Date: Wed, 03 Dec 2014 00:59:10 +0300
From: Aleksey Shipilev <aleksey.shipi...@oracle.com>
To: Java Core Libs <core-libs-dev@openjdk.java.net>
Cc: charlie hunt <charlie.h...@oracle.com>
Subject: More memory-efficient internal representation for Strings: call for more data
Message-ID: <547e362e.5010...@oracle.com>
Content-Type: text/plain; charset=utf-8

Hi,

As you may already know, we are looking into more memory efficient
representation for Strings:
 https://bugs.openjdk.java.net/browse/JDK-8054307
As part of preliminary performance work for this JEP, we have to collect
the empirical data on usual characteristics of Strings and char[]-s
normal applications have, as well as figure out the early estimates for
the improvements based on that data. What we have so far is written up here:
http://cr.openjdk.java.net/~shade/density/string-density-report.pdf
We would appreciate if people who are interested in this JEP can provide
the additional data on their applications. It is double-interesting to
have the data for the applications that process String data outside
Latin1 plane. Our current data says these cases are rather rare. Please
read the current report draft, and try to process your own heap dumps
using the instructions in the Appendix.

Thanks,
-Aleksey.
