Server Performance wrote:
Hello, this is my first contribution to OpenJDK, so sorry if I missed a step... And sorry for my English :-( This is my proposal for discussion:

THE GOAL: Speed up String concatenation / append operations overall.

BACKGROUND / HISTORY:
• In the beginning (JDK 1.0 days) we had String.concat() and StringBuffer to build Strings. Both approaches initially performed badly.
• Starting with JDK 1.4 (I think), a copy-on-write sharing strategy was introduced in StringBuffer. The performance gain was obvious, but it increased the heap needed and in some cases caused memory leaks when a StringBuffer was reused.
• Starting with JDK 1.5, StringBuilder was introduced as the “unsynchronized version”, but the copy-on-write optimization was also undone, making it an always-copy scenario. Also, the String + operator is translated to StringBuilder.append() calls by javac. This has been discussed, but no better alternative was found (see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6219959 )
• The current implementation generates several System.arraycopy() calls: at least one per append/insert/delete (two if the capacity is expanded) and a final one in the toString() method.

STUDYING THE USES:
• If we look at how StringBuilder is used (inside JDK code, in application servers and in end-user applications), nearly 99% of the time it is only used to create a String in a single-threaded context and (the most important fact) only through the append() and toString() methods.
• Also, in only 5% of instantiations does the coder set the initial capacity. Often it doesn’t matter, but other times it is impossible to guess or calculate. Even worse, sometimes the coder guesses wrong and sets too much or too little initial capacity.
There is no single append() method, but a lot of append overloads.
MY PROPOSAL:
• Create a new class java.lang.StringAppender implements Appendable
1. Mostly the same exposed public constructors and methods as StringBuilder, but the only operations are the “append()” ones (no insert, no delete, no replace)
2. Internally represented as a String array
3. Only calls arraycopy() or creates char arrays once, inside the toString() method (well, this isn’t completely true: it also copies arrays when appending objects/variables other than String instances or char arrays, but the most typical operation is appending strings!)
You are right that the major use case is a lot of appends in a loop, but I don't agree that this append is always an append(String) or an append(Object). append(char) and append(int) are very popular too and don't work well with your implementation. So I don't think it is worth a new class.
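To make the trade-off concrete, here is a minimal sketch of an append-only class backed by a String array. It is not the attached StringAppender; the class name StringAppenderSketch, its fields and the initial capacity of 8 are assumptions made for illustration. It shows why append(String) is cheap (only a reference is stored) while append(char) and append(int) first need a conversion to a temporary String.

```java
import java.util.Arrays;

// Hypothetical sketch of an append-only builder; not the code from the attachment.
final class StringAppenderSketch {
    private String[] parts = new String[8]; // appended pieces, stored by reference
    private int count;                      // number of slots in use
    private int length;                     // total number of chars appended so far

    StringAppenderSketch append(String s) {
        String piece = String.valueOf(s);   // null becomes "null", as in StringBuilder
        if (count == parts.length) {
            parts = Arrays.copyOf(parts, count * 2); // grows the reference array only
        }
        parts[count++] = piece;
        length += piece.length();
        return this;
    }

    // Non-String values are converted to a temporary String first,
    // which is the extra allocation and copy discussed above.
    StringAppenderSketch append(char c) { return append(String.valueOf(c)); }

    StringAppenderSketch append(int i) { return append(String.valueOf(i)); }

    @Override
    public String toString() {
        char[] out = new char[length];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            String p = parts[i];
            p.getChars(0, p.length(), out, pos); // each piece is copied once into the result
            pos += p.length();
        }
        // The public constructor copies the array again; code living inside
        // java.lang could presumably avoid that last copy.
        return new String(out);
    }
}
```

With this layout, a loop doing append(char) creates one small String per call, so the benefit depends heavily on which overloads dominate in real code.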
4. Doesn’t need to establish an initial capacity. No more calculating or guessing it. Always
• Add a new constructor to the java.lang.String class (actually 5 new constructors for performance reasons, see below):
1. public String(String... strs)
2. public String(String str0, String str1)
3. public String(String str0, String str1, String str2)
4. public String(String str0, String str1, String str2, String str3)
(NOTE: these 3 additional constructors are needed to speed up appends of a small number of Strings, in which case the overhead of creating the array and then looping inside is much greater than passing 2, 3 or 4 parameters in the constructor invocation).

Instead of using new constructors, I think implementing a new method named join (see http://bugs.sun.com/view_bug.do?bug_id=5015163 ) is better.
"".join("hello","world"); => "helloworld"
It can be implemented in exactly the same way as your method String.copyValuesInto(), but I think it is more useful.
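The single-allocation idea behind a join-style method can be sketched as a static helper. The class JoinSketch and the method join(String, String...) are hypothetical names used only for illustration; this is neither the code from the attached diff nor the API requested in the bug report. The helper computes the final length first, allocates one char array, and copies each piece into it exactly once.

```java
// Hypothetical helper illustrating a join that allocates its buffer once.
final class JoinSketch {
    static String join(String separator, String... parts) {
        if (parts.length == 0) {
            return "";
        }
        // First pass: compute the exact size of the result.
        int total = separator.length() * (parts.length - 1);
        for (String p : parts) {
            total += p.length();
        }
        // Second pass: copy the separator and every piece into place.
        char[] out = new char[total];
        int pos = 0;
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                separator.getChars(0, separator.length(), out, pos);
                pos += separator.length();
            }
            parts[i].getChars(0, parts[i].length(), out, pos);
            pos += parts[i].length();
        }
        return new String(out);
    }
}
```

For example, JoinSketch.join("", "hello", "world") yields "helloworld", matching the call above, and a non-empty separator covers the more general case.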
• Change the javac behavior: the String + operator must be translated into “new String(String... )” instead of “new StringBuilder().append().append()... .toString()”
• Revise other JDK source code to use StringAppender, and the rest of the programs all around the world. (By the way, in the Glassfish V2 source code I see several String.concat() invocations; seems strange to me... )
• So the new blueprints for String concatenation should be:
1. For append-only, unconditional concatenations, use the new String constructor. Example: String result = new String(part1, part2, part3, part4);
2. For append-only, conditional or looped concatenations, use the StringAppender class.
3. For other manipulations (insert, delete, replace), use StringBuilder
4. For a thread-safe version, use StringBuffer

THE BOOST: As you can see in my microbenchmark results, run on Linux x64 and 32-bit Windows (-server, -client, and -XX:+AggressiveOpts versions), we can achieve a boost of between 1% and 167% (depending on the scenario and architecture). Well, those values are the extremes; the typical gains are between 20% and 70%. I think these results are good enough to be taken into consideration :-)

THE SOURCE CODE: See attachments: String.java.diff with the added code (it is clear), and StringAppender with the new proposed class.
About your code:
Please use String[] var instead of String var[]. I know this is legal, and even
int f() [] { return new int[]{3}; }
is legal, but it's not the Java way.
StringAppender.expandListCapacity should use Arrays.copyOf().
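As a rough illustration of that suggestion (the real body of expandListCapacity is in the attachment and is not reproduced here), the two hypothetical methods below grow a String array the manual way and with Arrays.copyOf(), which is available since Java 6. The class and method names are assumptions.

```java
import java.util.Arrays;

// Hypothetical comparison; not the code from StringAppender.
final class ExpandSketch {
    // Manual version: allocate the bigger array, then copy the old contents.
    // Assumes newCapacity >= list.length.
    static String[] expandManually(String[] list, int newCapacity) {
        String[] bigger = new String[newCapacity];
        System.arraycopy(list, 0, bigger, 0, list.length);
        return bigger;
    }

    // Same result in one call; the extra slots are filled with null.
    static String[] expandWithCopyOf(String[] list, int newCapacity) {
        return Arrays.copyOf(list, newCapacity);
    }
}
```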
StringAppender.size() should not be public; it's error-prone and
not very useful.
THE MICROBENCHMARK CODE: See attachment. Of course it should be reviewed. I think I have written it correctly.

THE MICROBENCHMARK RESULTS (they varied by about +/-1% between runs for me, due to the host load or whatever): See attached file. I think they are great...

What do you think?

Best regards,
--Jesús Viñuales
cheers, Rémi Forax
