I updated the webrev:
http://cr.openjdk.java.net/~igerasim/8058779/02/webrev/
In the check at 2300-2301 and 2351-2352 I replaced MAX_ARRAY_SIZE with
Integer.MAX_VALUE, which seems to be more accurate here.
And I want to add that this proposed implementation is not only faster,
but also more memory efficient.
The following simple stress-test shows that the proposed version is able
to handle twice larger strings, comparing to the current implementation.
----------------------------------------
public class C {
public static void main(String[] args) throws Throwable {
String s = "string";
for (int i = 1; i < Integer.MAX_VALUE; ++i) {
try {
s = s.replace("string", "stringstring");
} catch (OutOfMemoryError o) {
System.out.println(i + ") " + s.length());
break;
}
}
}
}
----------------------------------------
$ time ~/java9/jdk/bin/java -showversion -Xmx1g C
java version "1.9.0-ea"
Java(TM) SE Runtime Environment (build 1.9.0-ea-b63)
Java HotSpot(TM) 64-Bit Server VM (build 1.9.0-ea-b63, mixed mode)
25) 100663296
real 0m4.525s
user 0m4.402s
sys 0m1.189s
$ time ~/java9/jdk-build/bin/java -showversion -Xmx1g C
java version "1.9.0-internal"
Java(TM) SE Runtime Environment (build
1.9.0-internal-igerasim_2015_05_23_19_25-b00)
Java HotSpot(TM) 64-Bit Server VM (build
1.9.0-internal-igerasim_2015_05_23_19_25-b00, mixed mode)
26) 201326592
real 0m2.139s
user 0m1.960s
sys 0m0.461s
Sincerely yours,
Ivan
On 24.05.2015 23:17, Ivan Gerasimov wrote:
Hello everybody!
I know many people here like it when the performance is getting better.
It was suggested to make the literal variant of String.replace() faster.
Currently, this method is implemented as a few calls to regexp API, so
that the whole implementation takes only two lines of code.
I've created two versions of the fix.
In the first one, we scan the string and store indices of the found
substrings in an array.
Then, we allocate the precisely sized char array and fill it it.
The case with the empty target has to be handled separately.
BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779
WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/00/webrev/
The second variant is much less verbose, however it's less efficient too.
Here the StringJoiner is used as an intermediate storage.
WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/01/webrev/
Here are the micro-benchmark results (in a string of ~300 chars do ~15
replacements).
0) Baseline
MyBenchmark.test thrpt 40 257'051.948 ± 4537.484 ops/s
1) Heavy-duty +308%
MyBenchmark.test thrpt 40 1'049'235.602 ± 15501.803 ops/s
2) StringJoiner +190%
MyBenchmark.test thrpt 40 746'000.629 ± 15387.036 ops/s
Personally, I like my first variant better, even though it adds almost
300 lines of code.
But I'd like to hear what people think of it.
Sincerely yours,
Ivan