[
https://issues.apache.org/jira/browse/LANG-1576?focusedWorklogId=463404&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-463404
]
ASF GitHub Bot logged work on LANG-1576:
----------------------------------------
Author: ASF GitHub Bot
Created on: 26/Jul/20 20:36
Start Date: 26/Jul/20 20:36
Worklog Time Spent: 10m
Work Description: aherbert commented on a change in pull request #565:
URL: https://github.com/apache/commons-lang/pull/565#discussion_r460568874
##########
File path: src/test/java/org/apache/commons/lang3/StringUtilsChompTest.java
##########
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.lang3;
+
+import java.util.concurrent.TimeUnit;
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.BenchmarkMode;
+import org.openjdk.jmh.annotations.Mode;
+import org.openjdk.jmh.annotations.OutputTimeUnit;
+import org.openjdk.jmh.annotations.Scope;
+import org.openjdk.jmh.annotations.State;
+import static org.apache.commons.lang3.StringUtils.EMPTY;
+import static org.apache.commons.lang3.StringUtils.isEmpty;
+import static org.apache.commons.lang3.StringUtils.length;
+
+/**
+ * Test to show whether using BitSet for removeAll() methods is faster than
using HashSet.
Review comment:
This is not what the benchmark is doing.
##########
File path: src/test/java/org/apache/commons/lang3/StringUtilsChompTest.java
##########
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.lang3;
+
+import java.util.concurrent.TimeUnit;
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.BenchmarkMode;
+import org.openjdk.jmh.annotations.Mode;
+import org.openjdk.jmh.annotations.OutputTimeUnit;
+import org.openjdk.jmh.annotations.Scope;
+import org.openjdk.jmh.annotations.State;
+import static org.apache.commons.lang3.StringUtils.EMPTY;
+import static org.apache.commons.lang3.StringUtils.isEmpty;
+import static org.apache.commons.lang3.StringUtils.length;
+
+/**
+ * Test to show whether using BitSet for removeAll() methods is faster than
using HashSet.
+ */
+@BenchmarkMode(Mode.AverageTime)
+@OutputTimeUnit(TimeUnit.NANOSECONDS)
+@State(Scope.Thread)
+public class StringUtilsChompTest {
+
+ private static final String string1 = "\r";
+ private static final String string2 = buildString2();
+
+ private static String buildString2() {
+ StringBuilder stringBuilder = new StringBuilder();
+ for (int i = 0; i < 100000; i++) {
+ stringBuilder.append('0');
+ stringBuilder.append('\r');
+ stringBuilder.append('0');
+ stringBuilder.append('\n');
+ stringBuilder.append('0');
+ stringBuilder.append('\r');
+ stringBuilder.append('\n');
+ }
+ return stringBuilder.toString();
+ }
+
+ @Benchmark
+ public void test1Old() {
+ chompOld(string1);
Review comment:
You should return String from all your benchmark methods. Otherwise
there is nothing to prevent the JVM from not running your chomp methods as the
result is discarded. If you return the chomped String then the JMH framework
will consume it thus forcing the JVM to run the method.
##########
File path: src/test/java/org/apache/commons/lang3/StringUtilsChompTest.java
##########
@@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.lang3;
+
+import java.util.concurrent.TimeUnit;
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.BenchmarkMode;
+import org.openjdk.jmh.annotations.Mode;
+import org.openjdk.jmh.annotations.OutputTimeUnit;
+import org.openjdk.jmh.annotations.Scope;
+import org.openjdk.jmh.annotations.State;
+import static org.apache.commons.lang3.StringUtils.EMPTY;
+import static org.apache.commons.lang3.StringUtils.isEmpty;
+import static org.apache.commons.lang3.StringUtils.length;
+
+/**
+ * Test to show whether using BitSet for removeAll() methods is faster than
using HashSet.
+ */
+@BenchmarkMode(Mode.AverageTime)
+@OutputTimeUnit(TimeUnit.NANOSECONDS)
+@State(Scope.Thread)
+public class StringUtilsChompTest {
+
+ private static final String string1 = "\r";
+ private static final String string2 = buildString2();
+
+ private static String buildString2() {
+ StringBuilder stringBuilder = new StringBuilder();
+ for (int i = 0; i < 100000; i++) {
+ stringBuilder.append('0');
+ stringBuilder.append('\r');
+ stringBuilder.append('0');
+ stringBuilder.append('\n');
+ stringBuilder.append('0');
+ stringBuilder.append('\r');
+ stringBuilder.append('\n');
+ }
+ return stringBuilder.toString();
+ }
+
+ @Benchmark
+ public void test1Old() {
+ chompOld(string1);
+ }
+
+ @Benchmark
+ public void test1New() {
+ chompNew(string1);
+ }
+
+ @Benchmark
+ public void test2Old() {
+ chompOld(string2);
Review comment:
These test2 method use a 100,000 * 7 character string but only chomp it
once. So you are not really testing anything but a chomp of a CR LF pair from
the end of a 700,000 character string. So the benchmark time difference (from
test1) is the time is for hitting a different memory cache level, and the
runtime difference of the single length input string code path used in test1.
Note you cannot continuously chomp the string in a loop as the method will
get stuck at the '0' character and not shorten it. This test is measuring
performance of the method, but perhaps not in the way you intended.
Ideally you should have test for strings ending with:
- CR
- LF
- CR + LF
- None of the above
With and without some preceding characters to hit the length 1 code path.
You then test the speed through all the execution paths.
If you want to avoid the CPU pipeline branch prediction from learning the
branch to take for each of your code paths then you should build the different
strings and chomp all of them in a single loop. Branch prediction will fall
down when the size of the data is >2000 so you would need maybe 10000 random
ending strings. These should be consumed by a JMH Blackhole to avoid the JVM
from optimising out the calling of a no-effect method:
```
String[] data = ...;
@Benchmark
public void testNew(Blackhole bh) {
for (final String s : data) {
bh.consume(chompNew(s));
}
}
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 463404)
Time Spent: 1h (was: 50m)
> refine StringUtils.chomp
> ------------------------
>
> Key: LANG-1576
> URL: https://issues.apache.org/jira/browse/LANG-1576
> Project: Commons Lang
> Issue Type: Sub-task
> Reporter: Jin Xu
> Priority: Minor
> Time Spent: 1h
> Remaining Estimate: 0h
>
> [https://github.com/apache/commons-lang/pull/565]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)