richardstartin commented on PR #2118:
URL: https://github.com/apache/helix/pull/2118#issuecomment-1137543675

   > Thanks @richardstartin - code changes look good.
   > 
   > Do we have any performance numbers? Because Helix code has been using 
String.format at many places and so if we can estimate the improvements, then 
we can further optimize.
   > 
   > thanks, komal
   
   Hi @desaikomal, I can't get you numbers from a Pinot cluster because I would 
need to back port these changes to 0.9.x to set it up; I'm hoping we can reap a 
performance improvement once the upgrade is done. I'm working on experience in 
Java performance optimisation and I can promise each one of these changes is a 
"sure thing." Replacing `containsKey`/`get` with `computeIfAbsent` intuitively 
halves the work, hashing without calling `toString` is also intuitively cheaper.
   
   However, `String.format` would indeed be a rich seam for Helix engineers to 
work on, so it's worth sharing some representative numbers:
   
   ```java
   @State(Scope.Benchmark)
   public class BenchmarkStringFormat {
   
     public static final String RESOURCE_DN_KEY = "resourceName";
     static final String INSTANCE_DN_KEY = "instanceName";
     public static final String CLUSTER_DN_KEY = "cluster";
   
     @Param("resource1234")
     String _resourceName;
     @Param("instance1234")
     String _instanceName;
     @Param("cluster1234")
     String _clusterName;
   
     @Benchmark
     public String format() {
       return String.format("%s=%s,%s=%s,%s=%s", CLUSTER_DN_KEY, _clusterName, 
INSTANCE_DN_KEY, _instanceName,
           RESOURCE_DN_KEY, _resourceName);
     }
   
     @Benchmark
     public String concat() {
       return CLUSTER_DN_KEY + "=" + _clusterName + ","
           + INSTANCE_DN_KEY + "=" + _instanceName + ","
           + RESOURCE_DN_KEY + "=" + _resourceName;
     }
   }
   ```
   
   JDK11, my MacBook Pro, no attempt made to control noise:
   ```
   Benchmark                     (_clusterName)  (_instanceName)  
(_resourceName)  Mode  Cnt     Score    Error  Units
   BenchmarkStringFormat.concat     cluster1234     instance1234     
resource1234  avgt    5    35.098 ±  3.046  ns/op
   BenchmarkStringFormat.format     cluster1234     instance1234     
resource1234  avgt    5  1263.995 ± 69.025  ns/op
   ```
   
   String concatenation, when all the arguments are known statically, is 
actually very efficient on recent JVMs - compiling to JDK7-JDK8 bytecode levels 
would generate the equivalent `StringBuilder` calls, from JDK9 onwards an 
`invokedynamic` instruction is generated, which leads to generation of 
specialised concatenation code at runtime. On the other hand, it's very 
difficult for JDK engineers to optimise `String.format` without breaking API 
compatibility. Though it has got a little faster on JDK17, removing 
`String.format` where concatenation or `StringBuilder` can be used would save a 
lot of money at scale.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to