himanshug commented on issue #8031: remove unnecessary synchronization overhead 
from complex Aggregators
URL: 
https://github.com/apache/incubator-druid/issues/8031#issuecomment-509840466
 
 
   I ran this benchmark 
   ```
    import java.util.concurrent.TimeUnit;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OutputTimeUnit;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.State;
    import org.openjdk.jmh.infra.Blackhole;
    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.RunnerException;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    @State(Scope.Benchmark)
    public class SynchronizeOverhead
    {
      // number of lock/release cycles measured per benchmark invocation
      private int n = 1000000;

      @Benchmark
      @BenchmarkMode(Mode.AverageTime)
      @OutputTimeUnit(TimeUnit.MICROSECONDS)
      public void unsynchronizedAdd(Blackhole blackhole)
      {
        for (int i = 0; i < n; i++) {
          blackhole.consume(1);
        }
      }

      @Benchmark
      @BenchmarkMode(Mode.AverageTime)
      @OutputTimeUnit(TimeUnit.MICROSECONDS)
      public void synchronizedAdd(Blackhole blackhole)
      {
        for (int i = 0; i < n; i++) {
          // identical work, but behind an uncontended object monitor
          synchronized (this) {
            blackhole.consume(1);
          }
        }
      }

      public static void main(String[] args) throws RunnerException
      {
        Options opt = new OptionsBuilder()
            .include(SynchronizeOverhead.class.getSimpleName())
            .warmupIterations(5)
            .measurementIterations(10)
            .forks(1)
            .build();

        new Runner(opt).run();
      }
    }
   ```
   
   which gave me...
   ```
   Benchmark                              Mode  Cnt      Score     Error  Units
   SynchronizeOverhead.synchronizedAdd    avgt   10  18509.829 ± 372.391  us/op
   SynchronizeOverhead.unsynchronizedAdd  avgt   10   1828.303 ±  50.494  us/op
   ```
   
   Sidenote/Caution: I ran this on my Mac laptop (which is not ideal for running benchmarks) since I didn't have a Linux box or EC2 instance handy, but with 5 warmup and 10 measurement iterations the numbers should be good enough for a high-level understanding.
   
   That works out to roughly 16.7 ms of overhead for one million lock/release cycles on an object monitor, i.e. on the order of 17 ns per lock/release. I am pretty sure this is negligible compared to the time spent doing sketch operations inside those synchronized blocks, so I don't think this change will have any measurable impact on overall performance.
   (I am not entirely sure, but it is also possible that the server JVM can dynamically optimize the lock away when its JIT profiling discovers that the synchronized block is never accessed concurrently, which would make the overhead disappear altogether.)
   
   So the code change in this proposal would mainly be useful for reducing confusion among extension writers, where the question keeps coming up of whether they are paying unnecessary synchronization overhead inside their aggregator implementations. It might be enough to document these facts in this proposal so that such worries can be alleviated simply by pointing those conversations to this page.
   
   
