AHeise edited a comment on pull request #13027:
URL: https://github.com/apache/flink/pull/13027#issuecomment-668501769


   > Could you summarize what the problem was that caused the performance 
regression?
   
   For some reason, not setting a parent classloader in the wrapper is much 
slower. I cannot see an obvious reason, so I added a couple of measurements.
   
   First, I wanted to know if the performance difference was caused by slower 
class loading. So I measured the total time spent in `ClassLoader#loadClass` 
for the previous version and this version. I found no difference. There is also 
no difference in the number of classes loaded.
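
   A measurement like this can be sketched with a delegating classloader that 
accumulates the time spent in `loadClass` and counts the calls. This is only an 
illustration: `TimingClassLoader` is a hypothetical name and not the 
instrumentation that was actually used for the numbers above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not part of Flink): times every loadClass call. */
class TimingClassLoader extends ClassLoader {
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong calls = new AtomicLong();

    TimingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        long start = System.nanoTime();
        try {
            // Default parent-first delegation; only the timing is added.
            return super.loadClass(name, resolve);
        } finally {
            totalNanos.addAndGet(System.nanoTime() - start);
            calls.incrementAndGet();
        }
    }

    /** Total time spent in loadClass on this loader, in nanoseconds. */
    long totalNanos() {
        return totalNanos.get();
    }

    /** Number of loadClass calls seen by this loader. */
    long calls() {
        return calls.get();
    }
}
```

   Comparing `totalNanos()` and `calls()` between two runs gives both the time 
and the count of load requests per run.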
   
   Second, a profiler revealed that much of the time in the slow version was 
spent in `Class#newInstance`. I added time measurements to 
`org.apache.flink.table.runtime.generated.GeneratedClass`. Interestingly, there 
is again no difference in compiling the class. However, there is a huge 
difference in calling `newInstance` on the generated class, which can easily 
account for the time difference. 
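
   To separate instantiation time from compilation time, the constructor call 
can be timed on its own. The following is a minimal sketch under assumptions: 
`InstantiationTimer` and `Timed` are hypothetical names, and picking the first 
public constructor is only safe when the class has exactly one.

```java
import java.lang.reflect.Constructor;

/** Hypothetical helper (not part of Flink): times a reflective constructor call. */
final class InstantiationTimer {

    /** Pairs the created instance with the time its construction took. */
    static final class Timed<T> {
        final T value;
        final long nanos;

        Timed(T value, long nanos) {
            this.value = value;
            this.nanos = nanos;
        }
    }

    static <T> Timed<T> timedNewInstance(Class<T> clazz, Object... args)
            throws ReflectiveOperationException {
        // Assumption: the class exposes a single public constructor.
        Constructor<?> ctor = clazz.getConstructors()[0];
        long start = System.nanoTime();
        Object instance = ctor.newInstance(args);
        long elapsed = System.nanoTime() - start;
        return new Timed<>(clazz.cast(instance), elapsed);
    }
}
```

   Logging `nanos` per generated class name would show which classes are slow 
to instantiate and whether the cost grows over the run.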
   
   Without the parent, creating an instance of the 4 specific Blink operators 
`BatchNestedLoopJoin`, `LongHashJoinOperator`, `LocalHashAggregateWithKeys`, 
and `HashAggregateWithKeys` slows down over time, taking over 10s to finish 
at the end of the. 
   
   Interestingly, they use only the instantiation method with explicit 
arguments `GeneratedClass#newInstance(ClassLoader classLoader, Object... 
args)`. However, other operators that use that method are "well-behaved".
   
   I have also forced class resolution through reflection on the generated 
classes, but resolution is fast in all cases. It's really just about creating 
the instances.
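
   One possible way to force and time resolution through reflection is shown 
below. It is a sketch of the general technique, not necessarily the check that 
was run here, and `ResolutionProbe` is a hypothetical name.

```java
/** Hypothetical helper (not part of Flink): times linking and member resolution. */
final class ResolutionProbe {

    /** Returns the nanoseconds spent initializing the class and reflecting on it. */
    static long timeResolution(Class<?> clazz) throws ClassNotFoundException {
        long start = System.nanoTime();
        // initialize=true forces linking and static initialization of the class.
        Class<?> resolved = Class.forName(clazz.getName(), true, clazz.getClassLoader());
        // Reflecting on members makes the JVM load the types used in their signatures.
        resolved.getDeclaredConstructors();
        resolved.getDeclaredMethods();
        resolved.getDeclaredFields();
        return System.nanoTime() - start;
    }
}
```

   If this stays fast while the constructor call is slow, the cost is in the 
instantiation itself rather than in resolving the class.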
   
   edit: log of the timed measurements without parent: 
https://gist.github.com/AHeise/50375144fb6d6da7acb324544722e10b ; I can also 
provide the full logs.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

