omaraloraini commented on pull request #2336:
URL: https://github.com/apache/iceberg/pull/2336#issuecomment-804451815


   @rdblue, I really could not experiment with any change, because I don't have 
any numbers to compare it to. I'm still trying to figure out how to test it, I 
don't think I can conclude from profiling alone that it performs better using  
the number of samples. I tried to run some of JMH benchmarks in Iceberg repo, 
sometimes they work, other times I get an error related to shadowJar and jmh 
plugins, some error with `META-INF`. I tried to use JMH in my code base, but 
for some reason I could not get it to work.
   
   Note that non of the existing frameworks in the code base use it, so if you 
ran some test you would get the same numbers, in particular the following files 
use the `UnboxedReader`:
   
   - FlinkParquetReaders.ReadBuilder
   - BaseParquetReaders.ReadBuilder
   - ParquetAvroValueReaders.ReadBuilder
   - PigParquetReader.ReadBuilder
   - SparkParquetReaders.ReadBuilder
   
   My plan was to get the API first, then update them afterwards. I will update 
them soon.
   
   Regarding the flame graphs, this is the old one with two methods underlined.
   
![before](https://user-images.githubusercontent.com/5780138/112067786-b80b8b00-8b79-11eb-9d15-e8cb1f64db83.png)
   
   From the left `UnboxedReader.read` indicate that there are no benefits in 
`UnboxedReader`, as all calls end up going to the interface method, which 
returns an object. In the far right there is `get` which in case of a Java 
primitive, you would get a boxed primitive, the result of `get`  is passed to 
the `UnboxedReader` and it ignores it. The signature of read method `public T 
read(T ignored)`. After getting the result from the reader it's passed to the 
method `set` in `StructReader`.
   
   I will update other readers to use the new implementation and run the test 
you mentioned. If you have any suggestion regarding benchmarking, let me know, 
as it is a new ground for me.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to