nastra edited a comment on pull request #3040:
URL: https://github.com/apache/iceberg/pull/3040#issuecomment-923687747


   > It looks like `readDatesIcebergVectorized5k` takes 30% longer? And it 
looks like the difference between `readFloatsIcebergVectorized5k` is just 
outside where the error ranges overlap.
   
   The variability in the timings comes from the fact that I was doing dev 
work on my local machine while those tests were running. The other thing is that 
with `Mode.SingleShotTime` benchmarks we're effectively measuring the **cold** 
performance (we do 3 warmup iterations and 5 measurement iterations). Below is 
an excerpt from the Javadoc, taken from 
[here](http://javadox.com/org.openjdk.jmh/jmh-core/0.8/org/openjdk/jmh/annotations/Mode.html#SingleShotTime):
   
   > Single shot time: measures the time for a single operation.
   
   > Runs by calling {@link Benchmark} once and measuring its time. This mode 
is useful to estimate the "cold" performance when you don't want to hide the 
warmup invocations, or if you want to see the progress from call to call, or 
you want to record every single sample. This mode is work-based, and will run 
only for a single invocation of {@link Benchmark} method.
   
   > Caveats for this mode include:
   
   > More warmup/measurement iterations are generally required.
   > Timers overhead might be significant if benchmarks are small; switch to 
{@link #SampleTime} mode if that is a problem.
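
To make the "3 warmup iterations and 5 measurement iterations" setup concrete, here is a minimal plain-Java sketch of the single-shot idea: each iteration invokes the workload once and records the wall-clock time of that one call. This is not JMH and not the Iceberg benchmark code; the class name `SingleShotSketch`, its methods, and the toy workload are all hypothetical and only illustrate the measurement pattern.

```java
final class SingleShotSketch {
    // Times exactly one invocation of the workload and returns the elapsed seconds.
    static double timeOnce(Runnable workload) {
        long start = System.nanoTime();
        workload.run();
        return (System.nanoTime() - start) / 1e9;
    }

    // Runs `warmups` iterations whose timings are discarded, then returns
    // one timing sample per measurement iteration.
    static double[] run(Runnable workload, int warmups, int measurements) {
        for (int i = 0; i < warmups; i++) {
            timeOnce(workload); // warmup: executed, but the sample is dropped
        }
        double[] samples = new double[measurements];
        for (int i = 0; i < measurements; i++) {
            samples[i] = timeOnce(workload);
        }
        return samples;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in workload; the real benchmark reads Parquet data.
        Runnable workload = () -> {
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) {
                sum += i % 7;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable, keeps the loop live");
            }
        };
        // Same iteration counts as mentioned above: 3 warmups, 5 measurements.
        for (double seconds : run(workload, 3, 5)) {
            System.out.printf("%.6f s/op%n", seconds);
        }
    }
}
```

With only a handful of single invocations per benchmark, each sample still carries JIT and cache warmup effects, which is why the scores are sensitive to whatever else the machine is doing.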
   
   I also did another run on this branch and below are the new results:
   
   ```
   Benchmark                                                                 Mode  Cnt  Score   Error  Units
   VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5  1.870 ± 0.091   s/op
   VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5  1.511 ± 0.080   s/op
   VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  9.005 ± 0.539   s/op
   VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5  8.424 ± 0.462   s/op
   VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5  2.829 ± 0.152   s/op
   VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5  2.385 ± 0.129   s/op
   VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5  2.429 ± 0.120   s/op
   VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5  2.357 ± 0.125   s/op
   VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5  2.434 ± 0.158   s/op
   VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5  2.587 ± 0.160   s/op
   VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5  2.857 ± 0.138   s/op
   VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5  2.638 ± 0.161   s/op
   VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5  5.662 ± 0.411   s/op
   VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5  4.693 ± 0.183   s/op
   VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5  1.993 ± 0.139   s/op
   VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5  1.942 ± 0.053   s/op
   
   ```
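
The "error ranges overlap" check from the quoted question can be sketched as simple interval arithmetic on `Score ± Error`. The baseline numbers from the earlier run are not part of this message, so the sketch below compares the Iceberg and Spark rows of the table above purely to show the arithmetic; it says nothing about the before/after comparison that was asked about. The class `ErrorRangeCheck` and its method names are hypothetical.

```java
final class ErrorRangeCheck {
    // How much larger `score` is than `baseline`, in percent of `baseline`.
    static double percentSlower(double score, double baseline) {
        return (score - baseline) / baseline * 100.0;
    }

    // True when the closed intervals [scoreA - errA, scoreA + errA] and
    // [scoreB - errB, scoreB + errB] share at least one point.
    static boolean rangesOverlap(double scoreA, double errA, double scoreB, double errB) {
        return scoreA - errA <= scoreB + errB && scoreB - errB <= scoreA + errA;
    }

    public static void main(String[] args) {
        // readDates rows from the table: Iceberg 1.870 ± 0.091, Spark 1.511 ± 0.080
        System.out.printf("dates:  %.1f%% slower, ranges overlap: %b%n",
                percentSlower(1.870, 1.511),
                rangesOverlap(1.870, 0.091, 1.511, 0.080));

        // readFloats rows from the table: Iceberg 2.429 ± 0.120, Spark 2.357 ± 0.125
        System.out.printf("floats: %.1f%% slower, ranges overlap: %b%n",
                percentSlower(2.429, 2.357),
                rangesOverlap(2.429, 0.120, 2.357, 0.125));
    }
}
```

Non-overlapping ranges suggest a difference worth a closer look, while overlapping ranges, especially with only 5 single-shot samples, are weak evidence either way.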
   
[vectorized-read-flat-parquet-data-result-bump-arrow2.txt](https://github.com/apache/iceberg/files/7201221/vectorized-read-flat-parquet-data-result-bump-arrow2.txt)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


