mayursrivastava commented on pull request #2286:
URL: https://github.com/apache/iceberg/pull/2286#issuecomment-828391883
Hi @rymurr, I recorded results from the master branch as well. Here are the results:
```
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k
# Run progress: 0.00% complete, ETA 00:00:00
# Fork: 1 of 1
# Warmup Iteration 1: 1.761 s/op
# Warmup Iteration 2: 1.464 s/op
# Warmup Iteration 3: 1.621 s/op
Iteration 1: 1.545 s/op
Iteration 2: 1.493 s/op
Iteration 3: 1.479 s/op
Iteration 4: 1.460 s/op
Iteration 5: 1.471 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k":
N = 5
mean = 1.490 ±(99.9%) 0.128 s/op
Histogram, s/op:
[1.460, 1.465) = 1
[1.465, 1.470) = 0
[1.470, 1.475) = 1
[1.475, 1.480) = 1
[1.480, 1.485) = 0
[1.485, 1.490) = 0
[1.490, 1.495) = 1
[1.495, 1.500) = 0
[1.500, 1.505) = 0
[1.505, 1.510) = 0
[1.510, 1.515) = 0
[1.515, 1.520) = 0
[1.520, 1.525) = 0
[1.525, 1.530) = 0
[1.530, 1.535) = 0
[1.535, 1.540) = 0
[1.540, 1.545) = 0
[1.545, 1.550) = 1
Percentiles, s/op:
p(0.0000) = 1.460 s/op
p(50.0000) = 1.479 s/op
p(90.0000) = 1.545 s/op
p(95.0000) = 1.545 s/op
p(99.0000) = 1.545 s/op
p(99.9000) = 1.545 s/op
p(99.9900) = 1.545 s/op
p(99.9990) = 1.545 s/op
p(99.9999) = 1.545 s/op
p(100.0000) = 1.545 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k
# Run progress: 6.25% complete, ETA 00:49:08
# Fork: 1 of 1
# Warmup Iteration 1: 1.683 s/op
# Warmup Iteration 2: 1.314 s/op
# Warmup Iteration 3: 1.309 s/op
Iteration 1: 1.329 s/op
Iteration 2: 1.319 s/op
Iteration 3: 1.289 s/op
Iteration 4: 1.264 s/op
Iteration 5: 1.271 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k":
N = 5
mean = 1.294 ±(99.9%) 0.111 s/op
Histogram, s/op:
[1.260, 1.265) = 1
[1.265, 1.270) = 0
[1.270, 1.275) = 1
[1.275, 1.280) = 0
[1.280, 1.285) = 0
[1.285, 1.290) = 1
[1.290, 1.295) = 0
[1.295, 1.300) = 0
[1.300, 1.305) = 0
[1.305, 1.310) = 0
[1.310, 1.315) = 0
[1.315, 1.320) = 1
[1.320, 1.325) = 0
[1.325, 1.330) = 1
Percentiles, s/op:
p(0.0000) = 1.264 s/op
p(50.0000) = 1.289 s/op
p(90.0000) = 1.329 s/op
p(95.0000) = 1.329 s/op
p(99.0000) = 1.329 s/op
p(99.9000) = 1.329 s/op
p(99.9900) = 1.329 s/op
p(99.9990) = 1.329 s/op
p(99.9999) = 1.329 s/op
p(100.0000) = 1.329 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k
# Run progress: 12.50% complete, ETA 00:45:16
# Fork: 1 of 1
# Warmup Iteration 1: 8.414 s/op
# Warmup Iteration 2: 7.756 s/op
# Warmup Iteration 3: 7.869 s/op
Iteration 1: 8.740 s/op
Iteration 2: 8.667 s/op
Iteration 3: 8.597 s/op
Iteration 4: 8.597 s/op
Iteration 5: 8.610 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k":
N = 5
mean = 8.642 ±(99.9%) 0.239 s/op
Histogram, s/op:
[8.500, 8.525) = 0
[8.525, 8.550) = 0
[8.550, 8.575) = 0
[8.575, 8.600) = 2
[8.600, 8.625) = 1
[8.625, 8.650) = 0
[8.650, 8.675) = 1
[8.675, 8.700) = 0
[8.700, 8.725) = 0
[8.725, 8.750) = 1
[8.750, 8.775) = 0
[8.775, 8.800) = 0
Percentiles, s/op:
p(0.0000) = 8.597 s/op
p(50.0000) = 8.610 s/op
p(90.0000) = 8.740 s/op
p(95.0000) = 8.740 s/op
p(99.0000) = 8.740 s/op
p(99.9000) = 8.740 s/op
p(99.9900) = 8.740 s/op
p(99.9990) = 8.740 s/op
p(99.9999) = 8.740 s/op
p(100.0000) = 8.740 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k
# Run progress: 18.75% complete, ETA 00:45:51
# Fork: 1 of 1
# Warmup Iteration 1: 9.147 s/op
# Warmup Iteration 2: 8.526 s/op
# Warmup Iteration 3: 8.513 s/op
Iteration 1: 8.478 s/op
Iteration 2: 8.419 s/op
Iteration 3: 8.455 s/op
Iteration 4: 8.421 s/op
Iteration 5: 8.450 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k":
N = 5
mean = 8.444 ±(99.9%) 0.096 s/op
Histogram, s/op:
[8.410, 8.415) = 0
[8.415, 8.420) = 1
[8.420, 8.425) = 1
[8.425, 8.430) = 0
[8.430, 8.435) = 0
[8.435, 8.440) = 0
[8.440, 8.445) = 0
[8.445, 8.450) = 1
[8.450, 8.455) = 0
[8.455, 8.460) = 1
[8.460, 8.465) = 0
[8.465, 8.470) = 0
[8.470, 8.475) = 0
[8.475, 8.480) = 1
Percentiles, s/op:
p(0.0000) = 8.419 s/op
p(50.0000) = 8.450 s/op
p(90.0000) = 8.478 s/op
p(95.0000) = 8.478 s/op
p(99.0000) = 8.478 s/op
p(99.9000) = 8.478 s/op
p(99.9900) = 8.478 s/op
p(99.9990) = 8.478 s/op
p(99.9999) = 8.478 s/op
p(100.0000) = 8.478 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k
# Run progress: 25.00% complete, ETA 00:45:34
# Fork: 1 of 1
# Warmup Iteration 1: 3.148 s/op
# Warmup Iteration 2: 2.701 s/op
# Warmup Iteration 3: 2.624 s/op
Iteration 1: 2.559 s/op
Iteration 2: 2.559 s/op
Iteration 3: 2.532 s/op
Iteration 4: 2.481 s/op
Iteration 5: 2.658 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k":
N = 5
mean = 2.558 ±(99.9%) 0.248 s/op
Histogram, s/op:
[2.400, 2.425) = 0
[2.425, 2.450) = 0
[2.450, 2.475) = 0
[2.475, 2.500) = 1
[2.500, 2.525) = 0
[2.525, 2.550) = 1
[2.550, 2.575) = 2
[2.575, 2.600) = 0
[2.600, 2.625) = 0
[2.625, 2.650) = 0
[2.650, 2.675) = 1
Percentiles, s/op:
p(0.0000) = 2.481 s/op
p(50.0000) = 2.559 s/op
p(90.0000) = 2.658 s/op
p(95.0000) = 2.658 s/op
p(99.0000) = 2.658 s/op
p(99.9000) = 2.658 s/op
p(99.9900) = 2.658 s/op
p(99.9990) = 2.658 s/op
p(99.9999) = 2.658 s/op
p(100.0000) = 2.658 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k
# Run progress: 31.25% complete, ETA 00:41:41
# Fork: 1 of 1
# Warmup Iteration 1: 3.058 s/op
# Warmup Iteration 2: 2.484 s/op
# Warmup Iteration 3: 2.453 s/op
Iteration 1: 2.407 s/op
Iteration 2: 2.359 s/op
Iteration 3: 2.380 s/op
Iteration 4: 2.342 s/op
Iteration 5: 2.373 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k":
N = 5
mean = 2.372 ±(99.9%) 0.093 s/op
Histogram, s/op:
[2.340, 2.345) = 1
[2.345, 2.350) = 0
[2.350, 2.355) = 0
[2.355, 2.360) = 1
[2.360, 2.365) = 0
[2.365, 2.370) = 0
[2.370, 2.375) = 1
[2.375, 2.380) = 0
[2.380, 2.385) = 1
[2.385, 2.390) = 0
[2.390, 2.395) = 0
[2.395, 2.400) = 0
[2.400, 2.405) = 0
[2.405, 2.410) = 1
Percentiles, s/op:
p(0.0000) = 2.342 s/op
p(50.0000) = 2.373 s/op
p(90.0000) = 2.407 s/op
p(95.0000) = 2.407 s/op
p(99.0000) = 2.407 s/op
p(99.9000) = 2.407 s/op
p(99.9900) = 2.407 s/op
p(99.9990) = 2.407 s/op
p(99.9999) = 2.407 s/op
p(100.0000) = 2.407 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k
# Run progress: 37.50% complete, ETA 00:37:45
# Fork: 1 of 1
# Warmup Iteration 1: 3.058 s/op
# Warmup Iteration 2: 2.513 s/op
# Warmup Iteration 3: 2.499 s/op
Iteration 1: 2.392 s/op
Iteration 2: 2.402 s/op
Iteration 3: 2.346 s/op
Iteration 4: 2.378 s/op
Iteration 5: 2.371 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k":
N = 5
mean = 2.378 ±(99.9%) 0.083 s/op
Histogram, s/op:
[2.340, 2.345) = 0
[2.345, 2.350) = 1
[2.350, 2.355) = 0
[2.355, 2.360) = 0
[2.360, 2.365) = 0
[2.365, 2.370) = 0
[2.370, 2.375) = 1
[2.375, 2.380) = 1
[2.380, 2.385) = 0
[2.385, 2.390) = 0
[2.390, 2.395) = 1
[2.395, 2.400) = 0
[2.400, 2.405) = 1
[2.405, 2.410) = 0
Percentiles, s/op:
p(0.0000) = 2.346 s/op
p(50.0000) = 2.378 s/op
p(90.0000) = 2.402 s/op
p(95.0000) = 2.402 s/op
p(99.0000) = 2.402 s/op
p(99.9000) = 2.402 s/op
p(99.9900) = 2.402 s/op
p(99.9990) = 2.402 s/op
p(99.9999) = 2.402 s/op
p(100.0000) = 2.402 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k
# Run progress: 43.75% complete, ETA 00:33:53
# Fork: 1 of 1
# Warmup Iteration 1: 2.855 s/op
# Warmup Iteration 2: 2.304 s/op
# Warmup Iteration 3: 2.241 s/op
Iteration 1: 2.252 s/op
Iteration 2: 2.181 s/op
Iteration 3: 2.219 s/op
Iteration 4: 2.197 s/op
Iteration 5: 2.190 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k":
N = 5
mean = 2.208 ±(99.9%) 0.110 s/op
Histogram, s/op:
[2.180, 2.185) = 1
[2.185, 2.190) = 1
[2.190, 2.195) = 0
[2.195, 2.200) = 1
[2.200, 2.205) = 0
[2.205, 2.210) = 0
[2.210, 2.215) = 0
[2.215, 2.220) = 1
[2.220, 2.225) = 0
[2.225, 2.230) = 0
[2.230, 2.235) = 0
[2.235, 2.240) = 0
[2.240, 2.245) = 0
[2.245, 2.250) = 0
[2.250, 2.255) = 1
[2.255, 2.260) = 0
Percentiles, s/op:
p(0.0000) = 2.181 s/op
p(50.0000) = 2.197 s/op
p(90.0000) = 2.252 s/op
p(95.0000) = 2.252 s/op
p(99.0000) = 2.252 s/op
p(99.9000) = 2.252 s/op
p(99.9900) = 2.252 s/op
p(99.9990) = 2.252 s/op
p(99.9999) = 2.252 s/op
p(100.0000) = 2.252 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k
# Run progress: 50.00% complete, ETA 00:30:02
# Fork: 1 of 1
# Warmup Iteration 1: 3.129 s/op
# Warmup Iteration 2: 2.522 s/op
# Warmup Iteration 3: 2.564 s/op
Iteration 1: 2.434 s/op
Iteration 2: 2.478 s/op
Iteration 3: 2.394 s/op
Iteration 4: 2.419 s/op
Iteration 5: 2.425 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k":
N = 5
mean = 2.430 ±(99.9%) 0.118 s/op
Histogram, s/op:
[2.390, 2.395) = 1
[2.395, 2.400) = 0
[2.400, 2.405) = 0
[2.405, 2.410) = 0
[2.410, 2.415) = 0
[2.415, 2.420) = 1
[2.420, 2.425) = 0
[2.425, 2.430) = 1
[2.430, 2.435) = 1
[2.435, 2.440) = 0
[2.440, 2.445) = 0
[2.445, 2.450) = 0
[2.450, 2.455) = 0
[2.455, 2.460) = 0
[2.460, 2.465) = 0
[2.465, 2.470) = 0
[2.470, 2.475) = 0
Percentiles, s/op:
p(0.0000) = 2.394 s/op
p(50.0000) = 2.425 s/op
p(90.0000) = 2.478 s/op
p(95.0000) = 2.478 s/op
p(99.0000) = 2.478 s/op
p(99.9000) = 2.478 s/op
p(99.9900) = 2.478 s/op
p(99.9990) = 2.478 s/op
p(99.9999) = 2.478 s/op
p(100.0000) = 2.478 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k
# Run progress: 56.25% complete, ETA 00:26:23
# Fork: 1 of 1
# Warmup Iteration 1: 3.047 s/op
# Warmup Iteration 2: 2.405 s/op
# Warmup Iteration 3: 2.374 s/op
Iteration 1: 2.341 s/op
Iteration 2: 2.271 s/op
Iteration 3: 2.329 s/op
Iteration 4: 2.268 s/op
Iteration 5: 2.300 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k":
N = 5
mean = 2.302 ±(99.9%) 0.126 s/op
Histogram, s/op:
[2.260, 2.265) = 0
[2.265, 2.270) = 1
[2.270, 2.275) = 1
[2.275, 2.280) = 0
[2.280, 2.285) = 0
[2.285, 2.290) = 0
[2.290, 2.295) = 0
[2.295, 2.300) = 0
[2.300, 2.305) = 1
[2.305, 2.310) = 0
[2.310, 2.315) = 0
[2.315, 2.320) = 0
[2.320, 2.325) = 0
[2.325, 2.330) = 1
[2.330, 2.335) = 0
[2.335, 2.340) = 0
[2.340, 2.345) = 1
Percentiles, s/op:
p(0.0000) = 2.268 s/op
p(50.0000) = 2.300 s/op
p(90.0000) = 2.341 s/op
p(95.0000) = 2.341 s/op
p(99.0000) = 2.341 s/op
p(99.9000) = 2.341 s/op
p(99.9900) = 2.341 s/op
p(99.9990) = 2.341 s/op
p(99.9999) = 2.341 s/op
p(100.0000) = 2.341 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k
# Run progress: 62.50% complete, ETA 00:22:36
# Fork: 1 of 1
# Warmup Iteration 1: 3.328 s/op
# Warmup Iteration 2: 2.795 s/op
# Warmup Iteration 3: 2.839 s/op
Iteration 1: 2.648 s/op
Iteration 2: 3.041 s/op
Iteration 3: 2.635 s/op
Iteration 4: 2.645 s/op
Iteration 5: 2.689 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k":
N = 5
mean = 2.732 ±(99.9%) 0.671 s/op
Histogram, s/op:
[2.600, 2.650) = 3
[2.650, 2.700) = 1
[2.700, 2.750) = 0
[2.750, 2.800) = 0
[2.800, 2.850) = 0
[2.850, 2.900) = 0
[2.900, 2.950) = 0
[2.950, 3.000) = 0
[3.000, 3.050) = 1
Percentiles, s/op:
p(0.0000) = 2.635 s/op
p(50.0000) = 2.648 s/op
p(90.0000) = 3.041 s/op
p(95.0000) = 3.041 s/op
p(99.0000) = 3.041 s/op
p(99.9000) = 3.041 s/op
p(99.9900) = 3.041 s/op
p(99.9990) = 3.041 s/op
p(99.9999) = 3.041 s/op
p(100.0000) = 3.041 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k
# Run progress: 68.75% complete, ETA 00:18:52
# Fork: 1 of 1
# Warmup Iteration 1: 3.161 s/op
# Warmup Iteration 2: 2.457 s/op
# Warmup Iteration 3: 2.408 s/op
Iteration 1: 2.344 s/op
Iteration 2: 2.322 s/op
Iteration 3: 2.351 s/op
Iteration 4: 2.307 s/op
Iteration 5: 2.310 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k":
N = 5
mean = 2.327 ±(99.9%) 0.076 s/op
Histogram, s/op:
[2.300, 2.305) = 0
[2.305, 2.310) = 1
[2.310, 2.315) = 1
[2.315, 2.320) = 0
[2.320, 2.325) = 1
[2.325, 2.330) = 0
[2.330, 2.335) = 0
[2.335, 2.340) = 0
[2.340, 2.345) = 1
[2.345, 2.350) = 0
[2.350, 2.355) = 1
Percentiles, s/op:
p(0.0000) = 2.307 s/op
p(50.0000) = 2.322 s/op
p(90.0000) = 2.351 s/op
p(95.0000) = 2.351 s/op
p(99.0000) = 2.351 s/op
p(99.9000) = 2.351 s/op
p(99.9900) = 2.351 s/op
p(99.9990) = 2.351 s/op
p(99.9999) = 2.351 s/op
p(100.0000) = 2.351 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k
# Run progress: 75.00% complete, ETA 00:15:07
# Fork: 1 of 1
# Warmup Iteration 1: 4.964 s/op
# Warmup Iteration 2: 4.274 s/op
# Warmup Iteration 3: 4.244 s/op
Iteration 1: 4.167 s/op
Iteration 2: 4.172 s/op
Iteration 3: 4.127 s/op
Iteration 4: 4.146 s/op
Iteration 5: 4.169 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k":
N = 5
mean = 4.156 ±(99.9%) 0.075 s/op
Histogram, s/op:
[4.120, 4.125) = 0
[4.125, 4.130) = 1
[4.130, 4.135) = 0
[4.135, 4.140) = 0
[4.140, 4.145) = 0
[4.145, 4.150) = 1
[4.150, 4.155) = 0
[4.155, 4.160) = 0
[4.160, 4.165) = 0
[4.165, 4.170) = 2
[4.170, 4.175) = 1
Percentiles, s/op:
p(0.0000) = 4.127 s/op
p(50.0000) = 4.167 s/op
p(90.0000) = 4.172 s/op
p(95.0000) = 4.172 s/op
p(99.0000) = 4.172 s/op
p(99.9000) = 4.172 s/op
p(99.9900) = 4.172 s/op
p(99.9990) = 4.172 s/op
p(99.9999) = 4.172 s/op
p(100.0000) = 4.172 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k
# Run progress: 81.25% complete, ETA 00:11:22
# Fork: 1 of 1
# Warmup Iteration 1: 4.747 s/op
# Warmup Iteration 2: 4.092 s/op
# Warmup Iteration 3: 4.072 s/op
Iteration 1: 4.013 s/op
Iteration 2: 3.928 s/op
Iteration 3: 4.017 s/op
Iteration 4: 3.945 s/op
Iteration 5: 3.990 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k":
N = 5
mean = 3.978 ±(99.9%) 0.155 s/op
Histogram, s/op:
[3.920, 3.930) = 1
[3.930, 3.940) = 0
[3.940, 3.950) = 1
[3.950, 3.960) = 0
[3.960, 3.970) = 0
[3.970, 3.980) = 0
[3.980, 3.990) = 0
[3.990, 4.000) = 1
[4.000, 4.010) = 0
[4.010, 4.020) = 2
Percentiles, s/op:
p(0.0000) = 3.928 s/op
p(50.0000) = 3.990 s/op
p(90.0000) = 4.017 s/op
p(95.0000) = 4.017 s/op
p(99.0000) = 4.017 s/op
p(99.9000) = 4.017 s/op
p(99.9900) = 4.017 s/op
p(99.9990) = 4.017 s/op
p(99.9999) = 4.017 s/op
p(100.0000) = 4.017 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k
# Run progress: 87.50% complete, ETA 00:07:36
# Fork: 1 of 1
# Warmup Iteration 1: 2.066 s/op
# Warmup Iteration 2: 1.622 s/op
# Warmup Iteration 3: 1.602 s/op
Iteration 1: 1.546 s/op
Iteration 2: 1.503 s/op
Iteration 3: 1.542 s/op
Iteration 4: 1.507 s/op
Iteration 5: 1.599 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k":
N = 5
mean = 1.539 ±(99.9%) 0.150 s/op
Histogram, s/op:
[1.500, 1.510) = 2
[1.510, 1.520) = 0
[1.520, 1.530) = 0
[1.530, 1.540) = 0
[1.540, 1.550) = 2
[1.550, 1.560) = 0
[1.560, 1.570) = 0
[1.570, 1.580) = 0
[1.580, 1.590) = 0
[1.590, 1.600) = 1
Percentiles, s/op:
p(0.0000) = 1.503 s/op
p(50.0000) = 1.542 s/op
p(90.0000) = 1.599 s/op
p(95.0000) = 1.599 s/op
p(99.0000) = 1.599 s/op
p(99.9000) = 1.599 s/op
p(99.9900) = 1.599 s/op
p(99.9990) = 1.599 s/op
p(99.9999) = 1.599 s/op
p(100.0000) = 1.599 s/op
# JMH version: 1.21
# VM version: JDK 1.8.0_282, OpenJDK 64-Bit Server VM, 25.282-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: <none>
# Warmup: 3 iterations, single-shot each
# Measurement: 5 iterations, single-shot each
# Timeout: 10 min per iteration
# Threads: 1 thread
# Benchmark mode: Single shot invocation time
# Benchmark: org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k
# Run progress: 93.75% complete, ETA 00:03:47
# Fork: 1 of 1
# Warmup Iteration 1: 2.126 s/op
# Warmup Iteration 2: 1.573 s/op
# Warmup Iteration 3: 1.538 s/op
Iteration 1: 1.509 s/op
Iteration 2: 1.482 s/op
Iteration 3: 1.497 s/op
Iteration 4: 1.487 s/op
Iteration 5: 1.447 s/op
Result "org.apache.iceberg.spark.source.parquet.vectorized.VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k":
N = 5
mean = 1.485 ±(99.9%) 0.090 s/op
Histogram, s/op:
[1.440, 1.445) = 0
[1.445, 1.450) = 1
[1.450, 1.455) = 0
[1.455, 1.460) = 0
[1.460, 1.465) = 0
[1.465, 1.470) = 0
[1.470, 1.475) = 0
[1.475, 1.480) = 0
[1.480, 1.485) = 1
[1.485, 1.490) = 1
[1.490, 1.495) = 0
[1.495, 1.500) = 1
[1.500, 1.505) = 0
[1.505, 1.510) = 1
Percentiles, s/op:
p(0.0000) = 1.447 s/op
p(50.0000) = 1.487 s/op
p(90.0000) = 1.509 s/op
p(95.0000) = 1.509 s/op
p(99.0000) = 1.509 s/op
p(99.9000) = 1.509 s/op
p(99.9900) = 1.509 s/op
p(99.9990) = 1.509 s/op
p(99.9999) = 1.509 s/op
p(100.0000) = 1.509 s/op
# Run complete. Total time: 01:00:29
REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark                                                                 Mode  Cnt  Score   Error  Units
VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k           ss    5  1.490 ± 0.128  s/op
VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k             ss    5  1.294 ± 0.111  s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k        ss    5  8.642 ± 0.239  s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k          ss    5  8.444 ± 0.096  s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k         ss    5  2.558 ± 0.248  s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k           ss    5  2.372 ± 0.093  s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k          ss    5  2.378 ± 0.083  s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k            ss    5  2.208 ± 0.110  s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k        ss    5  2.430 ± 0.118  s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k          ss    5  2.302 ± 0.126  s/op
VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k           ss    5  2.732 ± 0.671  s/op
VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k             ss    5  2.327 ± 0.076  s/op
VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k         ss    5  4.156 ± 0.075  s/op
VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k           ss    5  3.978 ± 0.155  s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k      ss    5  1.539 ± 0.150  s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k        ss    5  1.485 ± 0.090  s/op
Benchmark result is saved to XXX/iceberg_master/spark2/build/reports/jmh/results.txt
```
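As a rough aid for reading the summary table, the sketch below computes the Iceberg-to-Spark ratio of mean times for each data type and checks whether the two `mean ± error` intervals overlap. The scores and errors are copied from the table above; the names `RESULTS` and `compare` are hypothetical and not part of Iceberg or JMH. Interval overlap is only a coarse sanity check, not a significance test.

```python
# Mean and error (s/op) copied from the JMH summary table above.
# Layout (an assumption of this sketch):
#   data type -> ((iceberg_mean, iceberg_err), (spark_mean, spark_err))
RESULTS = {
    "Dates":      ((1.490, 0.128), (1.294, 0.111)),
    "Decimals":   ((8.642, 0.239), (8.444, 0.096)),
    "Doubles":    ((2.558, 0.248), (2.372, 0.093)),
    "Floats":     ((2.378, 0.083), (2.208, 0.110)),
    "Integers":   ((2.430, 0.118), (2.302, 0.126)),
    "Longs":      ((2.732, 0.671), (2.327, 0.076)),
    "Strings":    ((4.156, 0.075), (3.978, 0.155)),
    "Timestamps": ((1.539, 0.150), (1.485, 0.090)),
}


def compare(iceberg, spark):
    """Return (ratio of means, True if the mean +/- error intervals overlap)."""
    (i_mean, i_err), (s_mean, s_err) = iceberg, spark
    ratio = i_mean / s_mean
    # Two closed intervals overlap when each one starts before the other ends.
    overlap = (i_mean - i_err) <= (s_mean + s_err) and (s_mean - s_err) <= (i_mean + i_err)
    return ratio, overlap


for name, (iceberg, spark) in RESULTS.items():
    ratio, overlap = compare(iceberg, spark)
    print(f"{name:<11} Iceberg/Spark = {ratio:.3f}  intervals overlap: {overlap}")
```

On these figures the Iceberg means are roughly 2% to 17% higher than the Spark means, and every pair of intervals overlaps, so with only five single-shot iterations per benchmark the gaps sit within the reported error.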