AdamGS commented on PR #22462:
URL: https://github.com/apache/datafusion/pull/22462#issuecomment-4525825948

   I reworked the benchmarks; they are much nicer now and cover more cases.
They also include some logic that can be reused for other benchmarks around
Parquet statistics in the future.
   ```
   parquet_metadata_statistics/metadata_full_col_8_rg_1
                           time:   [7.8150 µs 7.8257 µs 7.8375 µs]
                           change: [−49.844% −49.709% −49.575%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   parquet_metadata_statistics/metadata_full_col_8_rg_32
                           time:   [19.557 µs 19.635 µs 19.719 µs]
                           change: [−31.058% −30.756% −30.435%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) high mild
     1 (1.00%) high severe
   parquet_metadata_statistics/metadata_full_col_8_rg_128
                           time:   [53.199 µs 53.319 µs 53.456 µs]
                           change: [−20.035% −19.700% −19.330%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 9 outliers among 100 measurements (9.00%)
     1 (1.00%) low mild
     6 (6.00%) high mild
     2 (2.00%) high severe
   parquet_metadata_statistics/metadata_full_col_64_rg_1
                           time:   [66.350 µs 66.445 µs 66.557 µs]
                           change: [−50.876% −50.713% −50.556%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     1 (1.00%) low mild
     4 (4.00%) high mild
     3 (3.00%) high severe
   parquet_metadata_statistics/metadata_full_col_64_rg_32
                           time:   [152.25 µs 152.65 µs 153.15 µs]
                           change: [−34.144% −33.914% −33.690%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     3 (3.00%) high mild
     2 (2.00%) high severe
   Benchmarking parquet_metadata_statistics/metadata_full_col_64_rg_128: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, enable flat sampling, or reduce sample count to 60.
   parquet_metadata_statistics/metadata_full_col_64_rg_128
                           time:   [426.46 µs 431.80 µs 438.79 µs]
                           change: [−21.906% −21.086% −20.271%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     3 (3.00%) high mild
     4 (4.00%) high severe
   parquet_metadata_statistics/metadata_full_col_256_rg_1
                           time:   [364.88 µs 365.58 µs 366.31 µs]
                           change: [−49.293% −48.963% −48.647%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 9 outliers among 100 measurements (9.00%)
     1 (1.00%) low mild
     6 (6.00%) high mild
     2 (2.00%) high severe
   Benchmarking parquet_metadata_statistics/metadata_full_col_256_rg_32: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.9s, enable flat sampling, or reduce sample count to 50.
   parquet_metadata_statistics/metadata_full_col_256_rg_32
                           time:   [793.94 µs 796.79 µs 799.76 µs]
                           change: [−35.183% −34.000% −32.948%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 10 outliers among 100 measurements (10.00%)
     7 (7.00%) high mild
     3 (3.00%) high severe
   parquet_metadata_statistics/metadata_full_col_256_rg_128
                           time:   [2.7749 ms 2.8016 ms 2.8311 ms]
                           change: [−18.346% −17.368% −16.405%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe
   parquet_metadata_statistics/metadata_mixed_col_8_rg_1
                           time:   [3.0218 µs 3.0287 µs 3.0354 µs]
                           change: [−1.5245% −1.0760% −0.6204%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 7 outliers among 100 measurements (7.00%)
     3 (3.00%) low mild
     3 (3.00%) high mild
     1 (1.00%) high severe
   parquet_metadata_statistics/metadata_mixed_col_8_rg_32
                           time:   [37.171 µs 37.259 µs 37.346 µs]
                           change: [−6.8862% −6.4140% −5.9716%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     3 (3.00%) low mild
     1 (1.00%) high mild
     1 (1.00%) high severe
   parquet_metadata_statistics/metadata_mixed_col_8_rg_128
                           time:   [71.247 µs 71.441 µs 71.641 µs]
                           change: [−4.4617% −3.9458% −3.4404%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     4 (4.00%) high mild
     1 (1.00%) high severe
   parquet_metadata_statistics/metadata_mixed_col_64_rg_1
                           time:   [24.902 µs 24.941 µs 24.983 µs]
                           change: [+0.0278% +0.3191% +0.6074%] (p = 0.03 < 0.05)
                           Change within noise threshold.
   Found 12 outliers among 100 measurements (12.00%)
     4 (4.00%) low mild
     4 (4.00%) high mild
     4 (4.00%) high severe
   parquet_metadata_statistics/metadata_mixed_col_64_rg_32
                           time:   [306.45 µs 307.22 µs 308.04 µs]
                           change: [−5.5007% −4.9854% −4.4696%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 12 outliers among 100 measurements (12.00%)
     2 (2.00%) low mild
     3 (3.00%) high mild
     7 (7.00%) high severe
   Benchmarking parquet_metadata_statistics/metadata_mixed_col_64_rg_128: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.5s, enable flat sampling, or reduce sample count to 60.
   parquet_metadata_statistics/metadata_mixed_col_64_rg_128
                           time:   [583.24 µs 584.52 µs 585.83 µs]
                           change: [−4.6866% −4.0490% −3.5136%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     3 (3.00%) high mild
     3 (3.00%) high severe
   parquet_metadata_statistics/metadata_mixed_col_256_rg_1
                           time:   [173.84 µs 174.25 µs 174.70 µs]
                           change: [−1.4822% −0.7702% −0.0522%] (p = 0.04 < 0.05)
                           Change within noise threshold.
   Found 10 outliers among 100 measurements (10.00%)
     2 (2.00%) low mild
     3 (3.00%) high mild
     5 (5.00%) high severe
   parquet_metadata_statistics/metadata_mixed_col_256_rg_32
                           time:   [1.4408 ms 1.4542 ms 1.4687 ms]
                           change: [−8.1275% −6.9286% −5.7943%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 9 outliers among 100 measurements (9.00%)
     8 (8.00%) high mild
     1 (1.00%) high severe
   parquet_metadata_statistics/metadata_mixed_col_256_rg_128
                           time:   [3.1768 ms 3.1939 ms 3.2129 ms]
                           change: [−15.720% −14.510% −13.283%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 7 outliers among 100 measurements (7.00%)
     2 (2.00%) high mild
     5 (5.00%) high severe
   parquet_metadata_statistics/metadata_none_col_8_rg_1
                           time:   [3.0249 µs 3.0304 µs 3.0362 µs]
                           change: [−2.6646% −2.3106% −1.9615%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   parquet_metadata_statistics/metadata_none_col_8_rg_32
                           time:   [4.6349 µs 4.6467 µs 4.6587 µs]
                           change: [−2.9848% −2.5834% −2.1817%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     4 (4.00%) low mild
     2 (2.00%) high mild
   parquet_metadata_statistics/metadata_none_col_8_rg_128
                           time:   [9.7857 µs 9.8335 µs 9.8789 µs]
                           change: [−1.6055% −1.0128% −0.3802%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   parquet_metadata_statistics/metadata_none_col_64_rg_1
                           time:   [24.831 µs 24.863 µs 24.898 µs]
                           change: [−2.2535% −1.9447% −1.6384%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     1 (1.00%) low mild
     5 (5.00%) high mild
     2 (2.00%) high severe
   parquet_metadata_statistics/metadata_none_col_64_rg_32
                           time:   [34.315 µs 34.422 µs 34.527 µs]
                           change: [+0.3927% +0.6664% +0.9535%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 5 outliers among 100 measurements (5.00%)
     1 (1.00%) low mild
     4 (4.00%) high mild
   parquet_metadata_statistics/metadata_none_col_64_rg_128
                           time:   [65.464 µs 65.898 µs 66.312 µs]
                           change: [+0.8826% +1.4562% +2.0139%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 2 outliers among 100 measurements (2.00%)
     2 (2.00%) high mild
   parquet_metadata_statistics/metadata_none_col_256_rg_1
                           time:   [167.74 µs 170.48 µs 173.04 µs]
                           change: [−5.4287% −3.9660% −2.3609%] (p = 0.00 < 0.05)
                           Performance has improved.
   parquet_metadata_statistics/metadata_none_col_256_rg_32
                           time:   [247.99 µs 251.69 µs 255.70 µs]
                           change: [−0.3098% +1.9875% +4.3010%] (p = 0.08 > 0.05)
                           No change in performance detected.
   Found 4 outliers among 100 measurements (4.00%)
     3 (3.00%) high mild
     1 (1.00%) high severe
   Benchmarking parquet_metadata_statistics/metadata_none_col_256_rg_128: Warming up for 3.0000 s
   Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.2s, enable flat sampling, or reduce sample count to 50.
   parquet_metadata_statistics/metadata_none_col_256_rg_128
                           time:   [569.18 µs 573.20 µs 577.52 µs]
                           change: [−1.3984% +1.0214% +3.5434%] (p = 0.45 > 0.05)
                           No change in performance detected.
   Found 12 outliers among 100 measurements (12.00%)
     7 (7.00%) high mild
     5 (5.00%) high severe
   ```
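
   For reference, the benchmark IDs in the output above form a 3 x 3 x 3 grid (27 cases). Below is a minimal sketch that rebuilds that ID list from the three axes. It is illustrative Python, not the Rust/Criterion code in the PR. The constant and function names are made up for this example, and reading `full` / `mixed` / `none` as "how much of the metadata carries statistics" is my assumption from the names, not something the output states.

   ```python
   from itertools import product

   # Axis values are read off the benchmark IDs in the output above.
   # All names below are hypothetical and exist only for this sketch.
   GROUP = "parquet_metadata_statistics"
   STATS_MODES = ("full", "mixed", "none")   # assumed meaning: statistics coverage
   COLUMN_COUNTS = (8, 64, 256)              # the "col_N" part of each ID
   ROW_GROUP_COUNTS = (1, 32, 128)           # the "rg_N" part of each ID


   def benchmark_ids():
       """Return the benchmark IDs in the order they appear in the output."""
       return [
           f"{GROUP}/metadata_{mode}_col_{cols}_rg_{rgs}"
           for mode, cols, rgs in product(
               STATS_MODES, COLUMN_COUNTS, ROW_GROUP_COUNTS
           )
       ]


   if __name__ == "__main__":
       for bench_id in benchmark_ids():
           print(bench_id)
   ```

   Each printed ID can be matched against the corresponding `time:` / `change:` block in the output.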


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]