lidavidm commented on pull request #11189:
URL: https://github.com/apache/arrow/pull/11189#issuecomment-923201576
Though, for what it's worth, arrowbench gives very similar results locally before and after this change, using
`as.data.frame(run_benchmark(dataset_taxi_parquet, n_iter=5, cpu_count=c(4)))`.
Before:
```
Total run time: 9.956319 secs
   iteration      process        real start_mem_bytes end_mem_bytes
 1         1  2.979338342 0.785497665      2493300736    3698921472
 2         2  2.886093707 0.766516447      3699429376    3714109440
 3         3  2.920944454 0.776514053      3714109440    3714109440
 4         4  2.877046089 0.766922712      3714109440    3714109440
 5         5  2.896827784 0.770490646      3714109440    3714109440
 6         1 18.316171634 4.632888794      2493300736    3777204224
 7         2 17.657202597 4.614435434      3731472384    3815968768
 8         3 17.561911295 4.556154490      3744194560    3779891200
 9         4 17.775040099 4.648452759      3744194560    3815968768
10         5 17.586604237 4.563611269      3744194560    3779891200
11         1  0.964031791 0.582021236      2493296640    3450097664
12         2  0.771914299 0.414866209      3342176256    3463372800
13         3  0.770634774 0.410928965      3347357696    3527225344
14         4  0.798272907 0.439264059      3352600576    3523477504
15         5  0.795549228 0.436603546      3357843456    3523641344
16         1  0.488568257 0.068626642      2493296640    3346874368
17         2  0.001445519 0.001448631      3346874368    3346874368
18         3  0.001467639 0.001470804      3346874368    3346874368
19         4  0.001425503 0.001428366      3346874368    3346874368
20         5  0.001460951 0.001464367      3346874368    3346874368
   max_mem_bytes gc_level0 gc_level1 gc_level2          query cpu_count
 1    3720196096         1         0         0       vignette         4
 2    3720196096         0         0         0       vignette         4
 3    3720196096         0         0         0       vignette         4
 4    3720196096         0         0         0       vignette         4
 5    3720196096         0         0         0       vignette         4
 6    3796340736         2         0         1 payment_type_3         4
 7    3815968768         1         0         1 payment_type_3         4
 8    3815968768         1         0         0 payment_type_3         4
 9    3815968768         0         0         1 payment_type_3         4
10    3815968768         1         0         0 payment_type_3         4
11    3481989120         2         1         3 small_no_files         4
12    3513794560         2         0         1 small_no_files         4
13    3527225344         1         0         1 small_no_files         4
14    3527225344         2         0         1 small_no_files         4
15    3527225344         0         1         1 small_no_files         4
16    3346874368         0         0         0     count_rows         4
17    3346874368         0         0         0     count_rows         4
18    3346874368         0         0         0     count_rows         4
19    3346874368         0         0         0     count_rows         4
20    3346874368         0         0         0     count_rows         4
```
After:
```
Total run time: 9.995613 secs
   iteration      process        real start_mem_bytes end_mem_bytes
 1         1  3.097148086 0.810223818      2493313024    3759751168
 2         2  2.987976486 0.778924227      3760332800    3779207168
 3         3  2.969162397 0.776125669      3779207168    3779207168
 4         4  2.969802084 0.777535439      3779207168    3779207168
 5         5  2.967313487 0.775563955      3779207168    3779207168
 6         1 18.042212503 4.552006245      2493313024    3805003776
 7         2 17.391559404 4.544502735      3759271936    3832758272
 8         3 17.281636477 4.486674547      3760984064    3796680704
 9         4 17.363466488 4.545516968      3760984064    3832758272
10         5 17.310964707 4.487994194      3760984064    3796680704
11         1  1.036147931 0.602855206      2493313024    3547631616
12         2  0.797990419 0.419062376      3439710208    3601801216
13         3  0.793205374 0.414812803      3485786112    3683479552
14         4  0.801588050 0.428945065      3508854784    3674488832
15         5  0.834174869 0.443217278      3508854784    3674656768
16         1  0.500276311 0.069972277      2493321216    3346898944
17         2  0.001478722 0.001481056      3346898944    3346898944
18         3  0.001478401 0.001481295      3346898944    3346898944
19         4  0.001472831 0.001475811      3346898944    3346898944
20         5  0.001461316 0.001464128      3346898944    3346898944
   max_mem_bytes gc_level0 gc_level1 gc_level2          query cpu_count
 1    3759751168         1         0         0       vignette         4
 2    3779207168         0         0         0       vignette         4
 3    3779207168         0         0         0       vignette         4
 4    3779207168         0         0         0       vignette         4
 5    3779207168         0         0         0       vignette         4
 6    3824140288         2         0         1 payment_type_3         4
 7    3832758272         1         0         1 payment_type_3         4
 8    3832758272         1         0         0 payment_type_3         4
 9    3832758272         0         0         1 payment_type_3         4
10    3832758272         1         0         0 payment_type_3         4
11    3579523072         2         1         3 small_no_files         4
12    3652222976         2         0         1 small_no_files         4
13    3683479552         1         0         1 small_no_files         4
14    3683479552         2         0         1 small_no_files         4
15    3683479552         0         1         1 small_no_files         4
16    3414007808         0         0         0     count_rows         4
17    3414007808         0         0         0     count_rows         4
18    3414007808         0         0         0     count_rows         4
19    3414007808         0         0         0     count_rows         4
20    3414007808         0         0         0     count_rows         4
```
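To make the "rather similar" comparison easier to eyeball, here is a rough sketch that summarizes the `real` column per query from the two tables above. The variable names (`before`, `after`, `ratio`) are illustrative only and are not part of arrowbench. Using the median rather than the mean is my own choice, since the first iteration of each query is noticeably slower than the rest.

```r
# "real" times in seconds, copied from the Before/After tables above
# (five iterations per query). Names are illustrative, not arrowbench API.
before <- list(
  vignette       = c(0.785497665, 0.766516447, 0.776514053, 0.766922712, 0.770490646),
  payment_type_3 = c(4.632888794, 4.614435434, 4.556154490, 4.648452759, 4.563611269),
  small_no_files = c(0.582021236, 0.414866209, 0.410928965, 0.439264059, 0.436603546),
  count_rows     = c(0.068626642, 0.001448631, 0.001470804, 0.001428366, 0.001464367)
)
after <- list(
  vignette       = c(0.810223818, 0.778924227, 0.776125669, 0.777535439, 0.775563955),
  payment_type_3 = c(4.552006245, 4.544502735, 4.486674547, 4.545516968, 4.487994194),
  small_no_files = c(0.602855206, 0.419062376, 0.414812803, 0.428945065, 0.443217278),
  count_rows     = c(0.069972277, 0.001481056, 0.001481295, 0.001475811, 0.001464128)
)

# Median per query, so the slower first iteration does not dominate.
med_before <- sapply(before, median)
med_after  <- sapply(after, median)

# Ratio > 1 means "after" is slower, < 1 means faster.
ratio <- med_after / med_before

print(round(data.frame(before = med_before, after = med_after, ratio = ratio), 4))
```

On these numbers, the median `real` time for each of the four queries changes by less than about 2% in either direction, which is consistent with the runs being effectively equivalent.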