alamb commented on issue #5942:
URL:
https://github.com/apache/arrow-datafusion/issues/5942#issuecomment-1507257020
So I have two interesting pieces of information:
1. I can reproduce the reported performance difference on my 8 core cloud
machine. I also see DataFusion using only a single core.
```
+--------------+--------------+----------+-----------------+-------------------+---------------------+--------------------+--------------+----------+-------------+
| l_returnflag | l_linestatus | sum_qty  | sum_base_price  | sum_disc_price    | sum_charge          | avg_qty            | avg_price    | avg_disc | count_order |
+--------------+--------------+----------+-----------------+-------------------+---------------------+--------------------+--------------+----------+-------------+
| A            | F            | 37734107 | 56586554400.73  | 53758257134.8700  | 55909065222.827692  | 25.522005853257337 | 38273.129734 | 0.049985 | 1478493     |
| N            | F            | 991417   | 1487504710.38   | 1413082168.0541   | 1469649223.194375   | 25.516471920522985 | 38284.467760 | 0.050093 | 38854       |
| N            | O            | 74476040 | 111701729697.74 | 106118230307.6056 | 110367043872.497010 | 25.50222676958499  | 38249.117988 | 0.049996 | 2920374     |
| R            | F            | 37719753 | 56568041380.90  | 53741292684.6040  | 55889619119.831932  | 25.50579361269077  | 38250.854626 | 0.050009 | 1478870     |
+--------------+--------------+----------+-----------------+-------------------+---------------------+--------------------+--------------+----------+-------------+
4 rows in set. Query took 2.841 seconds.
❯
```
2. However, when I run DataFusion against "datafusion created" parquet files
from https://github.com/apache/arrow-datafusion/tree/main/benchmarks it is 3x
faster, though still much slower than hyper (roughly 100ms for hyper vs 928ms
for DataFusion).
```
alamb@aal-dev:~/tpch_data/parquet_data_SF1$ datafusion-cli -f ~/sql_olap_bench/q1.txt
+--------------+--------------+-------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
| l_returnflag | l_linestatus | sum_qty     | sum_base_price  | sum_disc_price    | sum_charge          | avg_qty   | avg_price    | avg_disc | count_order |
+--------------+--------------+-------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
| A            | F            | 37734107.00 | 56586554400.73  | 53758257134.8700  | 55909065222.827692  | 25.522005 | 38273.129734 | 0.049985 | 1478493     |
| N            | F            | 991417.00   | 1487504710.38   | 1413082168.0541   | 1469649223.194375   | 25.516471 | 38284.467760 | 0.050093 | 38854       |
| N            | O            | 74476040.00 | 111701729697.74 | 106118230307.6056 | 110367043872.497010 | 25.502226 | 38249.117988 | 0.049996 | 2920374     |
| R            | F            | 37719753.00 | 56568041380.90  | 53741292684.6040  | 55889619119.831932  | 25.505793 | 38250.854626 | 0.050009 | 1478870     |
+--------------+--------------+-------------+-----------------+-------------------+---------------------+-----------+--------------+----------+-------------+
4 rows in set. Query took 0.928 seconds.
```
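For reference, the ratios behind "3x faster" and "much less fast than hyper" can be checked with a quick back-of-the-envelope calculation. This is only a sketch: the two DataFusion timings are copied from the `datafusion-cli` output above, the hyper figure is the approximate "like 100ms" number rather than something measured here, and the variable names are made up for illustration.

```python
# Timings in seconds. The first two come from the datafusion-cli runs above;
# hyper_s is the approximate ~100ms figure quoted in this comment (assumption,
# not measured here).
original_files_s = 2.841    # DataFusion on the originally reported parquet files
datafusion_files_s = 0.928  # DataFusion on "datafusion created" parquet files
hyper_s = 0.100

# How much faster DataFusion is on its own parquet files.
speedup = original_files_s / datafusion_files_s

# How much slower DataFusion still is than hyper on that run.
gap_to_hyper = datafusion_files_s / hyper_s

print(f"speedup on datafusion-created files: {speedup:.2f}x")
print(f"remaining gap to hyper: {gap_to_hyper:.2f}x")
```

So the file layout accounts for roughly a 3x difference, while a gap of around 9x to hyper remains.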