zhuqi-lucas commented on PR #14766:
URL: https://github.com/apache/datafusion/pull/14766#issuecomment-2667624674
1. Memory usage is now accurate: the CLI no longer collects the entire result into memory.
2. The datafusion-cli result batches are now also registered with the memory pool.
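To illustrate the two points above, here is a minimal, std-only sketch of the idea: stream the batches, keep only the rows that will be displayed, and account for the kept rows against a bounded pool. This is not the PR's code and does not use the DataFusion API; `ToyPool` and `stream_and_show` are hypothetical names invented for this example, and a batch is modeled as a plain `Vec<u64>`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a memory pool with a fixed byte budget.
struct ToyPool {
    limit: usize,
    used: AtomicUsize,
}

impl ToyPool {
    fn new(limit: usize) -> Arc<Self> {
        Arc::new(Self { limit, used: AtomicUsize::new(0) })
    }

    /// Account for `bytes` more, or fail if that would exceed the budget.
    fn try_grow(&self, bytes: usize) -> Result<(), String> {
        let mut current = self.used.load(Ordering::Acquire);
        loop {
            let new = current
                .checked_add(bytes)
                .ok_or_else(|| "reservation overflow".to_string())?;
            if new > self.limit {
                return Err(format!("cannot reserve {bytes} bytes: {current} of {} in use", self.limit));
            }
            match self.used.compare_exchange(current, new, Ordering::AcqRel, Ordering::Acquire) {
                Ok(_) => return Ok(()),
                Err(actual) => current = actual,
            }
        }
    }

    /// Give `bytes` back to the pool.
    fn shrink(&self, bytes: usize) {
        self.used.fetch_sub(bytes, Ordering::AcqRel);
    }

    fn reserved(&self) -> usize {
        self.used.load(Ordering::Acquire)
    }
}

/// Consume batches one at a time. Only the first `max_rows` rows are kept
/// (and reserved in the pool); every batch is dropped after it is counted.
fn stream_and_show(
    pool: &ToyPool,
    batches: impl Iterator<Item = Vec<u64>>,
    max_rows: usize,
) -> Result<(usize, Vec<u64>), String> {
    let mut shown: Vec<u64> = Vec::new();
    let mut total_rows = 0;
    for batch in batches {
        total_rows += batch.len();
        let remaining = max_rows - shown.len();
        if remaining > 0 {
            let take = remaining.min(batch.len());
            pool.try_grow(take * std::mem::size_of::<u64>())?;
            shown.extend_from_slice(&batch[..take]);
        }
        // `batch` goes out of scope here, so it is never accumulated.
    }
    Ok((total_rows, shown))
}

fn main() {
    let pool = ToyPool::new(1024);
    // 10 batches of 8192 rows, mirroring a result of 81920 rows.
    let batches = (0..10u64).map(|i| vec![i; 8192]);
    let (total, shown) = stream_and_show(&pool, batches, 10).expect("within budget");
    println!("{total} row(s) fetched, {} kept, {} bytes reserved", shown.len(), pool.reserved());
    // Release the reservation once the kept rows have been printed.
    pool.shrink(shown.len() * std::mem::size_of::<u64>());
}
```

The point of the sketch is that buffered memory is bounded by the rows kept for display rather than by the size of the full result, and that whatever is buffered is visible to the pool.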
Test result for the 10G memory case, where peak memory is now 5G:
```text
/usr/bin/time -l cargo run --release -- --mem-pool-type fair -m 5G --maxrows 10 -f '/Users/zhuqi/arrow-datafusion/benchmarks/data/external_sort.sql'
Compiling datafusion-cli v45.0.0 (/Users/zhuqi/arrow-datafusion/datafusion-cli)
Finished `release` profile [optimized] target(s) in 6m 06s
Running `/Users/zhuqi/arrow-datafusion/target/release/datafusion-cli --mem-pool-type fair -m 5G --maxrows 10 -f /Users/zhuqi/arrow-datafusion/benchmarks/data/external_sort.sql`
DataFusion CLI v45.0.0
0 row(s) fetched.
Elapsed 0.006 seconds.
memory pool: FairSpillPool { pool_size: 5368709120, state: Mutex { data: FairSpillPoolState { num_spill: 0, spillable: 0, unspillable: 0 } } }
+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+------------+--------------+---------------+
| l_orderkey | l_partkey | l_suppkey | l_linenumber | l_quantity | l_extendedprice | l_discount | l_tax | l_shipdate | l_commitdate | l_receiptdate |
+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+------------+--------------+---------------+
| 1          | 1551894   | 76910     | 1            | 17.00      | 33078.94        | 0.04       | 0.02  | 1996-03-13 | 1996-02-12   | 1996-03-22    |
| 1          | 673091    | 73092     | 2            | 36.00      | 38306.16        | 0.09       | 0.06  | 1996-04-12 | 1996-02-28   | 1996-04-20    |
| 1          | 636998    | 36999     | 3            | 8.00       | 15479.68        | 0.10       | 0.02  | 1996-01-29 | 1996-03-05   | 1996-01-31    |
| 1          | 21315     | 46316     | 4            | 28.00      | 34616.68        | 0.09       | 0.06  | 1996-04-21 | 1996-03-30   | 1996-05-16    |
| 1          | 240267    | 15274     | 5            | 24.00      | 28974.00        | 0.10       | 0.04  | 1996-03-30 | 1996-03-14   | 1996-04-01    |
| 1          | 156345    | 6348      | 6            | 32.00      | 44842.88        | 0.07       | 0.02  | 1996-01-30 | 1996-02-07   | 1996-02-03    |
| 2          | 1061698   | 11719     | 1            | 38.00      | 63066.32        | 0.00       | 0.05  | 1997-01-28 | 1997-01-14   | 1997-02-02    |
| 3          | 42970     | 17971     | 1            | 45.00      | 86083.65        | 0.06       | 0.00  | 1994-02-02 | 1994-01-04   | 1994-02-23    |
| 3          | 190355    | 65359     | 2            | 49.00      | 70822.15        | 0.10       | 0.00  | 1993-11-09 | 1993-12-20   | 1993-11-24    |
| 3          | 1284483   | 34508     | 3            | 27.00      | 39620.34        | 0.06       | 0.07  | 1994-01-16 | 1993-11-22   | 1994-01-23    |
| .                                                                                                                                                 |
| .                                                                                                                                                 |
| .                                                                                                                                                 |
+------------+-----------+-----------+--------------+------------+-----------------+------------+-------+------------+--------------+---------------+
81920 row(s) fetched. (First 10 displayed. Use --maxrows to adjust)
Elapsed 2.165 seconds.
373.37 real 9.21 user 5.91 sys
5073829888 maximum resident set size
0 average shared memory size
0 average unshared data size
0 average unshared stack size
1293674 page reclaims
0 page faults
0 swaps
0 block input operations
0 block output operations
0 messages sent
0 messages received
0 signals received
1906 voluntary context switches
85462 involuntary context switches
200845261488 instructions retired
55793294693 cycles elapsed
5072421856 peak memory footprint
```
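As a quick sanity check on the figures in the log, the byte counts can be converted to GiB. The sketch below is only arithmetic on numbers printed above (`pool_size`, maximum resident set size, peak memory footprint); the helper name `bytes_to_gib` is made up for illustration, and the comparison says nothing about memory that the pool does not track.

```rust
const GIB: f64 = 1024.0 * 1024.0 * 1024.0;

/// Convert a byte count to GiB (hypothetical helper for this example).
fn bytes_to_gib(bytes: u64) -> f64 {
    bytes as f64 / GIB
}

fn main() {
    // Values copied from the log above.
    let pool_size: u64 = 5_368_709_120; // FairSpillPool pool_size (-m 5G)
    let max_rss: u64 = 5_073_829_888; // maximum resident set size
    let peak_footprint: u64 = 5_072_421_856; // peak memory footprint

    println!("pool size:      {:.2} GiB", bytes_to_gib(pool_size));
    println!("max RSS:        {:.2} GiB", bytes_to_gib(max_rss));
    println!("peak footprint: {:.2} GiB", bytes_to_gib(peak_footprint));
    println!("max RSS below pool size: {}", max_rss < pool_size);
}
```

With these inputs the pool size is exactly 5 GiB, and both peak figures come out at roughly 4.7 GiB, below the configured limit.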