jinchengchenghh opened a new issue, #9543:
URL: https://github.com/apache/incubator-gluten/issues/9543
### Description
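For queries that combine `count(distinct a)` with `sum(b)`, such as TPC-DS Q95,
Spark currently plans a group-by on the distinct column followed by a separate
aggregation. Velox can support `aggregate.distinct = true` directly in
HashAggregation, so there may be room to optimize here. The native plan below
shows the current shape: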
```
Calling CudfHashJoinProbe::getOutput
I20250506 14:04:44.365132 5450 WholeStageResultIterator.cc:354] Native Plan
with stats for: [Stage: 35 TID: 90]
-- Aggregation[21][PARTIAL n21_0 := sum_merge("n20_1"), n21_1 :=
sum_merge("n20_2"), n21_2 := count_partial("n18_5")] -> n21_0:DOUBLE,
n21_1:DOUBLE, n21_2:BIGINT
Output: 1 rows (24B, 1 batches), Cpu time: 599.29us, Wall time: 2.19ms,
Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0, Threads: 1, CPU
breakdown: B/I/O/F (81.71us/3.75us/485.23us/28.60us)
runningAddInputWallNanos sum: 4.60us, count: 1, min: 4.60us, max:
4.60us, avg: 4.60us
runningFinishWallNanos sum: 104.96us, count: 1, min: 104.96us,
max: 104.96us, avg: 104.96us
runningGetOutputWallNanos sum: 1.65ms, count: 1, min: 1.65ms, max:
1.65ms, avg: 1.65ms
-- Aggregation[20][SINGLE [n18_5] n20_1 := sum_merge("n19_1"), n20_2 :=
sum_merge("n19_2")] -> n18_5:BIGINT, n20_1:DOUBLE, n20_2:DOUBLE
Output: 8 rows (256B, 1 batches), Cpu time: 749.37us, Wall time:
2.39ms, Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0,
Threads: 1, CPU breakdown: B/I/O/F (81.36us/3.65us/640.41us/23.95us)
queuedWallNanos sum: 2.00us, count: 1, min: 2.00us,
max: 2.00us, avg: 2.00us
runningAddInputWallNanos sum: 4.63us, count: 1, min: 4.63us,
max: 4.63us, avg: 4.63us
runningFinishWallNanos sum: 68.09us, count: 1, min: 68.09us,
max: 68.09us, avg: 68.09us
runningGetOutputWallNanos sum: 2.09ms, count: 1, min: 2.09ms,
max: 2.09ms, avg: 2.09ms
-- Aggregation[19][SINGLE [n18_5] n19_1 := sum_partial("n18_6"), n19_2
:= sum_partial("n18_7")] -> n18_5:BIGINT, n19_1:DOUBLE, n19_2:DOUBLE
Output: 8 rows (256B, 1 batches), Cpu time: 823.88us, Wall time:
2.57ms, Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0,
Threads: 1, CPU breakdown: B/I/O/F (56.28us/17.21us/723.50us/26.89us)
```
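To make the two plan shapes concrete, here is a minimal Python sketch. It is not Velox or Gluten code: the function names, the `ROWS` sample data, and the no-NULL assumption are all illustrative. `two_phase` mirrors the plan above (group by the distinct column with partial sums, then count the groups and merge the sums), while `single_pass` models what one aggregation with a per-aggregate distinct set could compute.

```python
from collections import defaultdict

# Hypothetical sample data: (a, b) pairs, assumed to contain no NULLs.
ROWS = [(1, 10.0), (1, 2.5), (2, 4.0), (3, 1.5), (2, 2.0)]


def two_phase(rows):
    """Current plan shape: group by `a`, then aggregate over the groups."""
    partial_sums = defaultdict(float)
    for a, b in rows:
        # Roughly Aggregation[19]/[20]: grouping key is `a`, sum of `b` per key.
        partial_sums[a] += b
    # Roughly Aggregation[21]: count the group keys and merge the per-key sums.
    return len(partial_sums), sum(partial_sums.values())


def single_pass(rows):
    """One aggregation: a distinct set for `a` plus a running sum of `b`."""
    seen = set()
    total = 0.0
    for a, b in rows:
        seen.add(a)
        total += b
    return len(seen), total


if __name__ == "__main__":
    print(two_phase(ROWS))
    print(single_pass(ROWS))
```

Both functions return the same `(count(distinct a), sum(b))` pair for the sample data. The difference is that the single-pass form avoids materializing one group per distinct key before the final aggregate. Whether that is faster in practice depends on the cardinality of `a` and on how the distinct set is implemented, so it would need measuring.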
### Gluten version
None
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]