Yohahaha opened a new issue, #5630:
URL: https://github.com/apache/incubator-gluten/issues/5630
### Backend
VL (Velox)
### Bug description
TPCDS sf100 q67
Spark view of highest peak memory is 1.1G
```
+----+-------+-----------------+------------+
|Name|Runtime|TaskPeakMem |StagePeakMem|
+----+-------+-----------------+------------+
|q67 |30.93 |1160.0 MiB (1028)|19.0 GiB (7)|
|sum |30.93 | | |
+----+-------+-----------------+------------+
```
Velox view of highest peak memory is 500+M
```
I20240507 10:54:36.669220 1771770 WholeStageResultIterator.cc:437] Native
Plan with stats for: [Stage: 7 TID: 1028]
-- 9 Project[expressions: (n9_11:INTEGER,
hash_with_seed(42,"n8_10","n8_11","n8_12","n8_13","n8_14","n8_15","n8_16","n8_17","n8_18")),
(n9_12:VARCHAR, "n8_10"), (n9_13:VARCHAR, "n8_11"), (n9_14:VARCHAR, "n8_12"),
(n9_15:VARCHAR, "n8_13"), (n9_16:INTEGER, "n8_14"), (n9_17:INTEGER, "n8_15"),
(n9_18:INTEGER, "n8_16"), (n9_19:VARCHAR, "n8_17"), (n9_20:BIGINT, "n8_18"),
(n9_21:DECIMAL(28, 2), "n8_19"), (n9_22:BOOLEAN, "n8_20")] -> n9_11:INTEGER,
n9_12:VARCHAR, n9_13:VARCHAR, n9_14:VARCHAR, n9_15:VARCHAR, n9_16:INTEGER,
n9_17:INTEGER, n9_18:INTEGER, n9_19:VARCHAR, n9_20:BIGINT, n9_21:DECIMAL(28,
2), n9_22:BOOLEAN
Output: 3279771 rows (827.15MB, 802 batches), Cpu time: 489.84ms, Blocked
wall time: 0ns, Peak memory: 24.00KB, Memory allocations: 802, Threads: 1
queuedWallNanos sum: 466.00us, count: 1, min: 466.00us,
max: 466.00us
runningAddInputWallNanos sum: 197.77us, count: 1, min: 197.77us,
max: 197.77us
runningFinishWallNanos sum: 0ns, count: 1, min: 0ns, max: 0ns
runningGetOutputWallNanos sum: 491.61ms, count: 1, min: 491.61ms,
max: 491.61ms
-- 8 Project[expressions: (n8_10:VARCHAR, "n6_13"), (n8_11:VARCHAR,
"n6_14"), (n8_12:VARCHAR, "n6_15"), (n8_13:VARCHAR, "n6_16"), (n8_14:INTEGER,
"n6_17"), (n8_15:INTEGER, "n6_18"), (n8_16:INTEGER, "n6_19"), (n8_17:VARCHAR,
"n6_20"), (n8_18:BIGINT, "n6_21"), (n8_19:DECIMAL(28, 2), "n7_9"["col_0"]),
(n8_20:BOOLEAN, "n7_9"["col_1"])] -> n8_10:VARCHAR, n8_11:VARCHAR,
n8_12:VARCHAR, n8_13:VARCHAR, n8_14:INTEGER, n8_15:INTEGER, n8_16:INTEGER,
n8_17:VARCHAR, n8_18:BIGINT, n8_19:DECIMAL(28, 2), n8_20:BOOLEAN
Output: 3279771 rows (808.46MB, 802 batches), Cpu time: 7.61ms, Blocked
wall time: 0ns, Peak memory: 0B, Memory allocations: 0, Threads: 1
runningAddInputWallNanos sum: 537.63us, count: 1, min: 537.63us,
max: 537.63us
runningFinishWallNanos sum: 142ns, count: 1, min: 142ns, max:
142ns
runningGetOutputWallNanos sum: 6.51ms, count: 1, min: 6.51ms,
max: 6.51ms
-- 7 Aggregation[PARTIAL [n6_13, n6_14, n6_15, n6_16, n6_17, n6_18,
n6_19, n6_20, n6_21] n7_9 := sum_partial("n6_22")] -> n6_13:VARCHAR,
n6_14:VARCHAR, n6_15:VARCHAR, n6_16:VARCHAR, n6_17:INTEGER, n6_18:INTEGER,
n6_19:INTEGER, n6_20:VARCHAR, n6_21:BIGINT, n7_9:ROW<col_0:DECIMAL(28,
2),col_1:BOOLEAN>
Output: 3279771 rows (808.46MB, 802 batches), Cpu time: 7.17s,
Blocked wall time: 0ns, Peak memory: 503.52MB, Memory allocations: 6608,
Threads: 1
distinctKey0 sum: 11, count: 1,
min: 11, max: 11
distinctKey1 sum: 100, count: 1,
min: 100, max: 100
distinctKey2 sum: 687, count: 1,
min: 687, max: 687
distinctKey3 sum: 8065, count: 1,
min: 8065, max: 8065
distinctKey4 sum: 2, count: 1, min:
2, max: 2
distinctKey5 sum: 5, count: 1, min:
5, max: 5
distinctKey6 sum: 6, count: 1, min:
6, max: 6
distinctKey7 sum: 202, count: 1,
min: 202, max: 202
distinctKey8 sum: 10, count: 1,
min: 10, max: 10
flushRowCount sum: 1377709, count:
1, min: 1377709, max: 1377709
flushTimes sum: 1, count: 1, min:
1, max: 1
hashtable.capacity sum: 4194304, count:
1, min: 4194304, max: 4194304
hashtable.numDistinct sum: 1902062, count:
1, min: 1902062, max: 1902062
hashtable.numRehashes sum: 21, count: 1,
min: 21, max: 21
hashtable.numTombstones sum: 0, count: 1, min:
0, max: 0
maxExtendedPartialAggregationMemoryUsage sum: 537.60MB, count:
1, min: 537.60MB, max: 537.60MB
partialAggregationPct sum: 11, count: 1,
min: 11, max: 11
rangeKey4 sum: 510, count: 1,
min: 510, max: 510
rangeKey5 sum: 767, count: 1,
min: 767, max: 767
rangeKey6 sum: 1790, count: 1,
min: 1790, max: 1790
rangeKey8 sum: 3775, count: 1,
min: 3775, max: 3775
runningAddInputWallNanos sum: 6.46s, count: 1,
min: 6.46s, max: 6.46s
runningFinishWallNanos sum: 851ns, count: 1,
min: 851ns, max: 851ns
runningGetOutputWallNanos sum: 737.58ms, count:
1, min: 737.58ms, max: 737.58ms
-- 6 Project[expressions: (n6_11:INTEGER, "n5_0"), (n6_12:DECIMAL(7,
2), "n5_1"), (n6_13:VARCHAR, "n5_2"), (n6_14:VARCHAR, "n5_3"), (n6_15:VARCHAR,
"n5_4"), (n6_16:VARCHAR, "n5_5"), (n6_17:INTEGER, "n5_6"), (n6_18:INTEGER,
"n5_7"), (n6_19:INTEGER, "n5_8"), (n6_20:VARCHAR, "n5_9"), (n6_21:BIGINT,
"n5_10"), (n6_22:DECIMAL(18, 2), coalesce(multiply("n5_1",try_cast "n5_0" as
DECIMAL(10, 0)),0.00))] -> n6_11:INTEGER, n6_12:DECIMAL(7, 2), n6_13:VARCHAR,
n6_14:VARCHAR, n6_15:VARCHAR, n6_16:VARCHAR, n6_17:INTEGER, n6_18:INTEGER,
n6_19:INTEGER, n6_20:VARCHAR, n6_21:BIGINT, n6_22:DECIMAL(18, 2)
Output: 29551140 rows (4.20GB, 8190 batches), Cpu time: 1.45s,
Blocked wall time: 0ns, Peak memory: 112.75KB, Memory allocations: 40941,
Threads: 1
runningAddInputWallNanos sum: 2.38ms, count: 1, min: 2.38ms,
max: 2.38ms
runningFinishWallNanos sum: 3.31us, count: 1, min: 3.31us,
max: 3.31us
runningGetOutputWallNanos sum: 1.61s, count: 1, min: 1.61s,
max: 1.61s
-- 5 Expand[[n4_12, n4_13, n4_14, n4_15, n4_16, n4_17, n4_18, n4_19,
n4_20, n4_21, 0], [n4_12, n4_13, n4_14, n4_15, n4_16, n4_17, n4_18, n4_19,
n4_20, null, 1], [n4_12, n4_13, n4_14, n4_15, n4_16, n4_17, n4_18, n4_19, null,
null, 3], [n4_12, n4_13, n4_14, n4_15, n4_16, n4_17, n4_18, null, null, null,
7], [n4_12, n4_13, n4_14, n4_15, n4_16, n4_17, null, null, null, null, 15],
[n4_12, n4_13, n4_14, n4_15, n4_16, null, null, null, null, null, 31], [n4_12,
n4_13, n4_14, n4_15, null, null, null, null, null, null, 63], [n4_12, n4_13,
n4_14, null, null, null, null, null, null, null, 127], [n4_12, n4_13, null,
null, null, null, null, null, null, null, 255]] -> n5_0:INTEGER,
n5_1:DECIMAL(7, 2), n5_2:VARCHAR, n5_3:VARCHAR, n5_4:VARCHAR, n5_5:VARCHAR,
n5_6:INTEGER, n5_7:INTEGER, n5_8:INTEGER, n5_9:VARCHAR, n5_10:BIGINT
Output: 29551140 rows (3.91GB, 8190 batches), Cpu time: 474.34ms,
Blocked wall time: 0ns, Peak memory: 1.00KB, Memory allocations: 32760,
Threads: 1
runningAddInputWallNanos sum: 556.97us, count: 1, min:
556.97us, max: 556.97us
runningFinishWallNanos sum: 161ns, count: 1, min: 161ns,
max: 161ns
runningGetOutputWallNanos sum: 580.35ms, count: 1, min:
580.35ms, max: 580.35ms
-- 4 Project[expressions: (n4_12:INTEGER, "n3_13"),
(n4_13:DECIMAL(7, 2), "n3_14"), (n4_14:VARCHAR, "n3_22"), (n4_15:VARCHAR,
"n3_21"), (n4_16:VARCHAR, "n3_20"), (n4_17:VARCHAR, "n3_23"), (n4_18:INTEGER,
"n3_15"), (n4_19:INTEGER, "n3_17"), (n4_20:INTEGER, "n3_16"), (n4_21:VARCHAR,
"n3_18")] -> n4_12:INTEGER, n4_13:DECIMAL(7, 2), n4_14:VARCHAR, n4_15:VARCHAR,
n4_16:VARCHAR, n4_17:VARCHAR, n4_18:INTEGER, n4_19:INTEGER, n4_20:INTEGER,
n4_21:VARCHAR
Output: 3283460 rows (724.46MB, 910 batches), Cpu time: 3.66ms,
Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0, Threads: 1
runningAddInputWallNanos sum: 276.29us, count: 1, min:
276.29us, max: 276.29us
runningFinishWallNanos sum: 174ns, count: 1, min:
174ns, max: 174ns
runningGetOutputWallNanos sum: 2.71ms, count: 1, min:
2.71ms, max: 2.71ms
-- 3 Project[expressions: (n3_12:INTEGER, "n0_0"),
(n3_13:INTEGER, "n0_1"), (n3_14:DECIMAL(7, 2), "n0_2"), (n3_15:INTEGER,
"n0_3"), (n3_16:INTEGER, "n0_4"), (n3_17:INTEGER, "n0_5"), (n3_18:VARCHAR,
"n0_6"), (n3_19:INTEGER, "n1_0"), (n3_20:VARCHAR, "n1_1"), (n3_21:VARCHAR,
"n1_2"), (n3_22:VARCHAR, "n1_3"), (n3_23:VARCHAR, "n1_4")] -> n3_12:INTEGER,
n3_13:INTEGER, n3_14:DECIMAL(7, 2), n3_15:INTEGER, n3_16:INTEGER,
n3_17:INTEGER, n3_18:VARCHAR, n3_19:INTEGER, n3_20:VARCHAR, n3_21:VARCHAR,
n3_22:VARCHAR, n3_23:VARCHAR
Output: 3283460 rows (758.23MB, 910 batches), Cpu time:
5.54ms, Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0,
Threads: 1
runningAddInputWallNanos sum: 562.53us, count: 1, min:
562.53us, max: 562.53us
runningFinishWallNanos sum: 165ns, count: 1, min:
165ns, max: 165ns
runningGetOutputWallNanos sum: 4.27ms, count: 1, min:
4.27ms, max: 4.27ms
-- 2 HashJoin[INNER n0_0=n1_0] -> n0_0:INTEGER, n0_1:INTEGER,
n0_2:DECIMAL(7, 2), n0_3:INTEGER, n0_4:INTEGER, n0_5:INTEGER, n0_6:VARCHAR,
n1_0:INTEGER, n1_1:VARCHAR, n1_2:VARCHAR, n1_3:VARCHAR, n1_4:VARCHAR
Output: 3283460 rows (758.23MB, 910 batches), Cpu time:
521.65ms, Blocked wall time: 0ns, Peak memory: 6.72MB, Memory allocations: 2785
HashBuild: Input: 12295 rows (1.47MB, 4 batches), Output: 0
rows (0B, 0 batches), Cpu time: 3.88ms, Blocked wall time: 0ns, Peak memory:
6.06MB, Memory allocations: 11, Threads: 1
distinctKey0 sum: 12296, count: 1, min:
12296, max: 12296
hashtable.buildWallNanos sum: 1.24ms, count: 1, min:
1.24ms, max: 1.24ms
hashtable.capacity sum: 203978, count: 1, min:
203978, max: 203978
hashtable.numDistinct sum: 12295, count: 1, min:
12295, max: 12295
hashtable.numRehashes sum: 1, count: 1, min: 1,
max: 1
queuedWallNanos sum: 0ns, count: 1, min:
0ns, max: 0ns
rangeKey0 sum: 203978, count: 1, min:
203978, max: 203978
runningAddInputWallNanos sum: 3.93ms, count: 1, min:
3.93ms, max: 3.93ms
runningFinishWallNanos sum: 0ns, count: 1, min:
0ns, max: 0ns
runningGetOutputWallNanos sum: 1.36us, count: 1, min:
1.36us, max: 1.36us
HashProbe: Input: 3283460 rows (195.08MB, 910 batches),
Output: 3283460 rows (758.23MB, 910 batches), Cpu time: 517.78ms, Blocked wall
time: 0ns, Peak memory: 675.00KB, Memory allocations: 2774, Threads: 1
queuedWallNanos sum: 1.00us, count: 1, min:
1.00us, max: 1.00us
runningAddInputWallNanos sum: 24.62ms, count: 1,
min: 24.62ms, max: 24.62ms
runningFinishWallNanos sum: 546ns, count: 1, min:
546ns, max: 546ns
runningGetOutputWallNanos sum: 502.54ms, count: 1,
min: 502.54ms, max: 502.54ms
-- 0 ValueStream[] -> n0_0:INTEGER, n0_1:INTEGER,
n0_2:DECIMAL(7, 2), n0_3:INTEGER, n0_4:INTEGER, n0_5:INTEGER, n0_6:VARCHAR
Input: 0 rows (0B, 0 batches), Output: 3283460 rows
(195.08MB, 910 batches), Cpu time: 225.96ms, Blocked wall time: 0ns, Peak
memory: 0B, Memory allocations: 0, Threads: 1
runningAddInputWallNanos sum: 0ns, count: 1, min:
0ns, max: 0ns
runningFinishWallNanos sum: 7.93us, count: 1,
min: 7.93us, max: 7.93us
runningGetOutputWallNanos sum: 229.59ms, count: 1,
min: 229.59ms, max: 229.59ms
-- 1 ValueStream[] -> n1_0:INTEGER, n1_1:VARCHAR,
n1_2:VARCHAR, n1_3:VARCHAR, n1_4:VARCHAR
Input: 0 rows (0B, 0 batches), Output: 12295 rows
(1.47MB, 4 batches), Cpu time: 9.33ms, Blocked wall time: 0ns, Peak memory: 0B,
Memory allocations: 0, Threads: 1
runningAddInputWallNanos sum: 0ns, count: 1, min:
0ns, max: 0ns
runningFinishWallNanos sum: 1.43ms, count: 1,
min: 1.43ms, max: 1.43ms
runningGetOutputWallNanos sum: 13.62ms, count: 1,
min: 13.62ms, max: 13.62ms
```
### Spark version
None
### Spark configurations
_No response_
### System information
_No response_
### Relevant logs
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]