NEUpanning opened a new issue, #10149: URL: https://github.com/apache/incubator-gluten/issues/10149
### Backend VL (Velox) ### Bug description The metrics of native scan op is totally incorrect. <img width="422" height="506" alt="Image" src="https://github.com/user-attachments/assets/c9899d28-f24a-434e-a95e-d3dbbd56d59a" /> There are two project op outside NestedLoopJoin in velox plan ``` -- Project[6][expressions: (n6_18:INTEGER, hash_with_seed(42,"n5_18")), (n6_19:BIGINT, "n5_18"), (n6_20:VARCHAR, "n5_19"), (n6_21:VARCHAR, "n5_20"), (n6_22:VARCHAR, "n5_21"), (n6_23:VARCHAR, "n5_22"), (n6_24:BIGINT, "n5_23"), (n6_25:VARCHAR, "n5_24"), (n6_26:BIGINT, "n5_25"), (n6_27:VARCHAR, "n5_26"), (n6_28:BIGINT, "n5_27"), (n6_29:BIGINT, "n5_28"), (n6_30:VARCHAR, "n5_29"), (n6_31:VARCHAR, "n5_30"), (n6_32:VARCHAR, "n5_31"), (n6_33:INTEGER, "n5_32"), (n6_34:BIGINT, "n5_34"), (n6_35:VARCHAR, "n5_35"), (n6_36:VARCHAR, "n5_33")] -> n6_18:INTEGER, n6_19:BIGINT, n6_20:VARCHAR, n6_21:VARCHAR, n6_22:VARCHAR, n6_23:VARCHAR, n6_24:BIGINT, n6_25:VARCHAR, n6_26:BIGINT, n6_27:VARCHAR, n6_28:BIGINT, n6_29:BIGINT, n6_30:VARCHAR, n6_31:VARCHAR, n6_32:VARCHAR, n6_33:INTEGER, n6_34:BIGINT, n6_35:VARCHAR, n6_36:VARCHAR Output: 28 rows (13.60KB, 20 batches), Cpu time: 1.13ms, Wall time: 1.24ms, Blocked wall time: 0ns, Peak memory: 48.00KB, Memory allocations: 20, Threads: 1, CPU breakdown: B/I/O/F (42.43us/15.49us/1.06ms/13.84us) queuedWallNanos sum: 32.00us, count: 1, min: 32.00us, max: 32.00us runningAddInputWallNanos sum: 22.69us, count: 1, min: 22.69us, max: 22.69us runningFinishWallNanos sum: 23.69us, count: 1, min: 23.69us, max: 23.69us runningGetOutputWallNanos sum: 1.13ms, count: 1, min: 1.13ms, max: 1.13ms -- Project[5][expressions: (n5_18:BIGINT, "n2_16"), (n5_19:VARCHAR, "n2_17"), (n5_20:VARCHAR, "n2_18"), (n5_21:VARCHAR, "n2_19"), (n5_22:VARCHAR, "n2_20"), (n5_23:BIGINT, "n2_21"), (n5_24:VARCHAR, "n2_22"), (n5_25:BIGINT, "n2_23"), (n5_26:VARCHAR, "n2_24"), (n5_27:BIGINT, "n2_25"), (n5_28:BIGINT, "n2_26"), (n5_29:VARCHAR, "n2_27"), (n5_30:VARCHAR, "n2_28"), (n5_31:VARCHAR, "n2_29"), (n5_32:INTEGER, "n2_30"), (n5_33:VARCHAR, "n3_0"), (n5_34:BIGINT, "n3_1"), (n5_35:VARCHAR, "n3_2")] -> n5_18:BIGINT, n5_19:VARCHAR, n5_20:VARCHAR, n5_21:VARCHAR, n5_22:VARCHAR, n5_23:BIGINT, n5_24:VARCHAR, n5_25:BIGINT, n5_26:VARCHAR, n5_27:BIGINT, n5_28:BIGINT, n5_29:VARCHAR, n5_30:VARCHAR, n5_31:VARCHAR, n5_32:INTEGER, n5_33:VARCHAR, n5_34:BIGINT, n5_35:VARCHAR Output: 28 rows (13.49KB, 20 batches), Cpu time: 259.82us, Wall time: 349.98us, Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0, Threads: 1, CPU breakdown: B/I/O/F (63.15us/24.91us/156.76us/15.00us) runningAddInputWallNanos sum: 32.16us, count: 1, min: 32.16us, max: 32.16us runningFinishWallNanos sum: 26.82us, count: 1, min: 26.82us, max: 26.82us runningGetOutputWallNanos sum: 181.06us, count: 1, min: 181.06us, max: 181.06us -- NestedLoopJoin[4][INNER] -> n2_16:BIGINT, n2_17:VARCHAR, n2_18:VARCHAR, n2_19:VARCHAR, n2_20:VARCHAR, n2_21:BIGINT, n2_22:VARCHAR, n2_23:BIGINT, n2_24:VARCHAR, n2_25:BIGINT, n2_26:BIGINT, n2_27:VARCHAR, n2_28:VARCHAR, n2_29:VARCHAR, n2_30:INTEGER, n3_0:VARCHAR, n3_1:BIGINT, n3_2:VARCHAR Output: 28 rows (13.49KB, 20 batches), Cpu time: 659.45us, Wall time: 3.42ms, Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0, CPU breakdown: B/I/O/F (84.39us/50.14us/510.53us/14.39us) NestedLoopJoinBuild: Input: 1 rows (160B, 1 batches), Output: 0 rows (0B, 0 batches), Cpu time: 9.71us, Wall time: 13.85us, Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0, Threads: 1, CPU breakdown: B/I/O/F (2.08us/3.94us/1.93us/1.76us) queuedWallNanos sum: 0ns, count: 1, min: 0ns, max: 0ns runningAddInputWallNanos sum: 4.41us, count: 1, min: 4.41us, max: 4.41us runningFinishWallNanos sum: 3.01us, count: 1, min: 3.01us, max: 3.01us runningGetOutputWallNanos sum: 3.13us, count: 1, min: 3.13us, max: 3.13us NestedLoopJoinProbe: Input: 28 rows (12.70KB, 20 batches), Output: 28 rows (13.49KB, 20 batches), Cpu time: 649.75us, Wall time: 3.41ms, Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0, Threads: 1, CPU breakdown: B/I/O/F (82.32us/46.20us/508.60us/12.63us) queuedWallNanos sum: 0ns, count: 1, min: 0ns, max: 0ns runningAddInputWallNanos sum: 101.48us, count: 1, min: 101.48us, max: 101.48us runningFinishWallNanos sum: 21.18us, count: 1, min: 21.18us, max: 21.18us runningGetOutputWallNanos sum: 3.16ms, count: 1, min: 3.16ms, max: 3.16ms -- Project[2][expressions: (n2_16:BIGINT, "n0_12"), (n2_17:VARCHAR, "n0_8"), (n2_18:VARCHAR, "n0_13"), (n2_19:VARCHAR, "n0_0"), (n2_20:VARCHAR, "n0_4"), (n2_21:BIGINT, "n0_3"), (n2_22:VARCHAR, "n0_15"), (n2_23:BIGINT, "n0_14"), (n2_24:VARCHAR, "n0_7"), (n2_25:BIGINT, "n0_6"), (n2_26:BIGINT, "n0_9"), (n2_27:VARCHAR, "n0_10"), (n2_28:VARCHAR, "n0_11"), (n2_29:VARCHAR, from_unixtime(try_cast divide(try_cast "n0_5" as DOUBLE,1000) as BIGINT,"yyyy-MM-dd")), (n2_30:INTEGER, "n0_1")] -> n2_16:BIGINT, n2_17:VARCHAR, n2_18:VARCHAR, n2_19:VARCHAR, n2_20:VARCHAR, n2_21:BIGINT, n2_22:VARCHAR, n2_23:BIGINT, n2_24:VARCHAR, n2_25:BIGINT, n2_26:BIGINT, n2_27:VARCHAR, n2_28:VARCHAR, n2_29:VARCHAR, n2_30:INTEGER Output: 28 rows (12.70KB, 20 batches), Cpu time: 6.93ms, Wall time: 7.32ms, Blocked wall time: 0ns, Peak memory: 505.50KB, Memory allocations: 102, Threads: 1, CPU breakdown: B/I/O/F (147.82us/151.66us/6.61ms/19.85us) runningAddInputWallNanos sum: 173.38us, count: 1, min: 173.38us, max: 173.38us runningFinishWallNanos sum: 33.82us, count: 1, min: 33.82us, max: 33.82us runningGetOutputWallNanos sum: 6.91ms, count: 1, min: 6.91ms, max: 6.91ms -- Filter[1][expression: and(and(and(isnotnull("n0_11"),isnotnull("n0_2")),equalto("n0_11","重新打开")),equalto("n0_2",1))] -> n0_0:VARCHAR, n0_1:INTEGER, n0_2:INTEGER, n0_3:BIGINT, n0_4:VARCHAR, n0_5:BIGINT, n0_6:BIGINT, n0_7:VARCHAR, n0_8:VARCHAR, n0_9:BIGINT, n0_10:VARCHAR, n0_11:VARCHAR, n0_12:BIGINT, n0_13:VARCHAR, n0_14:BIGINT, n0_15:VARCHAR Output: 0 rows (0B, 0 batches), Cpu time: 0ns, Wall time: 0ns, Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0, CPU breakdown: B/I/O/F (0ns/0ns/0ns/0ns) -- TableScan[0][table: hive_table, data columns: ROW<id:BIGINT,ticket_id:BIGINT,name:VARCHAR,description:VARCHAR,ticket_type:VARCHAR,category_name:VARCHAR,category_id:BIGINT,type_name:VARCHAR,type_id:BIGINT,item_name:VARCHAR,item_id:BIGINT,assigned:VARCHAR,cc_users:VARCHAR,reporter:VARCHAR,sla:VARCHAR,state:VARCHAR,key_word:VARCHAR,resolution:VARCHAR,closed_desc:VARCHAR,closed_reason:VARCHAR,transfer_reason:VARCHAR,rg_id:BIGINT,updated_at:BIGINT,updated_by:VARCHAR,created_at:BIGINT,created_by:VARCHAR,closed_at:BIGINT,workflow_id:BIGINT,resolved_at:BIGINT,resolved_by:VARCHAR,break_response:INTEGER,break_resolve:INTEGER,break_response_at:BIGINT,break_resolve_at:BIGINT,response_at:BIGINT,response_by:VARCHAR,permission:VARCHAR,tag:VARCHAR,source:VARCHAR,source_id:BIGINT,city:VARCHAR,org:VARCHAR,location:VARCHAR,custom_form_id:BIGINT,custom_field_contents:VARCHAR,archive_id:BIGINT,reopen_reason:VARCHAR,assigned_at:BIGINT,reporter_city:VARCHAR,_update_timestamp:VARCHAR>] -> n0 _0:VARCHAR, n0_1:INTEGER, n0_2:INTEGER, n0_3:BIGINT, n0_4:VARCHAR, n0_5:BIGINT, n0_6:BIGINT, n0_7:VARCHAR, n0_8:VARCHAR, n0_9:BIGINT, n0_10:VARCHAR, n0_11:VARCHAR, n0_12:BIGINT, n0_13:VARCHAR, n0_14:BIGINT, n0_15:VARCHAR Input: 150882 rows (62.75MB, 37 batches), Output: 150882 rows (62.75MB, 37 batches), Cpu time: 2.04s, Wall time: 7.89s, Blocked wall time: 0ns, Peak memory: 41.22MB, Memory allocations: 750, Threads: 1, Splits: 1, CPU breakdown: B/I/O/F (57.31us/0ns/2.04s/1.57us) dataSourceAddSplitWallNanos sum: 1.81s, count: 1, min: 1.81s, max: 1.81s dataSourceReadWallNanos sum: 6.08s, count: 1, min: 6.08s, max: 6.08s ioWaitWallNanos sum: 0ns, count: 1, min: 0ns, max: 0ns maxSingleIoWaitWallNanos sum: 0ns, count: 1, min: 0ns, max: 0ns numPrefetch sum: 0, count: 1, min: 0, max: 0 overreadBytes sum: 0B, count: 1, min: 0B, max: 0B prefetchBytes sum: 0B, count: 1, min: 0B, max: 0B runningAddInputWallNanos sum: 0ns, count: 1, min: 0ns, max: 0ns runningFinishWallNanos sum: 2.35us, count: 1, min: 2.35us, max: 2.35us runningGetOutputWallNanos sum: 7.89s, count: 1, min: 7.89s, max: 7.89s totalRemainingFilterTime sum: 0ns, count: 1, min: 0ns, max: 0ns totalScanTime sum: 0ns, count: 1, min: 0ns, max: 0ns -- ValueStream[3][] -> n3_0:VARCHAR, n3_1:BIGINT, n3_2:VARCHAR Input: 0 rows (0B, 0 batches), Output: 1 rows (160B, 1 batches), Cpu time: 639.40us, Wall time: 650.26us, Blocked wall time: 0ns, Peak memory: 0B, Memory allocations: 0, Threads: 1, CPU breakdown: B/I/O/F (3.02us/0ns/610.17us/26.21us) runningAddInputWallNanos sum: 0ns, count: 1, min: 0ns, max: 0ns runningFinishWallNanos sum: 27.06us, count: 1, min: 27.06us, max: 27.06us runningGetOutputWallNanos sum: 618.25us, count: 1, min: 618.25us, max: 618.25us ``` But there is only one project op outside of NestedLoopJoin in gluten plan. ``` : : :- ^(26) ProjectExecTransformer [ticket_id#190L, ticket_name#102, ticket_type#193, assigned#200, category_name#194, category_id#195L, type_name#196, type_id#197L, item_name#198, item_id#199L, sla#203, state#204, created_day#103, break_response#219, break_resolve#220, timeLabel#242, week_end#240] : : : +- ^(26) VeloxBroadcastNestedLoopJoinExecTransformer BuildRight, Inner : : : :- ^(26) ProjectExecTransformer [ticket_id#190L, name#191 AS ticket_name#102, ticket_type#193, assigned#200, category_name#194, category_id#195L, type_name#196, type_id#197L, item_name#198, item_id#199L, sla#203, state#204, from_unixtime(cast((cast(created_at#213L as double) / 1000.0) as bigint), yyyy-MM-dd, Some(Asia/Shanghai)) AS created_day#103, break_response#219, break_resolve#220] : : : : +- ^(26) FilterExecTransformer (((isnotnull(state#204) AND isnotnull(break_response#219)) AND (state#204 = 重新打开)) AND (break_response#219 = 0)) : : : : +- ^(26) NativeScan hive origindb.cti_tt_ticket__ticket [assigned#200, break_resolve#220, break_response#219, category_id#195L, category_name#194, created_at#213L, item_id#199L, item_name#198, name#191, sla#203, state#204, ticket_id#190L, ticket_type#193, type_id#197L, type_name#196], HiveTableRelation [`origindb`.`cti_tt_ticket__ticket`, com.meituan.hidi.hive.HidiSerde, Data Cols: [id#189L, ticket_id#190L, name#191, description#192, ticket_type#193, category_name#194, category..., Partition Cols: []] : : : +- ^(26) InputIteratorTransformer[week_end#240, timeLabel#242] ``` Through adding some logs, I notice gluten incorrectly updates NativeScan transformer metrics with FilterExecTransformer metrics, because the number of project op is not match between gluten plan and velox plan. ### Gluten version Gluten-1.3 ### Spark version Spark-3.5.x ### Spark configurations _No response_ ### System information _No response_ ### Relevant logs ```bash ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
