okumin commented on code in PR #5245:
URL: https://github.com/apache/hive/pull/5245#discussion_r1597770008
##########
ql/src/test/results/clientpositive/llap/cbo_rp_groupby3_noskew_multi_distinct.q.out:
##########
@@ -185,3 +185,118 @@ POSTHOOK: type: QUERY
 POSTHOOK: Input: default@dest1_n123
 #### A masked pattern was here ####
 130091.0 260.182 256.10355987055016 98.0 0.0 142.9268095075238 143.06995106518906 20428.072876000002 20469.010897795593 79136.0 309.0
+PREHOOK: query: CREATE TABLE test (col1 INT, col2 INT)
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@test
+POSTHOOK: query: CREATE TABLE test (col1 INT, col2 INT)
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@test
+PREHOOK: query: INSERT INTO test VALUES (1, 100), (2, 200), (2, 200), (3, 300)
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+PREHOOK: Output: default@test
+POSTHOOK: query: INSERT INTO test VALUES (1, 100), (2, 200), (2, 200), (3, 300)
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+POSTHOOK: Output: default@test
+POSTHOOK: Lineage: test.col1 SCRIPT []
+POSTHOOK: Lineage: test.col2 SCRIPT []
+PREHOOK: query: EXPLAIN
+SELECT
+  SUM(DISTINCT col1),
+  COUNT(DISTINCT col1),
+  SUM(col2), -- This has to refer to the key for `SUM(DISTINCT col2)`
+  MAX(DISTINCT col1),
+  SUM(DISTINCT col2),
+  MIN(DISTINCT col1)
+FROM test
+PREHOOK: type: QUERY
+PREHOOK: Input: default@test
+#### A masked pattern was here ####
+POSTHOOK: query: EXPLAIN
+SELECT
+  SUM(DISTINCT col1),
+  COUNT(DISTINCT col1),
+  SUM(col2), -- This has to refer to the key for `SUM(DISTINCT col2)`
+  MAX(DISTINCT col1),
+  SUM(DISTINCT col2),
+  MIN(DISTINCT col1)
+FROM test
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@test
+#### A masked pattern was here ####
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Tez
+#### A masked pattern was here ####
+      Edges:
+        Reducer 2 <- Map 1 (SIMPLE_EDGE)
+#### A masked pattern was here ####
+      Vertices:
+        Map 1
+            Map Operator Tree:
+                TableScan
+                  alias: test
+                  Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
+                  Select Operator
+                    expressions: col1 (type: int), col2 (type: int)
+                    outputColumnNames: col1, col2
+                    Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
+                    Reduce Output Operator
+                      key expressions: col1 (type: int), col2 (type: int)
+                      null sort order: zz
+                      sort order: ++
+                      Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
+            Execution mode: vectorized, llap
+            LLAP IO: all inputs
+        Reducer 2
+            Execution mode: llap
+            Reduce Operator Tree:
+              Group By Operator
+                aggregations: sum(DISTINCT KEY._col0:0._col0), count(DISTINCT KEY._col0:1._col0), sum(KEY._col0:3._col0), max(DISTINCT KEY._col0:2._col0), sum(DISTINCT KEY._col0:3._col0), min(DISTINCT KEY._col0:4._col0)

Review Comment:
   Without this patch, the third UDAF is `sum(KEY._col0:1._col0)` instead of `sum(KEY._col0:3._col0)`. Key index 1 belongs to `COUNT(DISTINCT col1)`, so `SUM(col2)` would read `col1` rather than `col2`.
   https://github.com/apache/hive/pull/5245/commits/3681775c9e183208fcde5d1d429fb5cf31fe81be#diff-c0641a94a4e5e0c28830fcdccd252506a0dc7842a24aa414bf096a229e8e4235R262

   With `set hive.cbo.returnpath.hiveop=false;`, Hive generates the same plan as the one in this q.out.
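
   For a quick sanity check of what the query should return on the four inserted rows, the aggregates can be computed by hand. The snippet below is a minimal Python sketch and not part of the patch; it only does arithmetic on the test data and does not model how Hive evaluates the plan. The names `rows`, `expected` and `wrong_column_sum` are made up for illustration.

```python
# Rows from the test: INSERT INTO test VALUES (1, 100), (2, 200), (2, 200), (3, 300)
rows = [(1, 100), (2, 200), (2, 200), (3, 300)]
col1 = [r[0] for r in rows]
col2 = [r[1] for r in rows]

# Aggregates in the same order as the SELECT list of the test query.
expected = (
    sum(set(col1)),  # SUM(DISTINCT col1)
    len(set(col1)),  # COUNT(DISTINCT col1)
    sum(col2),       # SUM(col2), non-distinct, so the duplicate row counts twice
    max(set(col1)),  # MAX(DISTINCT col1)
    sum(set(col2)),  # SUM(DISTINCT col2)
    min(set(col1)),  # MIN(DISTINCT col1)
)
print(expected)

# Plain sum over col1, shown only to illustrate that summing the wrong
# column gives a different total than SUM(col2). This is an assumption
# for illustration, not a claim about the exact value a buggy plan emits.
wrong_column_sum = sum(col1)
print(wrong_column_sum)
```

   On this data `SUM(col2)` is 800 while the plain sum of `col1` is 8, so a plan that binds the third aggregate to the wrong key cannot return the right value for this query.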
