Re: [PR] HIVE-27102: Upgrade Calcite to 1.33.0 and Avatica to 1.23.0 [hive]

via GitHub Thu, 09 Jan 2025 07:18:21 -0800


zabetak commented on code in PR #5196:
URL: https://github.com/apache/hive/pull/5196#discussion_r1908879493



##########
ql/src/test/queries/clientpositive/sharedwork.q:
##########
@@ -20,6 +20,24 @@ create table MY_TABLE_0001_01 (
   col_1 string,
   col_100 string);
 
+explain cbo SELECT
+  Table__323.col_7,
+  CAST(Table__323.col_3 AS DATE) col_3,
+  Table__323.col_20,  
+  Table__1232.col_21 col_21_1232,
+  Table__323.col_1,
+  Table__133.col_22,
+  Table__879.col_21 col_21_879
+  ,Table__133.col_23
+FROM MY_TABLE_0001 Table__323
+LEFT OUTER JOIN MY_TABLE_0003 Table__1232 ON 
(Table__323.col_20=Table__1232.col_24)
+LEFT OUTER JOIN MY_TABLE_0001_00 Table__133 ON 
(Table__323.col_1=Table__133.col_1)
+LEFT OUTER JOIN MY_TABLE_0003 Table__879 ON 
(Table__133.col_22=Table__879.col_24)
+LEFT OUTER JOIN MY_TABLE_0001_01 Table__1215 ON 
(Table__323.col_1=Table__1215.col_1 and Table__1215.col_100 = 210)
+WHERE 1=1
+AND  (cast(Table__323.col_7 AS DOUBLE) IS NOT NULL OR Table__323.col_7 IS NULL)
+AND CAST(Table__323.col_3 AS DATE)  BETWEEN  '2018-07-01'  AND  '2019-01-23'
+AND Table__323.col_20  IN  ('part1','part2','part3');

Review Comment:
   It's unclear what is the purpose of the new EXPLAIN CBO statements. If they 
are just for debugging purposes they should be removed. If they are supposed to 
guard against some regression then they should be moved into another .q file 
with a descriptive name that explains their purpose.



##########
ql/src/test/results/clientpositive/llap/explainuser_4.q.out:
##########
@@ -119,24 +119,24 @@ Stage-0
           PARTITION_ONLY_SHUFFLE [RS_11]
             Group By Operator [GBY_10] (rows=1 width=8)
               Output:["_col0"],aggregations:["count()"]
-              Merge Join Operator [MERGEJOIN_29] (rows=4626 width=8)
+              Merge Join Operator [MERGEJOIN_29] (rows=3449 width=8)
                 Conds:RS_32._col0=RS_35._col0(Inner)
               <-Map 1 [SIMPLE_EDGE] vectorized, llap
                 SHUFFLE [RS_32]
                   PartitionCols:_col0
-                  Select Operator [SEL_31] (rows=3078 width=2)
+                  Select Operator [SEL_31] (rows=2297 width=2)
                     Output:["_col0"]
-                    Filter Operator [FIL_30] (rows=3078 width=2)
-                      predicate:cint BETWEEN 1000000 AND 3000000
+                    Filter Operator [FIL_30] (rows=2297 width=2)
+                      predicate:(cint is not null and cint BETWEEN 1000000 AND 
3000000)
                       TableScan [TS_0] (rows=12288 width=2)
                         
default@alltypesorc,a,Tbl:COMPLETE,Col:COMPLETE,Output:["cint"]
               <-Map 4 [SIMPLE_EDGE] vectorized, llap
                 SHUFFLE [RS_35]
                   PartitionCols:_col0
-                  Select Operator [SEL_34] (rows=2298 width=2)
+                  Select Operator [SEL_34] (rows=1715 width=2)
                     Output:["_col0"]
-                    Filter Operator [FIL_33] (rows=2298 width=8)
-                      predicate:(cint BETWEEN 1000000 AND 3000000 and cbigint 
is not null)
+                    Filter Operator [FIL_33] (rows=1715 width=8)
+                      predicate:(cint is not null and cint BETWEEN 1000000 AND 
3000000 and cbigint is not null)

Review Comment:
   It is interesting that the seemingly redundant `IS NOT NULL` predicate 
appears alongside the `BETWEEN` clause. It’s probably not gonna harm the 
performance much but it would be nice to see if it is something caused by Hive 
changes or due to Calcite upgrade. I assume that it has something to do with 
the SEARCH operator.
   
   @kasakrisz Is this the same redundancy that you have been investigating?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For additional commands, e-mail: gitbox-h...@hive.apache.org

Re: [PR] HIVE-27102: Upgrade Calcite to 1.33.0 and Avatica to 1.23.0 [hive]

Reply via email to