Re: [PR] HIVE-27102: Upgrade Calcite to 1.33.0 and Avatica to 1.23.0 [hive]

via GitHub Tue, 21 Jan 2025 11:59:46 -0800


zabetak commented on code in PR #5196:
URL: https://github.com/apache/hive/pull/5196#discussion_r1922314579



##########
ql/src/test/results/clientpositive/llap/pcs.q.out:
##########
@@ -1286,16 +1293,18 @@ PREHOOK: type: QUERY
 PREHOOK: Input: default@pcs_t1
 PREHOOK: Input: default@pcs_t1@ds=2000-04-08
 PREHOOK: Input: default@pcs_t1@ds=2000-04-09
+PREHOOK: Input: default@pcs_t1@ds=2000-04-10
 #### A masked pattern was here ####
 POSTHOOK: query: explain extended select ds from pcs_t1 where struct(ds, key, rand(100)) in (struct('2000-04-08',1,0.2), struct('2000-04-09',2,0.3))
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@pcs_t1
 POSTHOOK: Input: default@pcs_t1@ds=2000-04-08
 POSTHOOK: Input: default@pcs_t1@ds=2000-04-09
+POSTHOOK: Input: default@pcs_t1@ds=2000-04-10
 #### A masked pattern was here ####
 OPTIMIZED SQL: SELECT `ds`
 FROM `default`.`pcs_t1`
-WHERE  (`ds`, `key`, RAND(100)) IN ( ('2000-04-08', 1, 0.2),  ('2000-04-09', 2, 0.3))
+WHERE (`ds`, `key`, RAND(100)) = ('2000-04-08', 1, 0.2) OR (`ds`, `key`, RAND(100)) = ('2000-04-09', 2, 0.3)

Review Comment:
   These changes are problematic because it seems that partition pruning does 
not kick in. These tests were added as part of HIVE-11634, so it seems that 
some part of that optimization broke as part of the upgrade. I guess this is 
related to the transformation of IN to OR.
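
To make the concern concrete, here is a minimal sketch in Python. It is a toy model, not Hive or Calcite code: the helper names `partition_values_from_in` and `partition_values_from_or` are hypothetical, and the only inputs taken from the diff are the column list and the two tuples. It illustrates that the `IN` form and the rewritten `OR` form constrain `ds` to the same two values, so a pruner that understands both shapes would skip `ds=2000-04-10` in either case.

```python
# Toy model only: this does not reproduce Hive's or Calcite's pruning logic.

def partition_values_from_in(columns, rows, partition_col):
    """Values of partition_col allowed by (columns) IN (rows)."""
    idx = columns.index(partition_col)
    return {row[idx] for row in rows}


def partition_values_from_or(disjuncts, partition_col):
    """Values of partition_col allowed by an OR of (columns) = (row) terms."""
    allowed = set()
    for columns, row in disjuncts:
        allowed.add(row[columns.index(partition_col)])
    return allowed


# Columns and tuples as they appear in the query from the diff.
columns = ("ds", "key", "rand")
rows = [("2000-04-08", 1, 0.2), ("2000-04-09", 2, 0.3)]

in_form = partition_values_from_in(columns, rows, "ds")
or_form = partition_values_from_or([(columns, r) for r in rows], "ds")

# Partitions listed as inputs in the updated .q.out file.
all_partitions = ["2000-04-08", "2000-04-09", "2000-04-10"]
kept = [p for p in all_partitions if p in or_form]
```

In this model `kept` excludes `2000-04-10` for both predicate shapes, which matches the expectation that the rewrite alone should not widen the set of scanned partitions.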



##########
ql/src/test/queries/clientpositive/pointlookup3.q:
##########
@@ -1,4 +1,6 @@
 --! qt:dataset:src
+-- SORT_QUERY_RESULTS
+

Review Comment:
   All of the queries in this file contain an explicit `ORDER BY` clause. 
Adding the `SORT_QUERY_RESULTS` directive overrides the ordering produced by 
the `ORDER BY` clauses and makes it impossible to verify whether `ORDER BY` 
really works. 
   
   I see three options:
   * If the ORDER BY clause is not important for the test then we can remove 
all of them and keep the `SORT_QUERY_RESULTS` post-processing step.
   * If the ORDER BY clause is important then we should consider enriching it 
with additional columns if we want to make the result more deterministic.
   * If the results do not change at each execution then we could simply update 
the .q.out file without adding any changes in the .q file.
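
The trade-off can be sketched in a few lines of Python. This is an illustration under assumptions, not the QTest implementation: `sort_query_results` stands in for a post-processing step that sorts the output lines as text, `broken_order_by` is a hypothetical engine bug that returns rows in the wrong order, and the rows themselves are made up.

```python
def render(rows):
    """Render rows the way a result file might: one tab-separated line per row."""
    return [f"{key}\t{value}" for key, value in rows]


def sort_query_results(lines):
    """Assumed post-processing: sort the output lines before comparison."""
    return sorted(lines)


def broken_order_by(rows):
    """Hypothetical bug: ORDER BY key returns rows in reverse order."""
    return sorted(rows, key=lambda r: r[0], reverse=True)


rows = [(k, f"val_{k}") for k in range(10)]
correct = render(sorted(rows, key=lambda r: r[0]))
broken = render(broken_order_by(rows))

# Without post-sorting, the wrong order is visible in the output.
order_bug_visible = broken != correct
# With post-sorting, both outputs compare equal, so the bug goes unnoticed.
order_bug_hidden = sort_query_results(broken) == sort_query_results(correct)

# Second option from the list above: add a tie-breaking column to ORDER BY.
input_a = [(1, "b"), (1, "a"), (0, "c")]
input_b = [(1, "a"), (1, "b"), (0, "c")]
by_key_a = sorted(input_a, key=lambda r: r[0])
by_key_b = sorted(input_b, key=lambda r: r[0])
by_key_value_a = sorted(input_a, key=lambda r: (r[0], r[1]))
by_key_value_b = sorted(input_b, key=lambda r: (r[0], r[1]))
```

Ordering by `key` alone leaves the relative order of ties dependent on the input order, while ordering by `key, value` yields the same result for both inputs, which is the kind of determinism the second option aims for.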



##########
ql/src/test/results/clientpositive/llap/pcs.q.out:
##########
@@ -1355,6 +1364,40 @@ STAGE PLANS:
               serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
             serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
           
+              input format: org.apache.hadoop.mapred.TextInputFormat
+              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+              properties:
+                bucketing_version 2
+                column.name.delimiter ,
+                columns key,value
+                columns.comments 
+                columns.types int:string
+#### A masked pattern was here ####
+                name default.pcs_t1
+                partition_columns ds
+                partition_columns.types string
+                serialization.format 1
+                serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+              name: default.pcs_t1
+            name: default.pcs_t1
+          Partition
+            input format: org.apache.hadoop.mapred.TextInputFormat
+            output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+            partition values:
+              ds 2000-04-10
+            properties:
+              column.name.delimiter ,
+              columns key,value
+              columns.types int:string
+#### A masked pattern was here ####
+              name default.pcs_t1
+              partition_columns ds
+              partition_columns.types string
+              serialization.format 1
+              serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+            serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+          

Review Comment:
   According to the plan, we are now considering the partition `2000-04-10`, 
which was not the case before, so it's problematic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

