zabetak commented on code in PR #6515:
URL: https://github.com/apache/hive/pull/6515#discussion_r3363689405


##########
ql/src/test/queries/clientpositive/pcr_null_partition.q:
##########
@@ -0,0 +1,17 @@
+set hive.exec.dynamic.partition.mode=nonstrict;
+set hive.fetch.task.conversion=none;
+
+drop table if exists pcr_t1;
+create table pcr_t1 (key string, value string) partitioned by (ds string);
+
+insert into pcr_t1 partition (ds) select 'A', 'V1', '2000-04-08';
+insert into pcr_t1 partition (ds) select 'B', 'V2', 'null';
+insert into pcr_t1 partition (ds) select 'C', 'V3', null;
+
+explain select key, value, ds from pcr_t1 where ds is null;

Review Comment:
   Is there something useful in the EXPLAIN output? If not we can drop those.



##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java:
##########
@@ -82,9 +87,15 @@ static public Object evalExprWithPart(ExprNodeDesc expr, 
Partition p) throws Hiv
       partNames.add(entry.getKey());
       ObjectInspector oi = 
PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector
           (TypeInfoFactory.getPrimitiveTypeInfo(partKeyTypes[i++]));
-      partValues.add(ObjectInspectorConverters.getConverter(
-          PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi)
-          .convert(entry.getValue()));
+
+      String partitionValue = entry.getValue();
+      if (partitionValue.equals(defaultPartitionName)) {
+        partValues.add(null); // Null for default partition.
+      } else {
+        partValues.add(ObjectInspectorConverters.getConverter(
+            PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi)
+            .convert(partitionValue));
+      }

Review Comment:
   The description of `hive.exec.default.partition.name` property says the 
following:
   ``` 
   "The default partition name in case the dynamic partition column value is 
null/empty string or any other values that cannot be escaped."
   ```
   This code assumes that the value is null but according to the description it 
may be other things as well.
   
   The change here will fix the `IS [NOT] NULL` predicate but may potentially 
change the behavior of some queries:
   ```sql
   SELECT key, value, ds FROM pcr_t1 WHERE ds > 'A';
   SELECT key, value, ds FROM pcr_t1 WHERE ds < 'A';
   ```
   Do the queries return the same results before/after the changes in the PR 
for Iceberg and non-Iceberg tables ?
   Which behavior should prevail?



##########
ql/src/test/queries/clientpositive/pcr_null_partition.q:
##########
@@ -0,0 +1,17 @@
+set hive.exec.dynamic.partition.mode=nonstrict;
+set hive.fetch.task.conversion=none;
+
+drop table if exists pcr_t1;
+create table pcr_t1 (key string, value string) partitioned by (ds string);
+
+insert into pcr_t1 partition (ds) select 'A', 'V1', '2000-04-08';
+insert into pcr_t1 partition (ds) select 'B', 'V2', 'null';
+insert into pcr_t1 partition (ds) select 'C', 'V3', null;

Review Comment:
   The empty string value is another edge case that may create problems
   ```sql
   insert into pcr_t1 partition (ds) select 'D', 'V4', '';
   ```



##########
ql/src/test/queries/clientpositive/pcr_null_partition.q:
##########
@@ -0,0 +1,17 @@
+set hive.exec.dynamic.partition.mode=nonstrict;
+set hive.fetch.task.conversion=none;
+
+drop table if exists pcr_t1;
+create table pcr_t1 (key string, value string) partitioned by (ds string);
+
+insert into pcr_t1 partition (ds) select 'A', 'V1', '2000-04-08';
+insert into pcr_t1 partition (ds) select 'B', 'V2', 'null';
+insert into pcr_t1 partition (ds) select 'C', 'V3', null;
+
+explain select key, value, ds from pcr_t1 where ds is null;
+select key, value, ds from pcr_t1 where ds is null;
+
+explain select key, value, ds from pcr_t1 where ds is not null;
+select key, value, ds from pcr_t1 where ds is not null order by key;
+
+select key, value, ds from pcr_t1 where ds = 'null';

Review Comment:
   Consider enriching the tests with:
   ```sql
   SELECT key, value, ds FROM pcr_t1 WHERE ds > 'A';
   SELECT key, value, ds FROM pcr_t1 WHERE ds < 'A';
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to