zabetak commented on code in PR #4237:
URL: https://github.com/apache/hive/pull/4237#discussion_r1169885465


##########
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HivePointLookupOptimizerRule.java:
##########
@@ -669,6 +670,22 @@ private static RexNode handleAND(RexBuilder rexBuilder, RexCall call) {
       return RexUtil.composeConjunction(rexBuilder, newOperands, false);
     }
 
+    private static void retainAll(Collection<RexNode> elementsToRetain, Collection<RexNode> collection) {
+      collection.removeIf(rexNode -> elementsToRetain.stream().noneMatch(
+              rexNodeToRetain -> equalsWithSimilarType(rexNode, rexNodeToRetain)));
+    }
+
+    private static boolean equalsWithSimilarType(RexNode rexNode1, RexNode rexNode2) {
+      if (!(rexNode1 instanceof RexLiteral) || !(rexNode2 instanceof RexLiteral)) {
+        return rexNode1.equals(rexNode2);
+      }
+
+      RexLiteral rexLiteral1 = (RexLiteral) rexNode1;
+      RexLiteral rexLiteral2 = (RexLiteral) rexNode2;
+      return rexLiteral1.getValue().compareTo(rexLiteral2.getValue()) == 0 &&

Review Comment:
   Can we arrive here with a NULL literal? If so, does `getValue()` return 
   something usable, or does the `compareTo` call fail with an NPE?
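
   To make the concern concrete, here is a minimal, standalone sketch of a 
   null-safe value check. It assumes `getValue()` may return `null` for a NULL 
   literal. `NullSafeLiteralCompare` and `sameValue` are hypothetical names, and 
   the plain `Comparable` arguments stand in for the results of 
   `RexLiteral#getValue()`. Whether two NULL literals should count as equal for 
   this rule is also an assumption, not something the patch states.

```java
/** Hypothetical stand-in for the value comparison in equalsWithSimilarType. */
final class NullSafeLiteralCompare {

  private NullSafeLiteralCompare() {
  }

  /**
   * Compares two literal values with compareTo == 0, as the patch does,
   * but never dereferences a null value.
   * The arguments stand in for what RexLiteral#getValue() would return.
   */
  @SuppressWarnings({"rawtypes", "unchecked"})
  static boolean sameValue(Comparable value1, Comparable value2) {
    if (value1 == null || value2 == null) {
      // Assumption: two NULL values are treated as equal, and a NULL value
      // never equals a non-NULL one. Adjust if the rule needs other semantics.
      return value1 == value2;
    }
    return value1.compareTo(value2) == 0;
  }
}
```

   With a guard like this, `sameValue(null, "AAA111")` returns `false` instead 
   of throwing, so a NULL literal in one of the `IN` lists would simply not be 
   retained.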



##########
ql/src/test/org/apache/hadoop/hive/ql/optimizer/calcite/rules/TestHivePointLookupOptimizerRule.java:
##########
@@ -348,4 +356,100 @@ public void testRecursionIsNotObstructed() {
         condition.toString());
   }
 
+  @Test
+  public void testSameVarcharLiteralDifferentPrecision() {
+
+    final RexBuilder rexBuilder = relBuilder.getRexBuilder();
+    RelDataType stringType30 = rexBuilder.getTypeFactory().createTypeWithCharsetAndCollation(
+            rexBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR, 30),
+            Charset.forName(ConversionUtil.NATIVE_UTF16_CHARSET_NAME), SqlCollation.IMPLICIT);
+    RexNode lita30 = rexBuilder.makeLiteral(RexNodeExprFactory.makeHiveUnicodeString("AAA111"), stringType30, true);
+    RexNode litb30 = rexBuilder.makeLiteral(RexNodeExprFactory.makeHiveUnicodeString("BBB222"), stringType30, true);
+
+    RelDataType stringType14 = rexBuilder.getTypeFactory().createTypeWithCharsetAndCollation(
+            rexBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR, 14),
+            Charset.forName(ConversionUtil.NATIVE_UTF16_CHARSET_NAME), SqlCollation.IMPLICIT);
+    RexNode lita14 = rexBuilder.makeLiteral(RexNodeExprFactory.makeHiveUnicodeString("AAA111"), stringType14, true);
+    RexNode litb14 = rexBuilder.makeLiteral(RexNodeExprFactory.makeHiveUnicodeString("BBB222"), stringType14, true);
+
+    final RelNode basePlan = relBuilder
+          .scan("t")
+          .filter(and(relBuilder,
+                  relBuilder.call(SqlStdOperatorTable.IN, relBuilder.field("f2"), lita30, litb30),
+                  relBuilder.call(SqlStdOperatorTable.IN, relBuilder.field("f2"), lita14, litb14)))
+          .build();
+
+    planner.setRoot(basePlan);
+    RelNode optimizedRelNode = planner.findBestExp();
+
+    HiveFilter filter = (HiveFilter) optimizedRelNode;
+    RexNode condition = filter.getCondition();
+    System.out.println(condition);
+    assertEquals("IN($1, " +
+                    "_UTF-16LE'AAA111':VARCHAR(30) CHARACTER SET \"UTF-16LE\", " +
+                    "_UTF-16LE'BBB222':VARCHAR(30) CHARACTER SET \"UTF-16LE\")",

Review Comment:
   Did you check whether the results are in line with `RexSimplify` and 
   `ReduceExpressionsRules`?



##########
ql/src/test/queries/clientpositive/pointlookup6.q:
##########
@@ -0,0 +1,19 @@
+set hive.optimize.point.lookup.min=2;
+
+create table r_table (
+  string_col varchar(30)
+);
+
+create table l_table (
+  string_col varchar(14)
+);
+
+insert into r_table VALUES ('AAA111');
+insert into l_table VALUES ('AAA111');
+
+explain cbo
+SELECT l_table.string_col from l_table, r_table
+WHERE r_table.string_col = l_table.string_col AND l_table.string_col IN ('AAA111', 'BBB222') AND r_table.string_col IN ('AAA111', 'BBB222');

Review Comment:
   If we write an equivalent query with `OR` instead of `IN`, do we still hit 
   the problem? Do we get the expected plan when the condition is expressed 
   with `OR`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

