[ https://issues.apache.org/jira/browse/HIVE-27264?focusedWorklogId=858179&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-858179 ]
ASF GitHub Bot logged work on HIVE-27264:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 20/Apr/23 10:04
Start Date: 20/Apr/23 10:04
Worklog Time Spent: 10m
Work Description: kasakrisz commented on code in PR #4237:
URL: https://github.com/apache/hive/pull/4237#discussion_r1172371274
##########
ql/src/test/org/apache/hadoop/hive/ql/optimizer/calcite/rules/TestHivePointLookupOptimizerRule.java:
##########
@@ -348,4 +356,100 @@ public void testRecursionIsNotObstructed() {
condition.toString());
}
+ @Test
+ public void testSameVarcharLiteralDifferentPrecision() {
+
+ final RexBuilder rexBuilder = relBuilder.getRexBuilder();
+ RelDataType stringType30 = rexBuilder.getTypeFactory().createTypeWithCharsetAndCollation(
+ rexBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR, 30),
+ Charset.forName(ConversionUtil.NATIVE_UTF16_CHARSET_NAME), SqlCollation.IMPLICIT);
+ RexNode lita30 = rexBuilder.makeLiteral(RexNodeExprFactory.makeHiveUnicodeString("AAA111"), stringType30, true);
+ RexNode litb30 = rexBuilder.makeLiteral(RexNodeExprFactory.makeHiveUnicodeString("BBB222"), stringType30, true);
+
+ RelDataType stringType14 = rexBuilder.getTypeFactory().createTypeWithCharsetAndCollation(
+ rexBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR, 14),
+ Charset.forName(ConversionUtil.NATIVE_UTF16_CHARSET_NAME), SqlCollation.IMPLICIT);
+ RexNode lita14 = rexBuilder.makeLiteral(RexNodeExprFactory.makeHiveUnicodeString("AAA111"), stringType14, true);
+ RexNode litb14 = rexBuilder.makeLiteral(RexNodeExprFactory.makeHiveUnicodeString("BBB222"), stringType14, true);
+
+ final RelNode basePlan = relBuilder
+ .scan("t")
+ .filter(and(relBuilder,
+ relBuilder.call(SqlStdOperatorTable.IN, relBuilder.field("f2"), lita30, litb30),
+ relBuilder.call(SqlStdOperatorTable.IN, relBuilder.field("f2"), lita14, litb14)))
+ .build();
+
+ planner.setRoot(basePlan);
+ RelNode optimizedRelNode = planner.findBestExp();
+
+ HiveFilter filter = (HiveFilter) optimizedRelNode;
+ RexNode condition = filter.getCondition();
+ System.out.println(condition);
+ assertEquals("IN($1, " +
+ "_UTF-16LE'AAA111':VARCHAR(30) CHARACTER SET \"UTF-16LE\", " +
+ "_UTF-16LE'BBB222':VARCHAR(30) CHARACTER SET \"UTF-16LE\")",
Review Comment:
Unfortunately, in Calcite 1.25 `RexSimplify` returns the input expression
unchanged, so it cannot recognize literals that have the same value and type
but different precision.
I also tested a similar expression with Calcite 1.33:
```
AND(OR(=($0, _UTF-16LE'AAA111'), =($0, _UTF-16LE'BBB222')), OR(=($0, _UTF-16LE'AAA111'), =($0, _UTF-16LE'BBB222')))
```
and I got
```
SEARCH($0, Sarg[_UTF-16LE'AAA111':VARCHAR(30) CHARACTER SET "UTF-16LE", _UTF-16LE'BBB222':VARCHAR(30) CHARACTER SET "UTF-16LE"]:VARCHAR(30) CHARACTER SET "UTF-16LE")
```
In Calcite 1.33 an IN expression with constants is no longer represented by a
`RexCall` but by `SEARCH`, so I had to transform the original expression into
`OR`s; the literals still have different precision.
This time the expression was simplified.
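A minimal, self-contained sketch of why precision matters when two IN lists on the same column are merged. It is a toy model, not Hive or Calcite code: the `Lit` class and both `intersect*` helpers are hypothetical, and it assumes the merge behaves like a set intersection of the literals. If literal equality includes the VARCHAR precision, the intersection of the two lists is empty; if only the values are compared, both literals survive.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class InListIntersection {

  /** Hypothetical stand-in for a typed literal: a value plus its VARCHAR precision. */
  static final class Lit {
    final String value;
    final int precision;

    Lit(String value, int precision) {
      this.value = value;
      this.precision = precision;
    }

    // Equality deliberately includes the precision, mirroring the problem case.
    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Lit)) {
        return false;
      }
      Lit other = (Lit) o;
      return value.equals(other.value) && precision == other.precision;
    }

    @Override
    public int hashCode() {
      return Objects.hash(value, precision);
    }
  }

  /** Intersects two IN lists using full literal equality (value and precision). */
  static Set<Lit> intersectStrict(List<Lit> a, List<Lit> b) {
    Set<Lit> result = new LinkedHashSet<>(a);
    result.retainAll(new HashSet<>(b));
    return result;
  }

  /** Intersects two IN lists by value only, ignoring precision. */
  static Set<String> intersectByValue(List<Lit> a, List<Lit> b) {
    Set<String> result = new LinkedHashSet<>();
    for (Lit l : a) {
      result.add(l.value);
    }
    Set<String> other = new HashSet<>();
    for (Lit l : b) {
      other.add(l.value);
    }
    result.retainAll(other);
    return result;
  }

  public static void main(String[] args) {
    List<Lit> in30 = Arrays.asList(new Lit("AAA111", 30), new Lit("BBB222", 30));
    List<Lit> in14 = Arrays.asList(new Lit("AAA111", 14), new Lit("BBB222", 14));
    System.out.println(intersectStrict(in30, in14).isEmpty()); // true
    System.out.println(intersectByValue(in30, in14));          // [AAA111, BBB222]
  }
}
```

In this model the strict intersection corresponds to a filter that can never match, while the value-based one keeps both literals, as in the expected `IN($1, ...)` condition asserted by the test.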
Issue Time Tracking
-------------------
Worklog Id: (was: 858179)
Time Spent: 2h 20m (was: 2h 10m)
> Literals in conjunction of two IN expression are considered not equals if
> type precision is different
> -----------------------------------------------------------------------------------------------------
>
> Key: HIVE-27264
> URL: https://issues.apache.org/jira/browse/HIVE-27264
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> {code}
> create table r_table (
> string_col varchar(30)
> );
> create table l_table (
> string_col varchar(14)
> );
> insert into r_table VALUES ('AAA111');
> insert into l_table VALUES ('AAA111');
> SELECT l_table.string_col from l_table, r_table
> WHERE r_table.string_col = l_table.string_col AND l_table.string_col IN
> ('AAA111', 'BBB222') AND r_table.string_col IN ('AAA111', 'BBB222');
> {code}
> This should return one row:
> {code}
> AAA111
> {code}
> but it returns an empty result set.
> Workaround:
> {code}
> set hive.optimize.point.lookup=false;
> {code}