zabetak commented on code in PR #5781:
URL: https://github.com/apache/hive/pull/5781#discussion_r2073771610
##########
ql/src/java/org/apache/hadoop/hive/ql/Context.java:
##########
@@ -361,8 +361,16 @@ private DestClausePrefix getMergeDestClausePrefix(ASTNode
curNode) {
assert insert != null && insert.getType() == HiveParser.TOK_INSERT;
ASTNode query = (ASTNode) insert.getParent();
assert query != null && query.getType() == HiveParser.TOK_QUERY;
-
- int tokFromIdx =
query.getFirstChildWithType(HiveParser.TOK_FROM).getChildIndex();
+ ASTNode from = (ASTNode) query.getFirstChildWithType(HiveParser.TOK_FROM);
+
+ if (from == null) {
+ // We are here when TOK_FROM is missing from the AST.
+ // This can happen for merge queries with a predicate like
`<joining_column> is null`
+ // in the matched clause.
+ return DestClausePrefix.MERGE;
+ }
Review Comment:
I understand that the FROM clause may be missing but why do we have to
return `DestClausePrefix.MERGE` and not one of the other constants?
##########
iceberg/iceberg-handler/src/test/queries/positive/merge_with_null_check_on_joining_col.q:
##########
@@ -0,0 +1,24 @@
+
+create table target(a int, b int, c int) stored by iceberg
tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write');
+create table source(a int, b int, c int) stored by iceberg
tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write');
+
+-- empty plan as joining column cannot be null for matched clause
+explain cbo
+merge into target as t using source as s on t.a = s.a and t.b = s.b
+when matched and t.a is null then delete;
+
+explain cbo
+merge into target as t using source as s on t.a = s.a and t.b = s.b
+when matched and t.a is null then update set b = t.b + 10;
Review Comment:
Does it make sense to add tests where the merge statement has multiple
branches (delete, update, insert) and only one of them is false each time?
##########
iceberg/iceberg-handler/src/test/queries/positive/merge_with_null_check_on_joining_col.q:
##########
@@ -0,0 +1,24 @@
+
+create table target(a int, b int, c int) stored by iceberg
tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write');
+create table source(a int, b int, c int) stored by iceberg
tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write');
+
+-- empty plan as joining column cannot be null for matched clause
+explain cbo
+merge into target as t using source as s on t.a = s.a and t.b = s.b
+when matched and t.a is null then delete;
+
+explain cbo
+merge into target as t using source as s on t.a = s.a and t.b = s.b
+when matched and t.a is null then update set b = t.b + 10;
+--------------------------------------------------------------------
+
+ -- non empty plans for these queries
+explain cbo
+merge into target as t using source as s on t.a = s.a and t.b = s.b
+when not matched and t.a is null then insert values (1, 2, 3);
+
+explain cbo
+merge into target as t using source as s on t.a = s.a and t.b = s.b
+when matched and t.a is null then delete
+when matched then update set b = t.b + 10
+when not matched then insert values (1, 2, 3);
Review Comment:
I assume we already have tests with MERGE statements and non-empty branches
so if the above queries do not add new coverage then I would suggest to remove
them.
##########
iceberg/iceberg-handler/src/test/queries/positive/merge_with_null_check_on_joining_col.q:
##########
@@ -0,0 +1,24 @@
+
+create table target(a int, b int, c int) stored by iceberg
tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write');
+create table source(a int, b int, c int) stored by iceberg
tblproperties('format-version'='2', 'write.merge.mode'='copy-on-write');
Review Comment:
It seems that the problem is not specific to iceberg but affects all kinds
of merge statements. If that's the case it may be good to write the test case
using regular ACID tables and the more common `TestMiniLlapLocalCliDriver`.
##########
ql/src/java/org/apache/hadoop/hive/ql/Context.java:
##########
@@ -361,8 +361,16 @@ private DestClausePrefix getMergeDestClausePrefix(ASTNode
curNode) {
assert insert != null && insert.getType() == HiveParser.TOK_INSERT;
ASTNode query = (ASTNode) insert.getParent();
assert query != null && query.getType() == HiveParser.TOK_QUERY;
-
- int tokFromIdx =
query.getFirstChildWithType(HiveParser.TOK_FROM).getChildIndex();
+ ASTNode from = (ASTNode) query.getFirstChildWithType(HiveParser.TOK_FROM);
Review Comment:
According to the comment a few line above the code here assumes that the AST
has a certain shape. I am not sure to what extend other parts of the code rely
on this structure. Have you checked if there is a more intuitive fix when going
from CBO to AST?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]