kasakrisz commented on code in PR #5781:
URL: https://github.com/apache/hive/pull/5781#discussion_r2075006717
##########
ql/src/java/org/apache/hadoop/hive/ql/Context.java:
##########
@@ -361,8 +361,16 @@ private DestClausePrefix getMergeDestClausePrefix(ASTNode curNode) {
assert insert != null && insert.getType() == HiveParser.TOK_INSERT;
ASTNode query = (ASTNode) insert.getParent();
assert query != null && query.getType() == HiveParser.TOK_QUERY;
-
- int tokFromIdx = query.getFirstChildWithType(HiveParser.TOK_FROM).getChildIndex();
+ ASTNode from = (ASTNode) query.getFirstChildWithType(HiveParser.TOK_FROM);
+
+ if (from == null) {
+ // We are here when TOK_FROM is missing from the AST.
+ // This can happen for merge queries with a predicate like `<joining_column> is null`
+ // in the matched clause.
+ return DestClausePrefix.MERGE;
+ }
Review Comment:
In the `merge on read` case, the merge statement is rewritten to a multi-insert
in which each insert branch represents one of the `when matched` clauses or the
`when not matched` clause. We use the `DestClausePrefix` values `INSERT`,
`UPDATE`, and `DELETE` accordingly to tell the `FileSinkOperator` and the
underlying `RecordWriter` (e.g. `OrcRecordWriter`) which type of delta is being
written.
Copy on write works differently: AFAIK the merge is rewritten to an insert
overwrite statement with a select clause. The select clause is a `union all`
query in which each branch represents one of the `when matched` clauses or the
`when not matched` clause of the merge. To distinguish a merge from a plain
insert overwrite, we needed a new `DestClausePrefix`.
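
To make the difference concrete, here is a minimal, self-contained sketch of
how the two rewrites map onto destination prefixes. It is not Hive's code: the
`Strategy` and `Clause` enums and the `prefixesFor` helper are hypothetical
names invented for illustration, and the local `DestClausePrefix` enum only
mirrors the constant names mentioned above (`INSERT`, `UPDATE`, `DELETE`,
`MERGE`). It assumes one destination per branch for merge on read and a single
destination for copy on write.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; none of these types are Hive's real classes.
final class DestClausePrefixSketch {

    // Mirrors the constant names discussed in the review, nothing more.
    enum DestClausePrefix { INSERT, UPDATE, DELETE, MERGE }

    // Hypothetical: the two rewrite strategies described in the comment.
    enum Strategy { MERGE_ON_READ, COPY_ON_WRITE }

    // Hypothetical: the clauses a merge statement may contain.
    enum Clause { WHEN_MATCHED_UPDATE, WHEN_MATCHED_DELETE, WHEN_NOT_MATCHED_INSERT }

    // Returns one prefix per destination of the rewritten statement.
    static List<DestClausePrefix> prefixesFor(Strategy strategy, List<Clause> clauses) {
        List<DestClausePrefix> prefixes = new ArrayList<>();
        if (strategy == Strategy.COPY_ON_WRITE) {
            // Assumed: one insert overwrite whose select is a union all of the
            // branches, so there is a single destination tagged MERGE.
            prefixes.add(DestClausePrefix.MERGE);
            return prefixes;
        }
        // Merge on read: a multi-insert with one branch, and one prefix, per clause.
        for (Clause clause : clauses) {
            switch (clause) {
                case WHEN_MATCHED_UPDATE:
                    prefixes.add(DestClausePrefix.UPDATE);
                    break;
                case WHEN_MATCHED_DELETE:
                    prefixes.add(DestClausePrefix.DELETE);
                    break;
                case WHEN_NOT_MATCHED_INSERT:
                    prefixes.add(DestClausePrefix.INSERT);
                    break;
            }
        }
        return prefixes;
    }

    public static void main(String[] args) {
        List<Clause> clauses = Arrays.asList(
            Clause.WHEN_MATCHED_UPDATE,
            Clause.WHEN_MATCHED_DELETE,
            Clause.WHEN_NOT_MATCHED_INSERT);
        System.out.println(prefixesFor(Strategy.MERGE_ON_READ, clauses)); // [UPDATE, DELETE, INSERT]
        System.out.println(prefixesFor(Strategy.COPY_ON_WRITE, clauses)); // [MERGE]
    }
}
```

In this model the copy-on-write path never consults the individual clauses,
which is the reason a separate prefix is useful: a downstream consumer cannot
infer "this is a merge" from `INSERT` alone.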
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]