[
https://issues.apache.org/jira/browse/HIVE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828820#comment-15828820
]
Eugene Koifman commented on HIVE-14949:
---------------------------------------
Maybe a simpler strategy is to create a RaiseErrorForMerge() UDF and run:
  insert into tmp_table select RaiseErrorForMerge() where <on clause expr>
  group by target.ROW__ID having count(*) > 1
We never actually write anything to tmp_table, but if the select produces any
rows at all, RaiseErrorForMerge() will throw and kill the query. This avoids
any need for a post hook to check the table.
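
A rough sketch of what the check above amounts to, written in plain Java
rather than as a real Hive UDF: count ON-clause matches per target ROW__ID and
throw as soon as any count exceeds 1. The class name, the string row IDs, and
raiseErrorForMerge() are hypothetical stand-ins for illustration; no Hive API
is used here.

```java
import java.util.HashMap;
import java.util.Map;

final class MergeCardinalityCheck {

    // Hypothetical stand-in for the proposed RaiseErrorForMerge() UDF:
    // it never returns a value, it only fails the "query".
    static void raiseErrorForMerge(String targetRowId, int matches) {
        throw new IllegalStateException(
            "Target row " + targetRowId + " matched " + matches + " source rows");
    }

    // Each argument is the ROW__ID of the target row in one (target, source)
    // pair that satisfied the ON clause. Equivalent in spirit to:
    //   group by target.ROW__ID having count(*) > 1
    static void check(String... matchedTargetRowIds) {
        Map<String, Integer> counts = new HashMap<>();
        for (String rowId : matchedTargetRowIds) {
            counts.merge(rowId, 1, Integer::sum);
        }
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                // Any surviving group means target:source is 1:N.
                raiseErrorForMerge(e.getKey(), e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        check("r1", "r2", "r3");          // every target row matched once
        try {
            check("r1", "r2", "r1");      // r1 matched by two source rows
        } catch (IllegalStateException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```

As in the proposed query, nothing is produced on the success path; the only
observable effect is the error when some target row has more than one match.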
> Enforce that target:source is not 1:N
> -------------------------------------
>
> Key: HIVE-14949
> URL: https://issues.apache.org/jira/browse/HIVE-14949
> Project: Hive
> Issue Type: Sub-task
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
>
> If more than one row on the source side matches the same row on the target
> side, we are forced to update (or delete) the same row in the target more
> than once as part of the same SQL statement. This should raise an error per
> the SQL spec.
> There is no sure way to do this via static analysis of the query.
> Can we add something to ROJ operator to pay attention to ROW__ID of target
> side row and compare it with ROW__ID of target side of previous row output?
> If they are the same, that means > 1 source row matched.
> Or perhaps just mark each row in the hash table once it has matched, and if
> it matches again, throw an error.
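
A minimal sketch of the "mark each row in the hash table" idea, in plain Java
and not based on Hive's actual join operators. It assumes the target side is
the one loaded into the hash table and, to keep the example small, that each
join key identifies at most one target row. All names here
(MatchedFlagJoinSketch, TargetRow, joinWithCheck) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class MatchedFlagJoinSketch {

    // A target-side row held in the join hash table, plus a "matched" marker.
    static final class TargetRow {
        final int key;
        boolean matched;
        TargetRow(int key) { this.key = key; }
    }

    // Builds the hash table from the target keys, probes it once per source
    // key, and returns the number of matches. Throws as soon as a target row
    // is matched a second time.
    static int joinWithCheck(int[] targetKeys, int[] sourceKeys) {
        Map<Integer, TargetRow> table = new HashMap<>();
        for (int k : targetKeys) {
            table.put(k, new TargetRow(k));
        }
        int matches = 0;
        for (int k : sourceKeys) {
            TargetRow row = table.get(k);
            if (row == null) {
                continue;                 // source row with no target match
            }
            if (row.matched) {
                throw new IllegalStateException(
                    "Target row with key " + k + " matched more than one source row");
            }
            row.matched = true;           // remember the first match
            matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(joinWithCheck(new int[] {1, 2, 3}, new int[] {2, 3, 4}));
        try {
            joinWithCheck(new int[] {1, 2}, new int[] {2, 2});
        } catch (IllegalStateException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```

Unlike comparing against the ROW__ID of the previous output row, the flag does
not rely on duplicate matches arriving next to each other, at the cost of one
extra bit of state per row in the hash table.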
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)