[ 
https://issues.apache.org/jira/browse/HIVE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828820#comment-15828820
 ] 

Eugene Koifman commented on HIVE-14949:
---------------------------------------

Maybe a simpler strategy is to create a RaiseErrorForMerge() UDF:

insert into tmp_table select RaiseErrorForMerge() where <on clause expr> group 
by target.ROW__ID having count(*) > 1

So we never actually write anything to tmp_table, but if the select produces 
any rows at all, RaiseErrorForMerge() will throw and kill the query.  This 
avoids any need for a post hook to check the table.
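
As a rough illustration of the idea above, here is a small sketch that emulates it with Python's sqlite3 module rather than Hive. Everything in it is an assumption made for the demo: the raise_error_for_merge function stands in for the proposed UDF, SQLite's rowid stands in for ROW__ID, and the target/source tables and the join on id stand in for the real MERGE ON clause.

```python
import sqlite3


def raise_error_for_merge():
    # Stand-in for the proposed RaiseErrorForMerge() UDF: it only ever throws.
    raise RuntimeError("more than one source row matched the same target row")


def check_merge_cardinality(target_rows, source_rows):
    """Run the guard query; return the row count of tmp_table if it passes."""
    conn = sqlite3.connect(":memory:")
    try:
        # Registered without deterministic=True so it is evaluated per output row.
        conn.create_function("raise_error_for_merge", 0, raise_error_for_merge)
        conn.execute("CREATE TABLE target (id INTEGER, val TEXT)")
        conn.execute("CREATE TABLE source (id INTEGER, val TEXT)")
        conn.execute("CREATE TABLE tmp_table (x)")
        conn.executemany("INSERT INTO target VALUES (?, ?)", target_rows)
        conn.executemany("INSERT INTO source VALUES (?, ?)", source_rows)
        # The function is only evaluated for groups that survive HAVING,
        # i.e. target rows matched by more than one source row.
        conn.execute(
            "INSERT INTO tmp_table "
            "SELECT raise_error_for_merge() "
            "FROM target t JOIN source s ON t.id = s.id "
            "GROUP BY t.rowid HAVING count(*) > 1"
        )
        return conn.execute("SELECT count(*) FROM tmp_table").fetchone()[0]
    finally:
        conn.close()
```

With a 1:1 match the query selects no rows and tmp_table stays empty; when two source rows hit the same target row, sqlite3 surfaces the failure as sqlite3.OperationalError, which plays the role of the killed query.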

> Enforce that target:source is not 1:N
> -------------------------------------
>
>                 Key: HIVE-14949
>                 URL: https://issues.apache.org/jira/browse/HIVE-14949
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>
> If > 1 row on source side matches the same row on target side that means that 
>  we are forced to update (or delete) the same row in target more than once as 
> part of the same SQL statement.  This should raise an error per the SQL spec.
> There is no sure way to do this via static analysis of the query.
> Can we add something to ROJ operator to pay attention to ROW__ID of target 
> side row and compare it with ROW__ID of target side of previous row output?  
> If they are the same, that means > 1 source row matched.
> Or perhaps just mark each row in the hash table once it has matched.  And if 
> it matches again, throw an error.
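
The "mark each row in the hash table" idea from the description can be sketched in a few lines of Python. This is not Hive's join operator: the CardinalityViolation exception, the join_with_cardinality_check function, and the choice to put the target rows on the hash-table side are all assumptions made for illustration.

```python
class CardinalityViolation(Exception):
    """Raised when a second source row matches an already-matched target row."""


def join_with_cardinality_check(target_rows, source_rows, key):
    # Build a hash table over the target side; each entry carries a
    # "matched" flag next to the row (the flag is the marking step).
    table = {}
    for row in target_rows:
        table.setdefault(key(row), []).append([row, False])

    joined = []
    for src in source_rows:
        for entry in table.get(key(src), []):
            if entry[1]:
                # This target row was already matched by an earlier source row.
                raise CardinalityViolation(
                    "target row %r matched more than once" % (entry[0],)
                )
            entry[1] = True
            joined.append((entry[0], src))
    return joined
```

For example, joining target rows [(1, "a")] with source rows [(1, "x"), (1, "y")] on the first field raises CardinalityViolation at the second source row, while a 1:1 input returns the joined pairs.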



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
