[ https://issues.apache.org/jira/browse/PHOENIX-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16356292#comment-16356292 ]
James Taylor commented on PHOENIX-4586: --------------------------------------- +1 on the fix. Thanks for the quick turnaround, [~maryannxue]. > UPSERT SELECT doesn't take in account comparison operators for subqueries. > -------------------------------------------------------------------------- > > Key: PHOENIX-4586 > URL: https://issues.apache.org/jira/browse/PHOENIX-4586 > Project: Phoenix > Issue Type: Bug > Affects Versions: 4.14.0 > Reporter: Sergey Soldatov > Assignee: Maryann Xue > Priority: Critical > Fix For: 4.14.0 > > Attachments: PHOENIX-4586.patch > > > If upsert select has a where condition that is using any comparison operator > (including ANY/SOME/etc), the whole WHERE clause just ignored. Table: > {noformat} > create table T (id integer primary key, i1 integer); > upsert into T values (1,1); > upsert into T values (2,2); > {noformat} > Query that should not upsert anything because we have a condition in where > that I1 should be greater than any value we already have as well as not > existing ID: > {noformat} > 0: jdbc:phoenix:> upsert into T select id, 4 from T where id = 3 AND i1 > > (select i1 from T); > 2 rows affected (0.02 seconds) > 0: jdbc:phoenix:> select * from T; > +-----+-----+ > | ID | I1 | > +-----+-----+ > | 1 | 4 | > | 2 | 4 | > +-----+-----+ > 2 rows selected (0.014 seconds) > {noformat} > Now with ANY. Should not upsert anything as well because ID is [1,2], while > I1 are all '4': > {noformat} > 0: jdbc:phoenix:> upsert into T select id, 5 from T where id = 2 AND i1 = ANY > (select ID from T); > 2 rows affected (0.016 seconds) > 0: jdbc:phoenix:> select * from T; > +-----+-----+ > | ID | I1 | > +-----+-----+ > | 1 | 5 | > | 2 | 5 | > +-----+-----+ > 2 rows selected (0.013 seconds) > {noformat} > A similar query with IN works just fine: > {noformat} > 0: jdbc:phoenix:> upsert into T select id, 6 from T where id = 2 AND i1 IN > (select ID from T); > No rows affected (0.094 seconds) > 0: jdbc:phoenix:> select * from T; > +-----+-----+ > | ID | I1 | > +-----+-----+ > | 1 | 5 | > | 2 | 5 | > +-----+-----+ > 2 rows selected (0.014 seconds) > {noformat} > The reason for this behavior is that for IN we convert subselect to semi-join > and execute upsert on the client side. For comparisons, we don't perform any > transformations and query is considered flat and finally executed on the > server side. Not sure why, but we also completely ignore the second > condition in WHERE clause as well and that may lead to a serious data loss. > [~jamestaylor], [~maryannxue] any thoughts or suggestions how to fix that are > really appreciated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)