Speaking of cans of worms: if we start allowing nullable columns in the primary key, we will probably find that a number of things break or do not work correctly, since we make a lot of assumptions about those primary key columns, as Qifan pointed out.
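To illustrate one of those assumptions concretely: a primary key is typically enforced through a unique index, and unique indexes treat NULLs as distinct from one another. A minimal sketch using Python's sqlite3 module as a stand-in (SQLite, not Trafodion — SQLite happens to tolerate NULLs in PRIMARY KEY columns, which makes the effect easy to observe):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# SQLite (a stand-in here, not Trafodion) permits NULLs
# in PRIMARY KEY columns of ordinary rowid tables.
cur.execute("CREATE TABLE t(a INT, b INT, PRIMARY KEY(a, b))")

# Because NULL compares as distinct from NULL in a unique index,
# two rows with the "same" key (NULL, 1) both go in:
cur.execute("INSERT INTO t VALUES(NULL, 1)")
cur.execute("INSERT INTO t VALUES(NULL, 1)")
print(cur.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # 2
```

In other words, once a key column is nullable, "primary key uniqueness" no longer guarantees one row per key value.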
From: Dave Birdsall [mailto:[email protected]]
Sent: Wednesday, February 3, 2016 11:16 AM
To: [email protected]
Subject: RE: question on nullable column in primary key and impact on upsert

Hi Eric,

Seems like this business of adding a nullable column to the primary key for partitioning purposes creates a lot of headaches. By doing so, it changes the definition of the primary key for the table. But then it appears that in the UPSERT scenario you really wanted the old primary key, not the new one. I’m wondering if this is just the first issue exposed on opening a can of worms.

Another approach might be to broaden STORE BY to allow it to take non-primary-key columns. This would come at a price in performance, though: unique key operations would now have to check all partitions, unless the non-primary-key value was supplied (e.g. through a WHERE clause or a VALUES clause). But you are going to run into this limitation anyway in your schema design: when you want the old primary key semantics, you’ll face similar performance issues.

Dave

From: Eric Owhadi [mailto:[email protected]]
Sent: Wednesday, February 3, 2016 10:43 AM
To: [email protected]
Subject: question on nullable column in primary key and impact on upsert

The following question is about checking whether adding nullable columns to the primary key for the sake of data partitioning can be done without too much impact on the initial PK uniqueness semantics, especially for upsert statements.

Suppose I have a composite primary key with columns A and B.
A is nullable; B is non-nullable and unique. I was wondering whether the upsert statement would correctly guess the intent. Given (NULL, 1) in the table, if I upsert (1, 1), would it treat it as an update and not an insert, given the unique constraint on column B? Without the unique constraint on column B, I would expect an insert. This matters for keeping the PK semantics while artificially adding a column to the composite key to force data partitioning.

The same question if I now have a PK like: A nullable, B non-nullable, C non-nullable, with a UNIQUE constraint on (B, C). Will upsert honor the PK semantics as if the PK were (B, C) only?

OK, so instead of asking, I was thinking, let’s try. And here you go:

>>create table t(a int, b int not null not droppable unique, primary key(a,b));

--- SQL operation complete.

>>insert into t values(null,1);

--- 1 row(s) inserted.

>>select * from t;

A           B
----------- -----------
?           1

--- 1 row(s) selected.

>>upsert into t values(null,2);

*** ERROR[4099] A NULL operand is not allowed in predicate (TRAFODION.TPCDSGOOD.T.A = NULL).
*** ERROR[8822] The statement was not prepared.

>>upsert into t values(2,2);

--- 1 row(s) inserted.

>>upsert into t values(1,1);

--- 1 row(s) inserted.

>>select * from t;

A           B
----------- -----------
?           1

--- 1 row(s) selected.

>>insert into t values (2,2);

*** ERROR[8102] The operation is prevented by a unique constraint.

--- 0 row(s) inserted.

>>select * from t;

A           B
----------- -----------
?           1

--- 1 row(s) selected.

>>select * from t where b=2;

--- 0 row(s) selected.

>>insert into t values (3,3);

--- 1 row(s) inserted.

>>select * from t;

A           B
----------- -----------
?           1
3           3

--- 2 row(s) selected.

So, two questions after this test:

- Shouldn’t upsert use a special null-safe semantic instead of failing on predicate evaluation?
- It looks like the upsert of (2,2) succeeded, but the next select * does not show it; however, trying to insert (2,2) then fails due to the unique key constraint, so it looks like the upsert worked only half way. Why would the second upsert say there was an insert, when the next select * statement shows that neither an insert nor an update was performed?

I guess I should file a JIRA… but what is the expected behavior?

Eric
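On the first question: ERROR[4099] follows directly from SQL's three-valued logic — `A = NULL` is never true, so a key-match predicate built from the nullable key column can never find the existing row; a null-safe comparison (`IS NOT DISTINCT FROM` in the standard, `IS` in SQLite) does not have this problem. And an upsert keyed on the unique column behaves the way the question hoped. A sketch of both points using Python's sqlite3 module as a stand-in (SQLite's `ON CONFLICT` syntax, not Trafodion's UPSERT; the table and column names just mirror the transcript above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Three-valued logic: NULL = NULL is unknown (None in Python), not true.
# This is why a generated predicate like A = NULL can never match a row.
print(cur.execute("SELECT NULL = NULL").fetchone()[0])   # None
# A null-safe comparison does match:
print(cur.execute("SELECT NULL IS NULL").fetchone()[0])  # 1

# Upsert keyed on the unique non-PK column, per the question's intent:
cur.execute("CREATE TABLE t(a INT, b INT NOT NULL UNIQUE, PRIMARY KEY(a, b))")
cur.execute("INSERT INTO t VALUES(NULL, 1)")
# The conflict on unique b turns the insert into an update of that row.
cur.execute("INSERT INTO t VALUES(1, 1) "
            "ON CONFLICT(b) DO UPDATE SET a = excluded.a")
print(cur.execute("SELECT a, b FROM t").fetchall())      # [(1, 1)]
```

So with an explicit conflict target on B, upserting (1, 1) over (NULL, 1) yields one row (1, 1) — an update, not a second insert — which is the "PK semantics as if the PK were B" behavior the question asks about.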
