Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Peter Geoghegan
On Thu, Aug 25, 2016 at 12:59 PM, Peter Geoghegan  wrote:
> Maybe we should change the ordering of those IndexInfo structs to
> something more suitable, but it must be immutable (it cannot hinge
> upon the details of one particular DML statement).

I meant that it must be stable (not immutable), in the specific sense
that it cannot change without an ALTER TABLE or similar. The order of
insertion should be consistent among all backends that might be
performing DML against the table at any given time, since, in general,
to do otherwise risks deadlocking (perhaps only with hash indexes, or
a much earlier version of the nbtree am that still used heavyweight
locks).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Peter Geoghegan
On Thu, Aug 25, 2016 at 12:16 PM, Tom Lane  wrote:
> I'm not following.  The only insertions happening in this test case
> are coming from various clients doing the same INSERT ON CONFLICT UPDATE.
> If one of them has successfully inserted "42" into the arbiter index,
> how is it that other ones are not seeing that?

It's complicated. One backend may completely overtake another in this
race. We don't give up early within ExecInsertIndexTuples(), because
there isn't any useful distinction made between the speculative
insertion case, and the deferred constraint case. The ordering, as
such, is therefore irrelevant.

Fortunately, I posted a fix, of sorts, more than a year ago:

https://commitfest.postgresql.org/10/403/

It never occurred to me to make this argument for it.

There is a separate question of how to make the ordering avoid
problems if it did matter (if that patch ever got committed). I think
it would fix this exact test case, but only accidentally, because the
executor gets a list of IndexInfo structs from the relcache in a
consistent though fairly insignificant order (ordered by OID). I don't
think you can change that property within the relcache, because the
ordering must be consistent (to avoid deadlocks, perhaps elsewhere).

Maybe we should change the ordering of those IndexInfo structs to
something more suitable, but it must be immutable (it cannot hinge
upon the details of one particular DML statement). I actually also
wrote a patch to prefer insertion into the primary key first, which
also went nowhere (I gave up on that one, to be fair).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Konstantin Knizhnik

On 08/25/2016 10:08 PM, Peter Geoghegan wrote:

On Thu, Aug 25, 2016 at 11:49 AM, Tom Lane  wrote:

I think the point is that given the way he's set up the test case,
there should be no duplicate violation in the plain unique index
unless there is one in the arbiter index.  So assuming that INSERT
tests the arbiter indexes first, there shouldn't be an error.
Maybe it doesn't do that, but it seems like it would be a good idea
if it did.

Oh, yeah. This is arguably an example of inference failing to infer
multiple unique indexes as arbiters. Inference could, in principle,
recognize that the second unique index is equivalent to the first, but
doesn't. (I don't think that it matters which order anything is tested
in, though, because not finding a dup value in the arbiter index does
not guarantee that there won't be one in the other index. There is no
locking when no conflict is initially found, and so no guarantees
here.)

Anyway, I don't have a lot of sympathy for this point of view, because
the scenario is completely contrived. You have to draw the line
somewhere.


I do not think that this scenario is completely contrived: the cases when a 
table has more than one primary key are quite common.
For example, "user" may have unique e-mail address, phone number and login.
Also, as far as I know, this is not an artificial example, but real case taken 
from customers application...

I am not sure weather it's really bug or feature, but the user's intention was 
obvious: locate record by one of the unique keys and if such record already 
exists,
then increment counter (do update instead of insert).  But there are also good 
arguments why upsert  may report conflict in this case...

If such UPSERT behavior is assumed to be correct, what is the best workaround 
for the problem if we really need to have to separate indexes and want to 
enforce unique constraint for both keys?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Tom Lane
Peter Geoghegan  writes:
> (I don't think that it matters which order anything is tested
> in, though, because not finding a dup value in the arbiter index does
> not guarantee that there won't be one in the other index. There is no
> locking when no conflict is initially found, and so no guarantees
> here.)

I'm not following.  The only insertions happening in this test case
are coming from various clients doing the same INSERT ON CONFLICT UPDATE.
If one of them has successfully inserted "42" into the arbiter index,
how is it that other ones are not seeing that?

> Anyway, I don't have a lot of sympathy for this point of view, because
> the scenario is completely contrived.

Well, I agree this particular test case looks contrived, but it might be
a simplification of a more convincing real-world case.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Peter Geoghegan
On Thu, Aug 25, 2016 at 11:49 AM, Tom Lane  wrote:
> I think the point is that given the way he's set up the test case,
> there should be no duplicate violation in the plain unique index
> unless there is one in the arbiter index.  So assuming that INSERT
> tests the arbiter indexes first, there shouldn't be an error.
> Maybe it doesn't do that, but it seems like it would be a good idea
> if it did.

Oh, yeah. This is arguably an example of inference failing to infer
multiple unique indexes as arbiters. Inference could, in principle,
recognize that the second unique index is equivalent to the first, but
doesn't. (I don't think that it matters which order anything is tested
in, though, because not finding a dup value in the arbiter index does
not guarantee that there won't be one in the other index. There is no
locking when no conflict is initially found, and so no guarantees
here.)

Anyway, I don't have a lot of sympathy for this point of view, because
the scenario is completely contrived. You have to draw the line
somewhere.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Tom Lane
Peter Geoghegan  writes:
> On Thu, Aug 25, 2016 at 7:12 AM, Ivan Frolkov  wrote:
>> So, if we have primary key and unique constraint on a table then upsert will
>> not work as would expected.

> Why is this unexpected?

> You only take the alternative path (UPDATE) in the event of a would-be
> duplicate violation. You can't upsert while using more than one index
> as an arbiter index. This is true unless they're more or less
> equivalent, in which case multiple arbiter indexes can be inferred,
> but that clearly doesn't apply here.

I think the point is that given the way he's set up the test case,
there should be no duplicate violation in the plain unique index
unless there is one in the arbiter index.  So assuming that INSERT
tests the arbiter indexes first, there shouldn't be an error.
Maybe it doesn't do that, but it seems like it would be a good idea
if it did.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UPSERT strange behavior

2016-08-25 Thread Peter Geoghegan
On Thu, Aug 25, 2016 at 7:12 AM, Ivan Frolkov  wrote:
> So, if we have primary key and unique constraint on a table then upsert will
> not work as would expected.

Why is this unexpected?

You only take the alternative path (UPDATE) in the event of a would-be
duplicate violation. You can't upsert while using more than one index
as an arbiter index. This is true unless they're more or less
equivalent, in which case multiple arbiter indexes can be inferred,
but that clearly doesn't apply here.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] UPSERT strange behavior

2016-08-25 Thread Ivan Frolkov

Suppose we have some table

create table cnt( 
 usr_id int primary key, 
 usr_doc_ref text not null, 
 cnt int, 
 sum int 
);

And going to run some insert on conflict update on it (pgbench script):

\setrandom id 1 50 
insert into cnt as c(usr_id,usr_doc_ref, cnt) values(:id, '#'||:id, 1) on 
conflict(usr_id) do update set cnt=c.cnt+1; 

Run it:

 pgbench -c 16 -j 2 -t 5 -n -h localhost -p 5432 -U postgres -f 
upsert2-ok.pgb  work 
transaction type: Custom query 
scaling factor: 1 
query mode: simple 
number of clients: 16 
number of threads: 2 
number of transactions per client: 5 
number of transactions actually processed: 80/80 
latency average: 0.000 ms 
tps = 36475.782816 (including connections establishing) 
tps = 36483.759765 (excluding connections establishing) 

All ok.
Then add a unique constraint to the table.

alter table cnt add constraint usr_doc_ref_uq unique(usr_doc_ref) 

Run pgbench again:

pgbench -c 16 -j 2 -t 5 -n -h localhost -p 5432 -U postgres -f 
upsert2-ok.pgb work
client 2 aborted in state 2: ERROR: duplicate key value violates unique 
constraint "usr_doc_ref_uq"
DETAIL: Key (usr_doc_ref)=(#39) already exists.
client 6 aborted in state 2: ERROR: duplicate key value violates unique 
constraint "usr_doc_ref_uq"
DETAIL: Key (usr_doc_ref)=(#16) already exists.
client 9 aborted in state 2: ERROR: duplicate key value violates unique 
constraint "usr_doc_ref_uq"
DETAIL: Key (usr_doc_ref)=(#28) already exists.

So, if we have primary key and unique constraint on a table then upsert will 
not work as would expected.