On Tue, Mar 17, 2015 at 12:11 PM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> I'm still not sure the way the speculative locking works is the best
> approach. Instead of clearing xmin on super-deletion, a new flag on the heap
> tuple seems more straightforward. And we could put the speculative insertion
> token in t_ctid, instead of stashing it in the PGPROC array. That would
> again seem more straightforward.

I see the appeal of that approach. What concerns me about that
approach is that it makes it the problem of the inserter to confirm
its insertion, even though overwhelmingly, speculative insertions
succeeds (the race window is small due to the pre-check). The current
speculative token is legitimately a very short lived thing, a thing
that I hesitate to write to disk at all. So our optimistic approach to
inserting speculatively loses its optimism, which I don't like, if you
know what I mean.

OTOH, apart from avoiding stashing things in PGPROC, I do see another
advantage to what you outline. Doing things this way (and making the
insertion and confirmation of a speculative insertion a two-phased
thing) unambiguously fixes all problems with logical decoding, without
adding unexpected cross-change-coordination tricks during transaction
reassembly, and really without much further thought. All we do is get
a new case to decode, that ultimately produces a
REORDER_BUFFER_CHANGE_INSERT change, which Andres more or less wanted
to do anyway.

Under this scheme with using t_ctid, heap_insert() would do minimal
WAL logging, necessary for the same reasons that some WAL logging is
required within heap_lock_tuple() (so the new record type is very
similar to the existing, simple xl_heap_lock record type). We'd use
one of the two remaining bits within t_infomask2 for this, so that
waiters will know to interpret t_ctid as a speculative insertion token
(and then wait on it). Then, after speculative insertion succeeded,
we'd WAL log something closer to an UPDATE to confirm that insertion
was in fact successful, which is where the contents of the new tuple
is actually finally WAL-logged (like UPDATEs used to be, but without a
new tuple being WAL-logged at all, since there of course is none).

Is that more or less what you have in mind?

> A couple of quick random comments:
>> /*
>>  * plan_speculative_use_index
>>  *              Use the planner to decide speculative insertion arbiter
>> index
>>  *
>>  * Among indexes on target of INSERT ... ON CONFLICT, decide which index
>> to
>>  * use to arbitrate taking alternative path.  This should be called
>>  * infrequently in practice, because it's unusual for more than one index
>> to
>>  * be available that can satisfy a user-specified unique index inference
>>  * specification.
>>  *
>>  * Note: caller had better already hold some type of lock on the table.
>>  */
>> Oid
>> plan_speculative_use_index(PlannerInfo *root, List *indexList)
>> {
>>         ...
>>         /* Locate cheapest IndexOptInfo for the target index */
> If I'm reading this correctly, if there are multiple indexes that match the
> unique index inference specification, we pick the cheapest one. Isn't that
> unstable? Two backends running the same INSERT ON CONFLICT statement might
> pick different indexes, and the decision might change over time as the table
> is analyzed. I think we should have a more robust rule. Could we easily just
> use all matching indexes?

Robert feel pretty strongly that there should be a costing aspect to
this. Initially I was skeptical of this, but just did what he said all
the same. But I've since come around to his view entirely because we
now support partial unique indexes.

You can add a predicate to a unique index inference specification, and
you're good, even if there is no partial index on those
attributes/expressions exactly matching the DML/inference predicate -
we use predicate_implied_by() for this, which works with an equivalent
non-partial unique index. This is because a non-partial unique index
covers those attributes sufficiently. There may be value in preferring
the cheap partial index, and then verifying that it actually did cover
the final inserted tuple version (which is how things work now). If we
cannot discriminate against one particular unique index, making sure
it covers the inserted heap tuple (by throwing a user-visible
ERRCODE_TRIGGERED_ACTION_EXCEPTION error if it doesn't, as I currently
do within ExecInsertIndexTuples()) is on shaky ground as a
user-visible behavior. I like that behavior, though - seems less
surprising than letting a failure to actually cover a selective
partial unique index predicate (that of the one and only arbiter
index) slide. You can only get this ERRCODE_TRIGGERED_ACTION_EXCEPTION
error when you actually had a predicate in your unique index inference
specification in the first place, so seems likely that you want the
error. But, I also like supporting unique indexes that are non-partial
even in the presence of a predicate in the unique index inference
specification clause, because, as I said, that's still correct - I can
see not doing this breaking queries when a partial unique index is
dropped due to a more general uniqueness requirement.

Besides, having a list of arbiter unique indexes that must all be
equivalent for the user's purpose anyway is very awkward indeed. It
seems very confusing. If they all have equivalent definitions anyway,
as they must, then it must not matter one bit which one you take
(except perhaps on cost grounds). It might be unstable, but it
couldn't possibly matter, so why bother with anything else? If it
*does* matter on semantics grounds, then we have bigger problems.

There is a tension between indexes as an implementation detail, and
indexes as the thing that define the semantics for the purposes of
this feature here (in a way that isn't baked into the DML statement,
but is only known from a cataloged definition of an index or its
physical structure). I have not fully resolved that tension here, but
I think I have a good practical balance between the two extremes. Does
that make sense?

>> ... Deferred unique constraints are not supported as
>> +   arbiters of whether an alternative <literal>ON CONFLICT</> path
>> +   should be taken.
> We really need to find a shorter term for "arbiter of whether an alternative
> path should be taken". Different variations of that term are used a lot, and
> it's tedious to read.

Okay. I'll think about that.

> I've been thinking that it would be nice to be able to specify a constraint
> name. Naming an index directly feels wrong, as in relational and SQL
> philosophy, indexes are just an implementation detail, but naming a
> constraint is a fair game. It would also be nice to be able to specify "use
> the primary key".

Now that we have partial unique indexes covered, which never have a
constraint (I think - see my remarks above), this is probably okay.
But that doesn't really solve the highly theoretical (non-)issue with
"equals" operator ambiguity, which I called out as an open issue
recently [1]. That's still at thing we need to decide on, even if that
means that you agree with me that it doesn't matter (and since this
naming of constraints in DML is an escape hatch from that problem too,
that seems more likely than before). What do you think?

> Attached patch contains a few more things I saw at a quick read.

This is mostly documentation stuff. I can certainly address this
feedback quickly and easily, and will do so soon.

Peter Geoghegan

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to