On Mon, Dec 30, 2013 at 7:22 AM, Heikki Linnakangas
<hlinnakan...@vmware.com> wrote:
> Ah, I didn't remember that thread. Yeah, apparently I proposed the exact
> same design back then. Simon complained about the dead tuples being left
> behind, but I don't think that's a big issue with the design we've been
> playing around now; you only end up with dead tuples when two backends try
> to insert the same value concurrently, which shouldn't happen very often.

Right, because you check first, which also has a cost, paid in CPU
cycles and memory bandwidth, and buffer lock contention. As opposed to
a cost almost entirely localized to inserters into a single leaf page
per unique index for only an instant. You're checking *all* unique
indexes.

You call check_exclusion_or_unique_constraint() once per unique index
(or EC), and specify to wait on the xact, at least until a conflict is
found. So if you're waiting on an xact, your conclusion that earlier
unique indexes had no conflicts could soon become totally obsolete. So
for non-idiomatic usage, say like the usage Andres in particular cares
about for MM conflict resolution, I worry about the implications of
that. I'm not asserting that it's a problem, but it does seem like
something that's quite hard to reason about. Maybe Andres can comment.

>> In short, I think that my approach may be better because it doesn't
>> conflate row locking with value locking (therefore making it easier
>> to reason about, maintaining modularity),
>
> You keep saying that, but I don't understand what you mean. With your
> approach, an already-inserted heap tuple acts like a value lock, just like
> in my approach. You have the page-level locks on b-tree pages in addition to
> that, but the heap-tuple based mechanism is there too.

Yes, but that historic behavior isn't really value locking at all.
That's very much like row locking, because there is a row, not the
uncertain intent to try to insert a row. Provided your transaction
commits and the client's transaction doesn't delete the row, the row
is definitely there. For upsert, conflicts may well be the norm, not
the exception.

Value locking is the exclusive lock on the buffer held during
_bt_check_unique(). I'm trying to safely extend that mechanism, to
reach consensus among unique indexes, which to me seems the most
logical and conservative approach. For idiomatic usage, it's only
sensible for there to be a conflict on one unique index, known ahead
of time. If you don't know where the conflict will be, then typically
your DML statement is unpredictable, just like the MySQL feature.
Otherwise, for MM conflict resolution, I think it makes sense to pick
those conflicts off, one at a time, dealing with exactly one row per
conflict. I mean, even with your approach, you're still not dealing
with later conflicts in later unique indexes, right? The fact that you
prevent conflicts on previously non-conflicting unique indexes only,
and, I suppose, not later ones too, seems quite arbitrary.

> Well, killing an old promise tuple is not cheap, but it shouldn't happen
> often. In both approaches, what probably matters more is the overhead of the
> extra heavy-weight locking. But this is all speculation, until we see some
> benchmarks.

Fair enough. We'll know more when we have fixed the exclusion
constraint supporting patch, which will allow us to make a fair
comparison. I'm working on it. Although I'd suggest that having dead
duplicates in indexes where that's avoidable is a cost that isn't
necessarily that easily characterized. I especially don't like that
you're currently doing the UNIQUE_CHECK_PARTIAL deferred unique
constraint thing of always inserting, continuing on for *all* unique
indexes regardless of finding a duplicate. Whatever overhead my
approach may imply around lock contention, clearly the cost to index
scans is zero.

The other thing is that if you're holding old "value locks" (i.e.
previously inserted btree tuples, from earlier unique indexes) pending
resolving a value conflict, you're holding those value locks
indefinitely pending the completion of the other guy's xact, just in
case there ends up being no conflict, which in general is unlikely. So
in extreme cases, that could be the difference between waiting all day
(on someone sitting on a lock that they very probably have no use
for), and not waiting at all.

> _bt_check_unique() is a modularity violation, agreed. Beauty is in the eye
> of the beholder, I guess, but I don't see either patch making that any
> better or worse.

Clearly the way in which you envisage releasing locks to prevent
unprincipled deadlocking implies that btree has to know more about the
heap, and maybe even that the heap has to know something about btree,
or at least about amcanunique AMs (including possible future
amcanunique AMs that may or may not be well suited to implementing
this the same way).

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to