On Fri, Aug 5, 2016 at 2:26 PM, Pavan Deolasee <pavan.deola...@gmail.com> wrote:
> On Fri, Aug 5, 2016 at 8:55 PM, Claudio Freire <klaussfre...@gmail.com>
> wrote:
>> On Fri, Aug 5, 2016 at 1:27 AM, Pavan Deolasee <pavan.deola...@gmail.com>
>> wrote:
>> >
>> >
>> > I don't see why it is hard.  Trying to find the index entry for the old
>> > update row seems odd, and updating an index row seems odd, but skipping
>> > an index addition for the new row seems simple.  Is the problem
>> > concurrent activity?  I assumed already had that ability to add to HOT
>> > chains because we have to lock the update chain.
>> Think of an index over a 1TB table whose keys have only 2 values: true
>> and false.
>> Sure, it's a horrible index. But I've seen things like that, and I've
>> seen cases when they're useful too.
>> So, conceptually, for each key you have circa N/2 tids on the index.
>> nbtree finds the leftmost valid insert point comparing keys, it
>> doesn't care about tids, so to find the index entries that point to
>> the page where the new tuple is, you'd have to scan the N/2 tids in
>> the index, an extremely expensive operation.
> Well, it's always going to be extremely hard to solve for all use cases.
> This is one such extreme case and we should just give up and do cold update.
> I think we can look at the index type (unique vs non-unique) along with
> table statistics to find what fraction of column values are distinct and
> then estimate whether its worthwhile to look for duplicate (key, CTID) or
> just do a cold update. In addition put some cap of how hard we try once we
> decide to check for duplicates and give up after we cross that threshold.

I don't think bailing out in this case is necessary, and it could be
more common than you'd think. It doesn't need to be this extreme to
cause the issue, it only needs equal-key runs that span more than an
index page, and that could be far more common than the extreme example
I gave.

But doing the WARM chain backtracking and checking for previous
versions with appropriate keys should be enough to support this use
case too, it just needs to be well optimized to avoid seriously
impacting performance on the average case.

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to