On Fri, Aug 5, 2016 at 8:55 PM, Claudio Freire <klaussfre...@gmail.com>
wrote:
> On Fri, Aug 5, 2016 at 1:27 AM, Pavan Deolasee <pavan.deola...@gmail.com>
> wrote:
> >
> >
> > I don't see why it is hard.  Trying to find the index entry for the old
> > update row seems odd, and updating an index row seems odd, but skipping
> > an index addition for the new row seems simple.  Is the problem
> > concurrent activity?  I assumed already had that ability to add to HOT
> > chains because we have to lock the update chain.
> Think of an index over a 1TB table whose keys have only 2 values: true
> and false.
> Sure, it's a horrible index. But I've seen things like that, and I've
> seen cases when they're useful too.
> So, conceptually, for each key you have circa N/2 tids on the index.
> nbtree finds the leftmost valid insert point comparing keys, it
> doesn't care about tids, so to find the index entries that point to
> the page where the new tuple is, you'd have to scan the N/2 tids in
> the index, an extremely expensive operation.
Well, it's always going to be extremely hard to solve for all use cases.
This is one such extreme case, and we should just give up and do a cold
update.

I think we can look at the index type (unique vs non-unique) along with
table statistics to find what fraction of column values are distinct, and
then estimate whether it's worthwhile to look for a duplicate (key, CTID)
or to just do a cold update. In addition, we could put a cap on how hard we
try once we decide to check for duplicates, and give up after we cross that
threshold.
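As a rough sketch of that heuristic (names and the threshold are purely
illustrative, not actual PostgreSQL internals): estimate the expected number
of TIDs per key from the table's row count and distinct-value statistics,
and only attempt the duplicate check when that fan-out is small.

```python
# Hypothetical decision helper for the heuristic sketched above.
# reltuples and n_distinct stand in for the planner statistics mentioned
# in the text; max_tids_to_scan is the proposed cap on how hard we try.
def should_check_duplicates(reltuples, n_distinct, is_unique,
                            max_tids_to_scan=100):
    """Return True if scanning for a duplicate (key, CTID) looks cheap."""
    if is_unique:
        # A unique index has at most one live entry per key, so the
        # duplicate check is always cheap.
        return True
    if n_distinct <= 0:
        # No usable statistics: fall back to a cold update.
        return False
    # Expected number of index TIDs sharing one key value.
    expected_tids_per_key = reltuples / n_distinct
    return expected_tids_per_key <= max_tids_to_scan
```

For the boolean index over a 1TB table described upthread
(`should_check_duplicates(1_000_000_000, 2, False)`), the expected fan-out
is ~500M TIDs per key, so this would choose the cold update, while a
high-cardinality non-unique index would still qualify for the cheap check.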


 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
