On Wed, Aug 31, 2016 at 10:38 PM, Claudio Freire <klaussfre...@gmail.com>
> On Wed, Aug 31, 2016 at 1:45 PM, Pavan Deolasee <pavan.deola...@gmail.com>
>> We discussed a few ideas to address the "Duplicate Scan" problem. For
>> example, we can teach Index AMs to discard any duplicate (key, CTID) insert
>> requests. Or we could guarantee uniqueness by either only allowing updates
>> in one lexical order. While the former is a more complete solution to avoid
>> duplicate entries, searching through large number of keys for non-unique
>> indexes could be a drag on performance. The latter approach may not be
>> sufficient for many workloads. Also tracking increment/decrement for many
>> indexes will be non-trivial.
>> There is another problem with allowing many index entries pointing to the
>> same WARM chain. It will be non-trivial to know how many index entries are
>> currently pointing to the WARM chain and index/heap vacuum will throw up
>> more challenges.
>> Instead, what I would like to propose and the patch currently implements
>> is to restrict WARM update to once per chain. So the first non-HOT update
>> to a tuple or a HOT chain can be a WARM update. The chain can further be
>> HOT updated any number of times. But it can no further be WARM updated.
>> This might look too restrictive, but it can still bring down the number of
>> regular updates by almost 50%. Further, if we devise a strategy to convert
>> a WARM chain back to HOT chain, it can again be WARM updated. (This part is
>> currently not implemented). A good side effect of this simple strategy is
>> that we know there can maximum two index entries pointing to any given WARM
> We should probably think about coordinating with my btree patch.
> From the description above, the strategy is quite readily "upgradable" to
> one in which the indexam discards duplicate (key,ctid) pairs and that would
> remove the limitation of only one WARM update... right?
Yes, we should be able to add further optimisations on lines you're working
on, but what I like about the current approach is that a) it reduces
complexity of the patch and b) having thought about cleaning up WARM
chains, limiting number of index entries per root chain to a small number
will simplify that aspect too.
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services