On 09/05/2016 06:53 AM, Pavan Deolasee wrote:

On Thu, Sep 1, 2016 at 9:44 PM, Bruce Momjian <br...@momjian.us
<mailto:br...@momjian.us>> wrote:

    On Thu, Sep  1, 2016 at 02:37:40PM +0530, Pavan Deolasee wrote:
    >     I like the simplified approach, as long as it doesn't block further
    >     improvements.
    > Yes, the proposed approach is simple yet does not stop us from improving 
    > further. Moreover it has shown good performance characteristics and I 
    > it's a good first step.

    Agreed.  This is BIG.  Do you think it can be done for PG 10?

I definitely think so. The patches as submitted are fully functional and
sufficient. Of course, there are XXX and TODOs that I hope to sort out
during the review process. There are also further tests needed to ensure
that the feature does not cause significant regression in the worst
cases. Again something I'm willing to do once I get some feedback on the
broader design and test cases. What I am looking at this stage is to
know if I've missed something important in terms of design or if there
is some show stopper that I overlooked.

Latest patches rebased with current master are attached. I also added a
few more comments to the code. I forgot to give a brief about the
patches, so including that as well.

0001_track_root_lp_v4.patch: This patch uses a free t_infomask2 bit to
track latest tuple in an update chain. The t_ctid.ip_posid is used to
track the root line pointer of the update chain. We do this only in the
latest tuple in the chain because most often that tuple will be updated
and we need to quickly find the root only during update.

0002_warm_updates_v4.patch: This patch implements the core of WARM
logic. During WARM update, we only insert new entries in the indexes
whose key has changed. But instead of indexing the real TID of the new
tuple, we index the root line pointer and then use additional recheck
logic to ensure only correct tuples are returned from such potentially
broken HOT chains. Each index AM must implement a amrecheck method to
support WARM. The patch currently implements this for hash and btree


I've been looking at the patch over the past few days, running a bunch of benchmarks etc. I can confirm the significant speedup, often by more than 75% (depending on number of indexes, whether the data set fits into RAM, etc.). Similarly for the amount of WAL generated, although that's a bit more difficult to evaluate due to full_page_writes.

I'm not going to send detailed results, as that probably does not make much sense at this stage of the development - I can repeat the tests once the open questions get resolved.

There's a lot of useful and important feedback in the thread(s) so far, particularly the descriptions of various failure cases. I think it'd be very useful to collect those examples and turn them into regression tests - that's something the patch should include anyway.

I don't really have much comments regarding the code, but during the testing I noticed a bit strange behavior when updating statistics. Consider a table like this:

    create table t (a int, b int, c int) with (fillfactor = 10);
    insert into t select i, i from generate_series(1,1000) s(i);
    create index on t(a);
    create index on t(b);

and update:

    update t set a = a+1, b=b+1;

which has to update all indexes on the table, but:

    select n_tup_upd, n_tup_hot_upd from pg_stat_user_tables

     n_tup_upd | n_tup_hot_upd
          1000 |          1000

So it's still counted as "WARM" - does it make sense? I mean, we're creating a WARM chain on the page, yet we have to add pointers into all indexes (so not really saving anything). Doesn't this waste the one WARM update per HOT chain without actually getting anything in return?

The way this is piggy-backed on the current HOT statistics seems a bit strange for another reason, although WARM is a relaxed version of HOT. Until now, HOT was "all or nothing" - we've either added index entries to all indexes or none of them. So the n_tup_hot_upd was fine.

But WARM changes that - it allows adding index entries only to a subset of indexes, which means the "per row" n_tup_hot_upd counter is not sufficient. When you have a table with 10 indexes, and the counter increases by 1, does that mean the update added index tuple to 1 index or 9 of them?

So I think we'll need two counters to track WARM - number of index tuples we've added, and number of index tuples we've skipped. So something like blks_hit and blks_read. I'm not sure whether we should replace the n_tup_hot_upd entirely, or keep it for backwards compatibility (and to track perfectly HOT updates).


Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to