On 10/06/2016 07:36 AM, Pavan Deolasee wrote:


On Wed, Oct 5, 2016 at 1:43 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:

...
    I can confirm the significant speedup, often by more than 75%
    (depending on number of indexes, whether the data set fits into RAM,
    etc.). Similarly for the amount of WAL generated, although that's a
    bit more difficult to evaluate due to full_page_writes.

    I'm not going to send detailed results, as that probably does not
    make much sense at this stage of the development - I can repeat the
    tests once the open questions get resolved.


Sure. Anything that stands out? Any regression that you see? I'm not
sure if your benchmarks exercise the paths which might show overheads
without any tangible benefits. For example, I wonder if a test with many
indexes where most of them get updated, followed by querying the table
via those updated indexes, could be one such test case.


No, nothing that would stand out. Let me explain what benchmarks I've done. I made some minor mistakes when running them, so I plan to rerun them and post the results afterwards; until then, take the numbers below with a grain of salt.

My goal was to compare the current non-HOT behavior (updating all indexes) with WARM (updating only the indexes on modified columns), and I've taken two approaches:

1) fixed number of indexes, update variable number of columns

Create a table with 8 secondary indexes and then run a series of benchmarks, each updating an increasing number of columns. So the first run did

    UPDATE t SET c1 = c1+1 WHERE id = :id;

while the second did

    UPDATE t SET c1 = c1+1, c2 = c2+1 WHERE id = :id;

and so on, up to updating all eight columns in the last run. I used multiple scripts so that all the columns / indexes get updated uniformly (essentially multiple "-f" flags with pgbench). The runs were fairly long (2 hours, enough to reach stable behavior).
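
Roughly like this (the column types, the random choice of :id and the table size are just illustrative, not the exact setup):

    CREATE TABLE t (id int PRIMARY KEY,
                    c1 int, c2 int, c3 int, c4 int,
                    c5 int, c6 int, c7 int, c8 int);

    CREATE INDEX ON t (c1);
    CREATE INDEX ON t (c2);
    -- ... one secondary index per column, up to c8

    -- update-c1.sql (one such script per column or combination of columns)
    \set id random(1, 1000000)
    UPDATE t SET c1 = c1 + 1 WHERE id = :id;

    -- each run combines the scripts for the columns it updates, e.g.
    -- pgbench -n -T 7200 -f update-c1.sql -f update-c2.sql ... test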

For a small data set (fits into RAM), the throughput in tps looks like this:

    columns   master   patched    diff
          1     5994      8490    +42%
          2     4347      7903    +81%
          3     4340      7400    +70%
          4     4324      6929    +60%
          5     4256      6495    +52%
          6     4253      5059    +19%
          7     4235      4534     +7%
          8     4194      4237     +1%

and the amount of WAL generated (corrected for the difference in tps) looks like this (numbers are MB):

    columns   master   patched    diff
          1    27257     18508    -32%
          2    21753     14599    -33%
          3    21912     15864    -28%
          4    22021     17135    -22%
          5    21819     18258    -16%
          6    21929     20659     -6%
          7    21994     22234     +1%
          8    21851     23267     +6%

So that's quite a significant difference. I'm pretty sure the minor WAL increase in the last two runs is due to full-page writes (which also affect the preceding runs, making the WAL reduction smaller than the tps increase).
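
FWIW the WAL measurement itself is nothing fancy; something along these lines, with the result then scaled to the same number of transactions (that's the tps correction mentioned above):

    -- before the run
    SELECT pg_current_xlog_location() AS start_lsn \gset

    -- after the run: bytes of WAL generated during the run
    SELECT pg_xlog_location_diff(pg_current_xlog_location(), :'start_lsn');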

I do have results for larger data sets (exceeding RAM) as well; they are very similar, although the speedup seems a bit smaller. But I need to rerun those.

2) single-row update, adding indexes between runs

This is essentially the opposite of the previous approach: each transaction updates a single column (again with multiple scripts to spread the updates uniformly over the columns), but new indexes are added between runs, starting with none and ending with all eight (see the sketch below the results). The throughput in tps (for a large data set, exceeding RAM) looks like this:

    indexes   master   patched    diff
          0      954      1404    +47%
          1      701      1045    +49%
          2      484       816    +70%
          3      346       683    +97%
          4      248       608   +145%
          5      190       525   +176%
          6      152       397   +161%
          7      123       315   +156%
          8      123       270   +119%

So this looks really interesting.
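
(The update scripts here are the same single-column ones as in the first test; between runs I simply add one more index, along these lines:

    CREATE INDEX ON t (c1);   -- before run 1
    CREATE INDEX ON t (c2);   -- before run 2
    -- ... up to 8 secondary indexes before the last run

with run 0 starting with no secondary indexes at all.)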


    There's a lot of useful and important feedback in the thread(s) so
    far, particularly the descriptions of various failure cases. I think
    it'd be very useful to collect those examples and turn them into
    regression tests - that's something the patch should include anyway.


Sure. I added only a handful of test cases covering things I knew the
regression suite doesn't exercise, but I'll write more of them. One good
thing is that the code gets heavily exercised even by the existing
regression tests; I caught and fixed multiple bugs while running them.
I'm not saying that's enough, but it certainly gives some confidence.


I don't see any changes to src/test in the patch, so I'm not sure what you mean when you say you added a handful of test cases.



    and update:

        update t set a = a+1, b=b+1;

    which has to update all indexes on the table, but:

        select n_tup_upd, n_tup_hot_upd from pg_stat_user_tables

         n_tup_upd | n_tup_hot_upd
        -----------+---------------
              1000 |          1000

    So it's still counted as "WARM" - does it make sense?


No, it does not. The code currently just marks any update as a WARM
update if the table supports it and there is enough free space on the
page. And yes, you're right, it's worth fixing that because of the
one-WARM-update-per-chain limitation. Will fix.


Hmmm, so this makes monitoring of %WARM during benchmarks less reliable than I hoped for :-(
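
For context, monitoring %WARM amounts to sampling the counters during the run, something like this (with "t" being the benchmark table):

    SELECT n_tup_upd, n_tup_hot_upd,
           round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 1) AS pct_hot
      FROM pg_stat_user_tables
     WHERE relname = 't';

With WARM piggy-backing on n_tup_hot_upd, that percentage no longer tells me how much index work was actually saved.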


    The way this is piggy-backed on the current HOT statistics seems a
    bit strange for another reason,


Agree. We could add a similar n_tup_warm_upd counter.


Yes, although HOT is a special case of WARM, it probably makes sense to differentiate the two in the statistics.


    But WARM changes that - it allows adding index entries only to a
    subset of indexes, which means the "per row" n_tup_hot_upd counter
    is not sufficient. When you have a table with 10 indexes, and the
    counter increases by 1, does that mean the update added index tuple
    to 1 index or 9 of them?


How about having counters similar to n_tup_ins/n_tup_del for indexes
as well? Today it does not make sense because every index gets the
same number of inserts, but WARM will change that.

For example, we could have idx_tup_insert and idx_tup_delete show up
in pg_stat_user_indexes. I don't know if idx_tup_delete adds any
value, but one can then look at idx_tup_insert for the various indexes
to get a sense of which indexes receive more inserts than others. The
indexes which receive more inserts are the ones updated more
frequently than the others.


Hmmm, I'm not sure that'll work. I mean, those metrics would be useful (although I can't think of a use case for idx_tup_delete), but I'm not sure they're enough to measure WARM. We need to compute

    index_tuples_inserted / index_tuples_total

where (index_tuples_total - index_tuples_inserted) is the number of index tuples we've been able to skip thanks to WARM. So we'd also need to track the number of index tuples we skipped for each index, and I'm not sure that's a good idea.
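
To illustrate with made-up numbers: with 10 indexes and 1000 updates, a non-WARM run inserts tuples into all the indexes, while a WARM run only touches the indexes on modified columns, say 2 of them:

    index_tuples_total     = 10 * 1000 = 10000
    index_tuples_inserted  =  2 * 1000 =  2000
    skipped thanks to WARM = 10000 - 2000 = 8000   (i.e. 80%)

and it's that last number (or the ratio) that the stats should make visible.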

Also, we really don't care about inserted tuples - what matters for WARM are updates, so idx_tup_insert is either useless (because it also includes non-UPDATE entries) or the naming is misleading.

This also relates to vacuuming strategies. Today HOT updates do not
count for triggering vacuum (or to be more precise, HOT pruned tuples
are discounted while counting dead tuples). WARM tuples get the same
treatment as far as pruning is concerned, but since they cause fresh
index inserts, I wonder if we need some mechanism to clean up the dead
line pointers and dead index entries. This will become more important if
we do something to convert WARM chains into HOT chains, something that
only VACUUM can do in the design I've proposed so far.


True.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

