Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

Tomas Vondra Wed, 02 Nov 2016 10:20:18 -0700

On 11/02/2016 05:52 PM, Amit Kapila wrote:

On Wed, Nov 2, 2016 at 9:01 AM, Tomas Vondra
<[email protected]> wrote:

On 11/01/2016 08:13 PM, Robert Haas wrote:


On Mon, Oct 31, 2016 at 5:48 PM, Tomas Vondra
<[email protected]> wrote:


The one remaining thing is the strange zig-zag behavior, but that might
easily be due to scheduling in the kernel, or something else. I don't
consider it a blocker for any of the patches, though.


The only reason I could think of for that zig-zag behaviour is
frequent access to multiple clog pages, which could be due to the
reasons below:

a. A transaction and its subtransactions (IIRC, Dilip's case has one
main transaction and two subtransactions) don't fit on the same page, in
which case the group_update optimization won't apply and I don't think
we can do anything about it.
b. In the same group, multiple clog pages are being accessed.  It is
not a likely scenario, but it can happen and we might be able to
improve a bit if that is happening.
c. Transactions try to update different clog pages at the same time.
I think, as mentioned upthread, we can handle it by using slots and
allowing multiple groups to work together instead of a single group.

To check if there is any impact due to (a) or (b), I have added a few
log messages to the code (patch - group_update_clog_v9_log). The log
message could be "all xacts are not on same page" or "Group contains
different pages".
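
For illustration only, the conditions behind those two messages can be sketched as below. This is not the code from the patch. It assumes the default 8 kB block size and 2 status bits per transaction, which gives 32768 transactions per clog page; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed default build values; BLCKSZ is a compile-time option. */
#define BLCKSZ              8192
#define CLOG_BITS_PER_XACT  2
#define CLOG_XACTS_PER_BYTE (8 / CLOG_BITS_PER_XACT)
#define CLOG_XACTS_PER_PAGE (BLCKSZ * CLOG_XACTS_PER_BYTE)

typedef uint32_t TransactionId;

/* Clog page that holds the status bits of xid. */
static int64_t
xid_to_page(TransactionId xid)
{
    return (int64_t) (xid / CLOG_XACTS_PER_PAGE);
}

/*
 * Case (a): true if the main transaction and all of its subtransactions
 * have their status bits on one clog page (hypothetical helper).
 */
static bool
xact_on_single_page(TransactionId xid, const TransactionId *subxids,
                    size_t nsub)
{
    int64_t page = xid_to_page(xid);

    for (size_t i = 0; i < nsub; i++)
        if (xid_to_page(subxids[i]) != page)
            return false;
    return true;
}

/*
 * Case (b): true if every member of a group wants to update the same
 * clog page (hypothetical helper; members are given by their main xid).
 */
static bool
group_on_single_page(const TransactionId *members, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (xid_to_page(members[i]) != xid_to_page(members[0]))
            return false;
    return true;
}
```

With these assumed constants, a page boundary falls at every multiple of 32768, so a transaction with xid 32766 and subtransactions 32767 and 32768 would hit case (a).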

Patch group_update_clog_v9_slots tries to address (c). So if there
is any problem due to (c), this patch should improve the situation.

Can you please try to run the test where you saw the zig-zag behaviour
with both patches separately? I think if anything is due to postgres,
then you will either see one of the new log messages or the performance
will improve. OTOH, if we see the same behaviour, then I think we can
probably assume it is due to scheduler activity and move on. Also, one
point to note here is that even where the performance dips in that
curve, it is equal to or better than HEAD.


Will do.

Based on the results with more client counts (increment by 6 clients instead of 36), I think this really looks like something unrelated to any of the patches - kernel, CPU, or something already present in current master.


The attached results show that:

(a) master shows the same zig-zag behavior. No idea why this wasn't observed in the previous runs.

(b) group_update actually seems to improve the situation, because the performance stays stable up to 72 clients, while on master the fluctuation starts much earlier.

I'll redo the tests with a newer kernel. This was on 3.10.x, which is what Red Hat 7.2 uses; I'll try 4.8.6. If the 4.8.6 kernel does not help, I'll then try the patches you submitted.


Overall, I'm convinced this issue is unrelated to the patches.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

zig-zag.ods
Description: application/vnd.oasis.opendocument.spreadsheet
