On 11/02/2016 05:52 PM, Amit Kapila wrote:
On Wed, Nov 2, 2016 at 9:01 AM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
On 11/01/2016 08:13 PM, Robert Haas wrote:
On Mon, Oct 31, 2016 at 5:48 PM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:

The one remaining thing is the strange zig-zag behavior, but that might easily be due to scheduling in the kernel, or something else. I don't consider it a blocker for any of the patches, though.

The only reason I could think of for that zig-zag behaviour is frequent access to multiple clog pages, which could be due to the reasons below:

a. A transaction and its subtransactions (IIRC, Dilip's case has one main transaction and two subtransactions) can't fit into the same page, in which case the group_update optimization won't apply, and I don't think we can do anything about it.

b. Within the same group, multiple clog pages are being accessed. It is not a likely scenario, but it can happen, and we might be able to improve a bit if that is happening.

c. Transactions try to update different clog pages at the same time. I think, as mentioned upthread, we can handle it by using slots and allowing multiple groups to work together instead of a single group.

To check if there is any impact due to (a) or (b), I have added a few log messages to the code (patch - group_update_clog_v9_log). The log message could be "all xacts are not on same page" or "Group contains different pages".

Patch group_update_clog_v9_slots tries to address (c). So if there is any problem due to (c), this patch should improve the situation.

Can you please try to run the test where you saw the zig-zag behaviour with both patches separately? I think if there is anything due to postgres, then you will either see one of the new log messages or performance will improve. OTOH, if we see the same behaviour, then I think we can probably assume it is due to scheduler activity and move on. Also, one point to note here is that even where the performance is down in that curve, it is equal to or better than HEAD.
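For readers following along, cases (a) and (b) can be illustrated with a rough sketch of how transaction IDs map to clog pages. It assumes the default 8 kB block size and 2 status bits per transaction, and it ignores XID wraparound. The helper names (`clog_page`, `xact_fits_one_page`, `group_pages`) are hypothetical and are not taken from the patches discussed here.

```python
# Assumed defaults: 8 kB pages, 2 commit-status bits per transaction.
BLCKSZ = 8192
CLOG_BITS_PER_XACT = 2
CLOG_XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT          # 4 transactions per byte
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE     # 32768 transactions per page


def clog_page(xid):
    """Return the clog page number holding the status bits of xid."""
    return xid // CLOG_XACTS_PER_PAGE


def xact_fits_one_page(xid, subxids):
    """Case (a): do a transaction and all its subtransactions share one page?"""
    page = clog_page(xid)
    return all(clog_page(sub) == page for sub in subxids)


def group_pages(members):
    """Case (b): set of clog pages touched by a group of (xid, subxids) pairs."""
    return {clog_page(x) for xid, subxids in members for x in (xid, *subxids)}


if __name__ == "__main__":
    # A main transaction with two subtransactions that straddle a page boundary.
    print(xact_fits_one_page(32767, [32768, 32769]))
    # A group whose members live on more than one clog page.
    print(group_pages([(10, [11, 12]), (40000, [])]))
```

With these assumptions a page covers 32768 consecutive XIDs, so a short transaction straddles a page boundary only when its XIDs happen to fall right at the edge of a page.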
Will do.

Based on the results with more client counts (increment by 6 clients instead of 36), I think this really looks like something unrelated to any of the patches - kernel, CPU, or something already present in current master.
The attached results show that:

(a) master shows the same zig-zag behavior - No idea why this wasn't observed on the previous runs.

(b) group_update actually seems to improve the situation, because performance stays stable up to 72 clients, while on master the fluctuation starts much earlier.
I'll redo the tests with a newer kernel. This was on 3.10.x, which is what Red Hat 7.2 uses; I'll try 4.8.6. If the 4.8.6 kernel does not help, I'll then try the patches you submitted.
Overall, I'm convinced this issue is unrelated to the patches.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services