On 05/10/2016 07:36 PM, Robert Haas wrote:
On Tue, May 10, 2016 at 12:31 PM, Tomas Vondra wrote:
The following table shows the differences between the disabled and reverted
cases, computed like this:

    sum('reverted' results with N clients)
    -------------------------------------- - 1.0
    sum('disabled' results with N clients)

for each scale/client count combination. So for example 4.83% means that, with
a single client on the smallest data set, the sum of the 5 runs for reverted
was about 1.0483x the sum for disabled.
 scale        1       16       32       64      128
   100    4.83%    2.84%    1.21%    1.16%    3.85%
  3000    1.97%    0.83%    1.78%    0.09%    7.70%
 10000   -6.94%   -5.24%  -12.98%   -3.02%   -8.78%
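To make the metric concrete, here is a minimal Python sketch of how one cell of the table is computed. The function name and the per-run numbers are made up for illustration only (they are not the measured results); they are simply chosen so that the ratio comes out at 1.0483.

```python
def relative_difference(reverted_runs, disabled_runs):
    """Return sum(reverted) / sum(disabled) - 1.0 for one scale/client cell."""
    return sum(reverted_runs) / sum(disabled_runs) - 1.0


# Hypothetical per-run results for 5 runs each (NOT measured data).
reverted = [1050.0, 1046.0, 1049.0, 1047.0, 1049.5]
disabled = [1000.0, 1001.0, 999.0, 1000.5, 999.5]

# A positive value means the reverted sum is larger than the disabled sum.
print(f"{relative_difference(reverted, disabled):+.2%}")  # prints +4.83%
```

A negative cell, as in the scale 10000 row, means the sum for reverted is smaller than the sum for disabled.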
/me scratches head.
That doesn't seem like noise, but I don't understand the
scale-factor-10000 results either. Reverting the patch makes the code
smaller and removes instructions from critical paths, so it should
speed things up at least nominally. The question is whether it makes
enough difference that anyone cares. However, removing unused code
shouldn't make the system *slower*, but that's what's happening here
at the higher scale factor.
/me scratches head too
I've seen cases where adding dummy instructions to critical paths
slows things down at 1 client and speeds them up with many clients.
That happens because the percentage of time active processes spend fighting
over the critical locks goes down, which reduces contention more than
enough to compensate for the cost of executing the dummy
instructions. If your results showed performance lower at 1 client
and slightly higher at many clients, I'd suspect an effect of that
sort. But I can't see why it should depend on the scale factor. That
suggests that, perhaps, it's having some effect on the impact of
buffer eviction, maybe due to a difference in shared memory layout.
But I thought we weren't supposed to have such artifacts any more
now that we start every allocation on a cache line boundary...
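For reference, starting every allocation on a cache line boundary amounts to rounding each starting offset up to the next multiple of the line size. A minimal Python sketch of that arithmetic follows; the 64-byte line size, the helper name, and the allocation sizes are all assumptions for illustration, not values taken from the PostgreSQL source.

```python
CACHE_LINE_SIZE = 64  # assumed line size in bytes; must be a power of two


def cache_line_align(offset, line_size=CACHE_LINE_SIZE):
    """Round offset up to the next multiple of line_size."""
    return (offset + line_size - 1) & ~(line_size - 1)


# Lay out three hypothetical allocations back to back, each one aligned.
offset = 0
for size in (100, 24, 200):
    start = cache_line_align(offset)
    print(f"size={size:>3} starts at offset {start}")
    offset = start + size
```

With this kind of rounding, where an allocation starts relative to a line boundary no longer depends on the sizes of the allocations before it, which is why layout artifacts of that sort would be unexpected.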
I think we should look for issues in the testing procedure first,
perhaps try to reproduce it on a different system. Another possibility
is that the revert is not perfectly correct - the code compiles and does
not crash, but maybe there's a subtle issue somewhere.
I'll try to collect some additional info (detailed info from sar,
aggregated transaction log, ...) for further analysis. And also increase
the number of runs, so that we can better compare all the separate
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services