On Sun, Apr 23, 2017 at 12:41 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> That's after inlining the compare on both the linear and sequential
>> code, and it seems it lets the compiler optimize the binary search to
>> the point where it outperforms the sequential search.
>> That's not the case when the compare isn't inlined.
>> That seems in line with , that show the impact of various
>> optimizations on both algorithms. It's clearly a close enough race
>> that optimizations play a huge role.
>> Since we're not likely to go and implement SSE2-optimized versions, I
>> believe I'll leave the binary search only. That's the attached patch
> That sounds reasonable based on your test results. I guess part of
> what I was wondering is whether a vacuum on a table large enough to
> require multiple gigabytes of work_mem isn't likely to be I/O-bound
> anyway. If so, a few cycles one way or the other isn't likely
> to matter much. If not, where exactly are all of those CPU cycles
I haven't been able to produce a table large enough to get a CPU-bound
vacuum, so such a case is likely to require huge storage and a very
powerful I/O system. Mine can only get about 100MB/s tops, and at that
speed, vacuum is I/O bound even for multi-GB work_mem. That's why I've
been using the reported CPU time as the benchmark.
BTW, I left the benchmark script running all weekend at the office,
and when I got back a power outage had aborted it. In a few days I'll
be out on vacation, so I'm not sure I'll get the benchmark results
anytime soon. But since this patch moved to 11.0, I guess there's no rush.
Just FTR, in case I leave before the script is done, the script got to
scale 400 before the outage:
INFO: vacuuming "public.pgbench_accounts"
INFO: scanned index "pgbench_accounts_pkey" to remove 40000000 row versions
DETAIL: CPU: user: 5.94 s, system: 1.26 s, elapsed: 26.77 s.
INFO: "pgbench_accounts": removed 40000000 row versions in 655739 pages
DETAIL: CPU: user: 3.36 s, system: 2.57 s, elapsed: 61.67 s.
INFO: index "pgbench_accounts_pkey" now contains 0 row versions in 109679 pages
DETAIL: 40000000 index row versions were removed.
109289 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.06 s.
INFO: "pgbench_accounts": found 38925546 removable, 0 nonremovable
row versions in 655738 out of 655738 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 1098
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 15.34 s, system: 6.95 s, elapsed: 126.21 s.
INFO: "pgbench_accounts": truncated 655738 to 0 pages
DETAIL: CPU: user: 0.22 s, system: 2.10 s, elapsed: 8.10 s.
s100: CPU: user: 3.02 s, system: 1.51 s, elapsed: 16.43 s.
s400: CPU: user: 15.34 s, system: 6.95 s, elapsed: 126.21 s.
The old results:
Old Patched (sequential search):
s100: CPU: user: 3.21 s, system: 1.54 s, elapsed: 18.95 s.
s400: CPU: user: 14.03 s, system: 6.35 s, elapsed: 107.71 s.
s4000: CPU: user: 228.17 s, system: 108.33 s, elapsed: 3017.30 s.
s100: CPU: user: 3.39 s, system: 1.64 s, elapsed: 18.67 s.
s400: CPU: user: 15.39 s, system: 7.03 s, elapsed: 114.91 s.
s4000: CPU: user: 282.21 s, system: 105.95 s, elapsed: 3017.28 s.
I wouldn't fret over the slight slowdown vs the old patch; it could be
noise (the script only completed a single run at scale 400).
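For anyone wanting to compare runs the same way, here is a minimal sketch of how the CPU share of each run can be pulled out of the VACUUM VERBOSE timing lines quoted above. It is not part of the benchmark script; the helper names (parse_cpu_line, cpu_fraction) are made up for illustration, and the regex assumes the exact "CPU: user: ... s, system: ... s, elapsed: ... s" layout shown in this message. A low ratio of (user + system) to elapsed is consistent with the run being I/O-bound.

```python
import re

# Matches the timing summary as it appears in the log output quoted in this
# message, e.g. "CPU: user: 15.34 s, system: 6.95 s, elapsed: 126.21 s."
CPU_LINE = re.compile(
    r"CPU: user: ([\d.]+) s, system: ([\d.]+) s, elapsed: ([\d.]+) s"
)


def parse_cpu_line(line):
    """Return (user, system, elapsed) in seconds from one timing line."""
    match = CPU_LINE.search(line)
    if match is None:
        raise ValueError(f"not a CPU timing line: {line!r}")
    user, system, elapsed = (float(group) for group in match.groups())
    return user, system, elapsed


def cpu_fraction(line):
    """Fraction of elapsed wall-clock time accounted for by CPU time."""
    user, system, elapsed = parse_cpu_line(line)
    return (user + system) / elapsed


# The two summary lines reported for the binary-search run in this message.
runs = {
    "s100": "CPU: user: 3.02 s, system: 1.51 s, elapsed: 16.43 s.",
    "s400": "CPU: user: 15.34 s, system: 6.95 s, elapsed: 126.21 s.",
}

for scale, line in runs.items():
    print(f"{scale}: {cpu_fraction(line):.0%} of elapsed time was CPU time")
```

Both runs come out well under half of elapsed time, which fits the observation that vacuum is I/O-bound on this hardware and that CPU time is the more useful number to compare.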