On 01/23/2017 08:30 AM, Amit Kapila wrote:
On Sun, Jan 22, 2017 at 3:43 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:

That being said, I'm ready to do some benchmarking on this, so that we have
at least some numbers to argue about. Can we agree on a set of workloads
that we want to benchmark in the first round?

I think if we can get data for a pgbench read-write workload where the
data doesn't fit in shared buffers but fits in RAM, that can give us some
indication.  We can try varying the ratio of shared buffers relative to
the data size.  This should exercise the checksum code both when buffers
are evicted and at the next read.  I think it also makes sense to check
the WAL data size for each of those runs.
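
A minimal sketch of the bookkeeping for such runs, in Python. It assumes the WAL position is sampled before and after each run (e.g. with pg_current_xlog_location(), renamed pg_current_wal_lsn() in PostgreSQL 10) and that the data set size is already known. The helper names and the particular ratios are hypothetical, chosen only for illustration.

```python
def lsn_to_bytes(lsn):
    """Convert a textual LSN such as '1/A0000000' to an absolute byte position.

    An LSN is printed as two hexadecimal numbers: the high and low
    32 bits of a 64-bit WAL position.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_bytes_written(start_lsn, end_lsn):
    """WAL volume generated between two sampled LSNs, in bytes."""
    return lsn_to_bytes(end_lsn) - lsn_to_bytes(start_lsn)


def shared_buffers_settings(data_mb, ratios=(0.1, 0.25, 0.5)):
    """shared_buffers values (as 'NNNMB' strings) for each buffers/data ratio.

    The ratios are illustrative assumptions; all of them keep
    shared_buffers smaller than the data set.
    """
    return ["%dMB" % round(data_mb * r) for r in ratios]
```

For example, shared_buffers_settings(16000) gives one setting per ratio for a 16 GB data set, and wal_bytes_written() applied to the LSNs sampled around a run gives the WAL volume to compare between checksum-enabled and checksum-disabled clusters.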

Yes, I'm thinking that's pretty much the worst case for an OLTP-like workload, because it has to keep evicting buffers from shared buffers, generating a continuous stream of writes. Doing that on fast storage (e.g. a PCIe SSD or possibly tmpfs) will further reduce the storage overhead, making the time spent computing checksums much more significant. Makes sense?

What about other types of workloads? I think we should not look only at write-heavy workloads - I wonder what the overhead of verifying checksums is in read-only workloads (again, with data that fits into RAM).

What about large data loads simulating OLAP, and exports (e.g. pg_dump)?

That leaves us with 4 workload types, I guess:

1) read-write OLTP (shared buffers < data < RAM)
2) read-only OLTP (shared buffers < data < RAM)
3) large data loads (COPY)
4) large data exports (pg_dump)

Anything else?
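
To make the four workload types above concrete, here is a rough sketch in Python that builds one command line per workload. The database name, client count, duration, table name and file path are all placeholders I made up. The flags themselves (pgbench -c/-j/-T/-S, psql -d/-c, pg_dump -f) are standard, but the exact invocations are only a starting point.

```python
def workload_commands(db="bench", clients=16, duration_s=300,
                      copy_table="load_target", copy_file="/tmp/load.csv"):
    """Return {workload name: argv list} for the four workload types.

    All default values are hypothetical placeholders.
    """
    # Common pgbench options: client count, worker threads, run time in seconds.
    pgbench = ["pgbench", "-c", str(clients), "-j", str(clients),
               "-T", str(duration_s)]
    # Server-side COPY: the file must be readable by the server process.
    copy_sql = "COPY %s FROM '%s' WITH (FORMAT csv)" % (copy_table, copy_file)
    return {
        # 1) default TPC-B-like read-write script
        "read-write OLTP": pgbench + [db],
        # 2) -S selects the built-in select-only script
        "read-only OLTP": pgbench + ["-S", db],
        # 3) bulk load through COPY
        "large data load": ["psql", "-d", db, "-c", copy_sql],
        # 4) export; the output is discarded so only the read side is measured
        "large data export": ["pg_dump", "-f", "/dev/null", db],
    }
```

Each argv list can be handed to subprocess.run() and timed, once against a cluster initialized with checksums and once without.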

The other question is of course hardware - IIRC there are differences between CPUs. I do have a new E5-2620 v4, but perhaps it'd be good to also do some testing on a Power machine, or an older Intel CPU.


Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
