On Tue, 2012-12-04 at 01:03 -0800, Jeff Davis wrote:
> > 4. We need some general performance testing to show whether this is
> > insane or not.
I ran a few tests.

Test 1 - find worst-case overhead for the checksum calculation on write:

fsync = off
bgwriter_lru_maxpages = 0
shared_buffers = 1024MB
checkpoint_segments = 64
autovacuum = off

The idea is to prevent interference from the bgwriter or autovacuum. Also, I turn off fsync so that the test measures the calculation overhead, not the effort of actually writing to disk.

drop table foo;
create table foo(i int, j int) with (fillfactor=50);
create index foo_idx on foo(i);
insert into foo select g%25, -1 from generate_series(1,10000000) g;
checkpoint;
-- during the following sleep, issue an OS "sync"
-- to make test results more consistent
select pg_sleep(30);
\timing on
update foo set j=-1 where i = 0;
select pg_sleep(2);
checkpoint;
update foo set j=-1 where i = 0;
select pg_sleep(2);
checkpoint;
update foo set j=-1 where i = 0;
select pg_sleep(2);
checkpoint;
\timing off

I am measuring the time of the CHECKPOINT command, not the update. The update is just there to dirty all of the pages (they should all be HOT updates).

Without checksums, it takes about 400ms. With checksums, it takes about 500ms. That overhead is quite low, considering that the bottleneck is almost always somewhere else (like actually writing to disk).

Test 2 - worst-case overhead for calculating the checksum while reading data:

Same configuration as above. This time, just load a big table:

drop table foo;
create table foo(i int, j int) with (fillfactor=50);
insert into foo select g%25, -1 from generate_series(1,10000000) g;
-- make sure hint bits and PD_ALL_VISIBLE are set everywhere
select count(*) from foo;
vacuum;
vacuum;
vacuum;
select relfilenode from pg_class where relname='foo';

Then shut down the server and restart it. Then do a "cat data/base/12055/XXXX* > /dev/null" to get the table loaded into the OS buffer cache. Then do:

\timing on
SELECT COUNT(*) FROM foo;

So, shared buffers are cold, but the OS cache is warm.
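(Aside: the per-page calculation being measured in these two tests can be sketched with a toy Fletcher-style checksum over an 8 KB block. This is purely illustrative -- it is NOT the algorithm in the patch -- but it shows why the cost is a single linear pass over each page, with no I/O involved. BLCKSZ here is PostgreSQL's default page size.)

```python
# Toy Fletcher-style checksum over one 8 KB page buffer.
# Illustrative only -- NOT the checksum algorithm from the patch.

BLCKSZ = 8192  # PostgreSQL's default page size in bytes

def page_checksum(page: bytes) -> int:
    """Fletcher-16-style sum over a page: one linear pass, no I/O."""
    s1 = s2 = 0
    for b in page:
        s1 = (s1 + b) % 255
        s2 = (s2 + s1) % 255
    return (s2 << 8) | s1

if __name__ == "__main__":
    import os, time
    pages = [os.urandom(BLCKSZ) for _ in range(128)]  # ~1 MB of fake table data
    t0 = time.perf_counter()
    sums = [page_checksum(p) for p in pages]
    elapsed = time.perf_counter() - t0
    print(f"checksummed {len(pages)} pages in {elapsed * 1000:.1f} ms")
```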
This should test the overhead of going from the OS cache to shared buffers, which requires the checksum calculation.

Without checksums it's around 820ms; with checksums, around 970ms. Again, this is quite reasonable, because I would expect the bottleneck to be reading from the disk rather than the calculation itself.

Test 3 - worst-case WAL overhead:

For this test, I also left fsync off, because I didn't want to measure the effort to flush WAL (which shouldn't really be required for this test, anyway). This one was simpler:

drop table foo;
create table foo(i int, j int) with (fillfactor=50);
insert into foo select g%25, -1 from generate_series(1,10000000) g;
checkpoint;
select pg_sleep(1);
checkpoint;
select pg_sleep(30); -- do an OS "sync" while this is running
\timing on
SELECT COUNT(*) FROM foo;

Without checksums, it takes about 1000ms. With checksums, about 2350ms. I also tested with checksums but without the CHECKPOINT commands above, and it was also 1000ms. This test is more plausible than the other two, so it's more likely to be a real problem.

So, the biggest cost of checksums is, by far, the extra full-page images in WAL, which matches our expectations.

Regards,
	Jeff Davis

--
Sent via pgsql-hackers mailing list (email@example.com)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers