On Thu, Sep 3, 2015 at 5:11 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2015-09-01 10:19:19 +0530, Amit Kapila wrote:
> > pgbench setup
> > ------------------------
> > scale factor - 300
> > Data is on magnetic disk and WAL on ssd.
> > pgbench -M prepared tpc-b
> >
> > HEAD - commit 0e141c0f
> > Patch-1 - increase_clog_bufs_v1
> >
> The buffer replacement algorithm for clog is rather stupid - I do wonder
> where the cutoff is that it hurts.
> Could you perhaps try to create a testcase where xids are accessed that
> are so far apart on average that they're unlikely to be in memory? And
> then test that across a number of client counts?

Okay, I have tried one such test; what I could come up with is a workload
where on average every 100th access is a disk access, and I have tested it
with different numbers of clog buffers and client counts.  Below are the results:

Non-default parameters
max_connections = 300
checkpoint_timeout = 35min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 256MB

HEAD - commit 49124613
Patch-1 - Clog Buffers - 64
Patch-2 - Clog Buffers - 128

Client Count/Patch_ver      1       8      64     128
HEAD                     1395    8336   37866   34463
Patch-1                  1615    8180   37799   35315
Patch-2                  1409    8219   37068   34729
So there is not much difference in the test results with different values for
buffers, probably because the I/O dominates this test; it also shows that
increasing the clog buffers won't regress the current behaviour even when
a lot more transaction-status accesses fall outside the CLOG buffers.

Now about the test: create a table with a large number of rows (say 11617457;
I tried to create a larger one, but it was taking too much time, more than a ...)
and have each row written by a different transaction id.  Now each transaction
updates rows that are at least 1048576 ids apart; that is the number of
transactions whose status can be held in 32 CLOG buffers (32 pages * 8192
bytes per page * 4 transaction statuses per byte = 1048576).  That way,
ideally each update will try to access a CLOG page that is not in memory;
however, as the value to update is selected randomly, in practice that leads
to every 100th access being a disk access.
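
To illustrate the idea, here is a minimal sketch of what the table and
function could look like (the table, column, and function names here are
mine for illustration; the actual attached scripts may differ):

CREATE TABLE t (id integer PRIMARY KEY, val integer);

-- clog_prep.sh populates the table such that each row is inserted by
-- a separate transaction, so consecutive ids carry consecutive xids.

CREATE OR REPLACE FUNCTION access_clog(rid integer) RETURNS void AS
$$
BEGIN
    -- The UPDATE has to check the visibility of the old tuple, which
    -- means looking up the status of its xmin in CLOG; for a randomly
    -- chosen rid that page is often not among the cached buffers.
    UPDATE t SET val = val + 1 WHERE id = rid;
END;
$$ LANGUAGE plpgsql;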

1. The attached file clog_prep.sh should create and populate the required
table and create the function used to access the CLOG pages.  You
might want to update no_of_rows based on the number of rows you want to
create in the table.
2. The attached file access_clog_disk.sql executes the function
with random values (see the sketch after this list for its likely shape).
You might want to update the nrows variable based on the rows created in
the previous step.
3. Use pgbench as follows with different client count
./pgbench -c 4 -j 4 -n -M prepared -f "access_clog_disk.sql" -T 300 postgres
4. To ensure that the CLOG access function always accesses the same data
during each run, the test copies back the data_directory created by
clog_prep.sh before each run.
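
For reference, the likely shape of access_clog_disk.sql as a pgbench
custom script (using the \setrandom syntax of that time; nrows and the
function name follow the hypothetical sketch above):

\set nrows 11617457
\setrandom rid 1 :nrows
SELECT access_clog(:rid);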

I have checked, by adding some instrumentation, that approximately
every 100th access is a disk access; the attached patch clog_info-v1.patch
adds the necessary instrumentation to the code.

As an example, a pgbench test yields the results below:
./pgbench -c 4 -j 4 -n -M prepared -f "access_clog_disk.sql" -T 180 postgres

LOG:  trans_status(3169396)
LOG:  trans_status_disk(29546)
LOG:  trans_status(3054952)
LOG:  trans_status_disk(28291)
LOG:  trans_status(3131242)
LOG:  trans_status_disk(28989)
LOG:  trans_status(3155449)
LOG:  trans_status_disk(29347)

Here 'trans_status' is the number of times the process went to check
the CLOG status and 'trans_status_disk' is the number of times it had
to go to disk to access a CLOG page.

> >  /*
> >   * Number of shared CLOG buffers.
> >   *
> I think the comment should be more drastically rephrased to not
> reference individual versions and numbers.

Updated comments and the patch (increase_clog_bufs_v2.patch)
containing the same is attached.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment: clog_prep.sh
Description: Bourne shell script

Attachment: access_clog_disk.sql
Description: Binary data

Attachment: clog_info-v1.patch
Description: Binary data

Attachment: increase_clog_bufs_v2.patch
Description: Binary data
