During VACUUM of btrees, we need to pin all pages, even those from which no tuples are removed. I am calling that the "pin scan" here. This is especially a problem during redo, where removing one tuple from a 100GB btree can take a minute or more. That causes replication lag. Bad Thing.
Previously, I have suggested ways of optimizing that, and code comments reflect that. I would now like to look at removing the pin scan entirely, on a standby only.

In www.postgresql.org/message-id/flat/721615179.3351449.1423959585771.javamail.ya...@mail.yahoo.com, the focus was on removing pins from btrees.

In the commit message for that thread/patch,
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2ed5b87f96d473962ec5230fd820abfeaccb2069
we see that there are some preconditions to altering the locking:

"This patch leaves behavior unchanged for some cases, which can be addressed separately so that each case can be evaluated on its own merits. These unchanged cases are when a scan uses a non-MVCC snapshot, an index-only scan, and a scan of a btree index for which modifications are not WAL-logged. If later patches allow all of these cases to drop the buffer pin after reading a leaf page, then the btree vacuum process can be simplified; it will no longer need the "super-exclusive" lock to delete tuples from a page."

The cases for master and standby are different. The standby is simpler, yet more important to change, since the pin scan happens in the foreground.

Looking just at standbys, we see there are 3 cases:

* non-WAL logged indexes - does not apply on a standby, so ignore
* non-MVCC snapshot - looks like only TOAST scans are a problem on standbys
* index only scans (IOS) - the analysis of which looks wrong to me...

IOSs always check visibility before using the tuple. If a tuple is about to be removed by a VACUUM then the tuple will already be dead and the visibility map will always be set to not-all-visible. So any tuple being removed by vacuum can never cause a problem for an IOS. Hence the locking interactions are not required, at least on standbys, for normal tables.

So it looks like we can skip the "pin scan" during redo unless we are vacuuming a toast index.

Patch attached.
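To make the proposed rule concrete, here is a toy model in C of the redo-time decision: pin the intervening blocks only when the index belongs to a TOAST table. This is not the attached patch and does not use PostgreSQL's real data structures; the struct, its fields and both function names are hypothetical, chosen only to illustrate the condition described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the information a btree vacuum redo
 * record would need; not PostgreSQL's actual WAL record layout. */
typedef struct VacuumRedoRecord
{
    uint32_t last_block_vacuumed; /* previous block that had tuples removed */
    uint32_t target_block;        /* block whose tuples are removed now */
    bool     index_is_on_toast;   /* assumption: known at redo time */
} VacuumRedoRecord;

/* The rule argued for above: on a standby, only TOAST index scans
 * (non-MVCC snapshots) still depend on the pin scan. */
static bool
needs_pin_scan(const VacuumRedoRecord *rec)
{
    assert(rec != NULL);
    return rec->index_is_on_toast;
}

/* Number of intervening blocks that redo would have to pin, i.e. the
 * blocks strictly between last_block_vacuumed and target_block.
 * The target block itself is not counted: it still needs the
 * super-exclusive lock in every case. */
static uint32_t
intervening_blocks_to_pin(const VacuumRedoRecord *rec)
{
    if (!needs_pin_scan(rec))
        return 0;
    if (rec->target_block <= rec->last_block_vacuumed)
        return 0;
    return rec->target_block - rec->last_block_vacuumed - 1;
}
```

For example, with last_block_vacuumed = 10 and target_block = 20, this model pins 9 intervening blocks for a TOAST index and none for an index on a normal table.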
Notice that the patch does not slacken the requirement to super-exclusive-lock the block from which tuples are being removed. The only thing it does is skip past the requirement to pin each of the intervening blocks where nothing has happened.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services