On 2017-07-20 17:59, Robert Haas wrote:
On Tue, Jul 18, 2017 at 6:09 AM, Sokolov Yura wrote:
I investigated autovacuum performance, and found that it suffers a lot
from its small ring buffer. It suffers in the same way the bulk writer
suffered before Tom Lane's commit 6382448cf96:
Tom Lane <t...@sss.pgh.pa.us> 2009-06-23 00:04:28
For bulk write operations (eg COPY IN), use a ring buffer of 16MB
instead of the 256KB limit originally enforced by a patch committed
2008-11-06. Per recent test results, the smaller size resulted in an
undesirable decrease in bulk data loading speed, due to COPY
processing frequently getting blocked for WAL flushing. This area
might need more tweaking later, but this setting seems to be good
enough for 8.4.
It is especially noticeable when the database doesn't fit in
shared_buffers but does fit into the OS file cache, and the data is
intensively updated (i.e. an OLTP load). In this scenario, autovacuum
with the current 256kB (32 pages) ring buffer lasts 3-10 times longer
than with a ring buffer increased to 16MB.
I've tested with a synthetic load, using 256MB or 1GB shared buffers and
2-6GB (with indices) tables, with different load factors and secondary
indices on updated columns. Tables were randomly updated with HOT and
non-HOT updates. Times before/after the buffer increase (depending on
load) were 7500sec/1200sec and 75000sec/11500sec, so the benefit is
roughly sixfold.
I didn't test cases where the database doesn't fit the OS file cache;
there the benefit will be smaller, since more time will be spent on
actual disk I/O.
I didn't intensively test OLAP loads. I've seen once that the increased
buffer slightly slows down scanning a huge, almost-immutable table,
perhaps because of decreased CPU cache locality. But given that such a
scan is already fast, and autovacuum of an "almost immutable table" runs
rarely, I don't think it is very important.
Initially I wanted to make the BAS_BULKWRITE and BAS_VACUUM ring sizes
configurable, but after testing I don't see much gain from increasing
the ring buffer above 16MB. So I propose just a one-line change.
I think the question for this patch is "so, why didn't we do it this
way originally?"
It's no secret that making the ring buffer larger will improve
performance -- in fact, not having a ring buffer at all would improve
performance even more. But it would also increase the likelihood that
the background work of vacuum would impact the performance of
foreground operations, which is already a pretty serious problem that
we probably don't want to make worse. I'm not certain what the right
decision is here, but I think that a careful analysis of possible
downsides is needed.
Initially, the ring buffer was introduced for sequential scans. It was
added for vacuum almost as an afterthought, and before that vacuum used
just 1 page, so giving it 32 pages was a huge improvement:
d526575f893c1a4e05ebd Tom Lane <t...@sss.pgh.pa.us> 2007-05-31 00:12:03
"Make large sequential scans and VACUUMs work in a limited-size "ring" of
buffers, rather than blowing out the whole shared-buffer arena. Aside from
avoiding cache spoliation, this fixes the problem that VACUUM formerly
tended to cause a WAL flush for every page it modified, because we had it
hacked to use only a single buffer."
Later, a ring buffer was added for the bulk writer, with the same 32 pages:
85e2cedf985bfecaf43a18ca Tom Lane <t...@sss.pgh.pa.us> 2008-11-06
" Improve bulk-insert performance by keeping the current target buffer pinned
(but not locked, as that would risk deadlocks). Also, make it work in a small
ring of buffers to avoid having bulk inserts trash the whole buffer arena.
Robert Haas, after an idea of Simon Riggs'."
And finally, after some real-world usage, the bulk writer's ring buffer
was enlarged:
6382448cf96a9b88 Tom Lane <t...@sss.pgh.pa.us> 2009-06-23 00:04:28
" For bulk write operations (eg COPY IN), use a ring buffer of 16MB instead
of the 256KB limit originally enforced by a patch committed 2008-11-06.
Per recent test results, the smaller size resulted in an undesirable decrease
in bulk data loading speed, due to COPY processing frequently getting blocked
for WAL flushing. This area might need more tweaking later, but this setting
seems to be good enough for 8.4."
So, from my point of view, no one has actually evaluated the performance
of an increased ring buffer for vacuum.
It was discussed a year ago:
There was this wrong assumption of yours, Robert:
But all that does is force the backend to write to the operating
system, which is where the real buffering happens.
But in fact, the vacuum process performs fsync! It happens because vacuum
evicts dirty pages from its ring buffer, and to evict a dirty page it
has to be sure the WAL record about its modification has been fsync-ed
to the WAL. Because the ring buffer is so small, vacuum almost always
has to perform the fsync by itself, and has to do it very frequently
(because the ring is so small).
With a greater ring buffer, there is a greater chance that the fsync was
already performed by the wal_writer process or another backend. And even
if vacuum does the fsync by itself, it syncs the records for many more
modified pages from its ring at once, so evicting the next pages is free.
It was also discussed in 2012-2013. No decision was made, unfortunately.
If someone fears that increasing the vacuum ring buffer will decrease
transaction performance, then why was it never exhaustively tested?
I have no evidence that transactions suffer from autovacuum improved in
this way. Perhaps my testing was not precise enough.
People, let's collect more results! Please test this one-line change
with a load of your choice and share the results!
If improving autovacuum doesn't hurt performance much, then why should
we live with bloated databases?
If such an improvement really does hurt performance, then that should be
documented in messages on pgsql-hackers, and a comment should be put
into src/backend/storage/buffer/freelist.c .
And possibly the community will decide that it should be a GUC variable:
- if one prefers to keep the database unbloated, one could increase the
vacuum ring buffer,
- otherwise, just leave it at the "safe-but-slow" default.
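If it does become a GUC, it could look like an ordinary postgresql.conf
setting; the name below is purely hypothetical, shown only to illustrate
the idea:

```
# Hypothetical GUC (does not exist today): size of the ring buffer used
# by VACUUM and autovacuum workers.
#vacuum_ring_buffer_size = 256kB    # "safe-but-slow" default
#vacuum_ring_buffer_size = 16MB     # for those who prefer unbloated tables
```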
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company
Sent via pgsql-hackers mailing list (email@example.com)