Re: shared_buffers 8GB maximum

Vitaliy Garnashevich Mon, 19 Feb 2018 04:38:43 -0800

Yes. I don't know the exact reason, but reading a buffer from OS cache is quite a bit more expensive than just pinning a buffer already in the buffer_pool, about 5 times more expensive the last time I tested it, which was before Meltdown. (And just pinning a buffer which is already in the cache is already pretty expensive--about 15 times as expensive as reading the next tuple from an already-pinned buffer).

Thanks for the numbers. Just out of curiosity, do you happen to know how much more expensive compared to that a read from disk is? And also, how much the pinning can be slowed down, when having to iterate using the clock-sweep method over large shared_buffers?

I don't think that there is any reason to think that buffers_clean > buffers_checkpoint is a problem. In fact, you could argue that it is the way it was designed to work. Although the background writer does need to tell the checkpointer about every file it dirties, so it can be fsynced at the end of the checkpoint. The overhead of this was minimal in my testing.

The reason why I mentioned buffers_clean is because I was assuming that under "healthy" conditions, most writes should be done by the checkpointer, because, as was already mentioned, that's the most efficient way of writing (no duplicate writes of the same buffer, write optimizations, etc.). I was thinking about bgwriter as a way of reducing latency by avoiding the case when a backend has to write buffers by itself. So big numbers in buffers_clean and buffers_backend compared to buffers_checkpoint would mean that a lot of writes are done not by the checkpointer, and thus probably less efficiently than they could be. That might have resulted in IO writes being more random, and more IO writes done in general, because the same buffer can be written multiple times between checkpoints.

But buffers_backend > buffers_checkpoint could be a problem, especially if they are also much larger than buffers_clean. But the wrinkle here is that if you do bulk inserts or bulk updates (what about vacuums?), the backends by design write their own dirty buffers. So if you do those kinds of things, buffers_backend being large doesn't indicate much. There was a patch someplace a while ago to separate the counters of backend-intentional writes from backend-no-choice writes, but it never went anywhere.

We do daily manual vacuuming. Knowing what part of the total writes is accounted for by them would indeed be nice.

When looking at buffers_checkpoint/buffers_clean/buffers_backend, I was saving the numbers at intervals of several hours, knowing that there were no vacuums running at that time, and calculated the difference.
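
For illustration, the difference calculation described above could be sketched roughly like this in Python. The three field names match the counters discussed in this thread (columns of pg_stat_bgwriter at the time of writing); the class name, the function name and the sample numbers are made up for the example and are not measurements from our system.

```python
from dataclasses import dataclass

# The three cumulative write counters discussed in this thread.
COUNTERS = ("buffers_checkpoint", "buffers_clean", "buffers_backend")


@dataclass
class BgwriterSnapshot:
    """One saved reading of the counters (hypothetical helper type)."""
    buffers_checkpoint: int
    buffers_clean: int
    buffers_backend: int


def write_shares(before: BgwriterSnapshot, after: BgwriterSnapshot) -> dict:
    """Return each counter's share of the buffers written between two snapshots."""
    # The counters only grow, so the difference is the work done in the interval
    # (assuming the statistics were not reset in between).
    deltas = {name: getattr(after, name) - getattr(before, name) for name in COUNTERS}
    total = sum(deltas.values())
    if total == 0:
        # Nothing was written in the interval; avoid dividing by zero.
        return {name: 0.0 for name in COUNTERS}
    return {name: delta / total for name, delta in deltas.items()}


# Made-up sample values, several hours apart, with no vacuum running in between.
before = BgwriterSnapshot(buffers_checkpoint=1000, buffers_clean=200, buffers_backend=300)
after = BgwriterSnapshot(buffers_checkpoint=1600, buffers_clean=500, buffers_backend=400)
shares = write_shares(before, after)
```

With these sample values the checkpointer accounts for 60% of the buffers written in the interval, the bgwriter for 30% and the backends for 10%. As noted above, a large backend share is hard to interpret if bulk loads or vacuums ran during the interval.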

It is not clear to me that this is the best way to measure health. Did your response time go down? Did your throughput go up?

We have a mixed type of DB usage. There is an OLTP-like part with many small read/write transactions. Predictable latency does not matter in that case, but throughput does, because that is basically a background data loading job. Then there is an OLAP-like part where heavier report queries are being run. Then there are more background jobs which are a combination of both, which at first run long queries and then do lots of small inserts, thus pre-calculating some data for bigger reports.

After increasing shared_buffers 8GB -> 64GB, there was a 7% improvement in the run time of the background pre-calculating job (measured by running it several times in a row, with hot caches).

When we configured hugepages for the bigger shared_buffers, the additional improvement was around 3%.


Regards,
Vitaliy
