On 25 Oct 2014, at 4:31, Jim Nasby <jim.na...@bluetreble.com> wrote:

> Please don't top-post.
> 
> On 10/24/14, 3:40 AM, Borodin Vladimir wrote:
>> I have taken some backtraces (attached to this message) of two 
>> processes with the following command:
>> pid=17981; while true; do date; gdb -batch -e back 
>> /usr/pgsql-9.4/bin/postgres $pid; echo; echo; echo; echo; sleep 0.1; done
>> 
>> Process 17981 was holding the lock for a long time - 
>> http://pastie.org/9671931.
>> And process 13886 was waiting for the lock (at a different time and blocked by 
>> a different process, actually, but I don't think that is really important) - 
>> http://pastie.org/9671939.
>> 
>> As far as I can see, 17981 is actually waiting on the BufFreelistLock LWLock in 
>> StrategyGetBuffer() (freelist.c:134) while holding an exclusive lock on the 
>> relation. I will try increasing NUM_BUFFER_PARTITIONS (it also gave us some 
>> performance boost on a read-only load) and will post the result in this thread.
> 
> BufFreelistLock becomes very contended when shared buffers are under a lot of 
> pressure.
> 
> Here's what I believe is happening:
> 
> If RelationGetBufferForTuple() decides it needs to extend, this happens:
> 
>     LockRelationForExtension(relation, ExclusiveLock);
>     buffer = ReadBufferBI(relation, P_NEW, bistate);
> 
> Assuming bistate is false (I didn't check the bulk case), ReadBufferBI() ends 
> up at ReadBuffer_common(), which calls BufferAlloc(). In the normal case, 
> BufferAlloc() won't find the necessary buffer, so it will call 
> StrategyGetBuffer(), which will end up taking the freelist lock. Currently 
> the free list is normally empty, which means we now need to run the clock 
> sweep to find a victim buffer. The clock sweep keeps running until it 
> finds a buffer that is not pinned and has usage_count = 0. If shared buffers 
> are under heavy pressure, you can have a huge number of them with usage_count 
> = 5, which means that with 100GB of shared buffers and an 8K BLCKSZ you could 
> have to check buffers *52 million* times (assuming you finally find a victim 
> at the start of the 5th loop).
> 
> Keep in mind that's all happening while you're holding both the extension 
> lock *and the freelist lock*, which basically means no one else in the entire 
> system can allocate a new buffer.
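
To make sure I follow the arithmetic, here is a small standalone C sketch of the 
clock sweep behaviour described above (only an illustration, not the actual 
freelist.c code; the struct and the names are made up). With every buffer unpinned 
at usage_count = 5, the hand has to make five full passes over the ~13.1 million 
buffers that 100GB of 8K pages gives you, about 65 million visits; the 52 million 
figure above corresponds to finding a victim at the start of the fifth pass.

/*
 * Standalone sketch of the clock sweep described above -- NOT the actual
 * freelist.c code, just an illustration of the worst case: every buffer
 * unpinned with usage_count = 5.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define NBUFFERS   13107200UL   /* 100GB of shared buffers / 8K BLCKSZ */
#define MAX_USAGE  5

typedef struct
{
    int refcount;               /* pinned if > 0 */
    int usage_count;            /* decremented on every sweep visit */
} FakeBufferDesc;

static FakeBufferDesc *buffers;
static size_t nextVictim = 0;   /* the "clock hand" */

/* Search for one victim buffer, counting how many buffers we had to visit. */
static size_t
clock_sweep(void)
{
    uint64_t visits = 0;

    for (;;)
    {
        FakeBufferDesc *buf = &buffers[nextVictim];

        nextVictim = (nextVictim + 1) % NBUFFERS;
        visits++;

        if (buf->refcount == 0)
        {
            if (buf->usage_count == 0)
            {
                printf("victim found after %llu visits\n",
                       (unsigned long long) visits);
                return (size_t) (buf - buffers);
            }
            buf->usage_count--; /* not a victim yet; give it another chance */
        }
    }
}

int
main(void)
{
    buffers = calloc(NBUFFERS, sizeof(FakeBufferDesc));
    if (buffers == NULL)
        return 1;

    for (size_t i = 0; i < NBUFFERS; i++)
        buffers[i].usage_count = MAX_USAGE;

    /* Five full passes over 13.1M buffers: prints 65536001 visits. */
    clock_sweep();

    free(buffers);
    return 0;
}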

I'll try the same workload with the recent patch from Andres Freund [0].

> 
> This is one reason why a large shared_buffers setting is usually 
> counter-productive. Experience with older versions is that setting it higher 
> than about 8GB is more likely to hurt than to help. Newer versions are 
> probably better, but I think you'll be hard-pressed to find a workload where 
> 100GB makes sense. It might if your entire database fits in shared_buffers 
> (though, even then there's probably a number of O(n) or worse operations that 
> will hurt you), but if your database is > shared_buffers you're probably in 
> trouble.
> 
> I suggest cutting shared_buffers *way* down. Old-school advice for this 
> machine would be 8G (since 25% of 128G would be too big). You might be able 
> to do better than 8G, but I recommend not even trying unless you've got a 
> good way to test your performance.
> 
> If you can test performance and find an optimal setting for shared_buffers, 
> please do share your test data and findings. :)

Of course, it works well with shared_buffers <= 8GB. But we have seen that on a 
read-only load, when the data set fits in RAM but not in <= 8GB of shared_buffers, 
we hit BufFreelistLock contention while moving pages between shared buffers and 
the page cache. Increasing shared_buffers to the size of the data set makes this 
read-only load up to 2.5x faster. So we started testing a configuration with huge 
shared_buffers under a write load, and that is why I started this thread.

Since StrategyGetBuffer() no longer uses the BufFreelistLock LWLock [1], I'll also 
re-run the tests with a read-only load and small shared_buffers.
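
By the way, my understanding of the idea behind [0] and [1] (a toy C11 
illustration only, not the committed PostgreSQL code, and the names here are 
mine) is that the clock hand can be advanced with an atomic fetch-and-add, so 
backends no longer serialize on a single LWLock just to claim a position:

/*
 * Toy C11 illustration (not the PostgreSQL code): claim clock-hand
 * positions with an atomic fetch-and-add instead of taking a single
 * LWLock, so concurrent backends do not serialize on one lock.
 */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define NBUFFERS 13107200u      /* 100GB of 8K buffers, as above */

static atomic_uint next_victim; /* shared clock hand */

static uint32_t
claim_next_victim(void)
{
    /* Wraparound bookkeeping and usage_count handling omitted. */
    return atomic_fetch_add(&next_victim, 1) % NBUFFERS;
}

int
main(void)
{
    for (int i = 0; i < 3; i++)
        printf("claimed buffer %u\n", (unsigned) claim_next_victim());
    return 0;
}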

[0] http://git.postgresql.org/pg/commitdiff/d72731a70450b5e7084991b9caa15cb58a2820df
[1] http://git.postgresql.org/pg/commitdiff/1dcfb8da09c47d2a7502d1dfab06c8be4b6cf323

> -- 
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Vladimir



