On 25 Oct 2014, at 4:31, Jim Nasby wrote:
> Please don't top-post.
>
> On 10/24/14, 3:40 AM, Borodin Vladimir wrote:
>> I have taken some backtraces (they are attached to the letter) of two
>> processes with such command:
>> pid=17981; while true; do date; gdb -batch -ex bt
>> /usr/pgsql-9.4/bin/postgres $pid; echo; echo; echo; echo; sleep 0.1; done
>>
>> Process 17981 was holding the lock for a long time -
>> http://pastie.org/9671931.
>> And process 13886 was waiting for the lock (at a different time and with a
>> different blocker, actually, but I don’t think that really matters) -
>> http://pastie.org/9671939.
>>
>> As far as I can see, 17981 is actually waiting for an LWLock on
>> BufFreelistLock in the StrategyGetBuffer function, freelist.c:134, while
>> holding an exclusive lock on the relation. I will try increasing
>> NUM_BUFFER_PARTITIONS (it also gave us a performance boost under read-only
>> load) and will post the result in this thread.
>
> BufFreelistLock becomes very contended when shared buffers are under a lot of
> pressure.
>
> Here's what I believe is happening:
>
> If RelationGetBufferForTuple() decides it needs to extend, this happens:
> LockRelationForExtension(relation, ExclusiveLock);
> buffer = ReadBufferBI(relation, P_NEW, bistate);
>
> Assuming bistate is NULL (I didn't check the bulk case), ReadBufferBI() ends
> up at ReadBuffer_common(), which calls BufferAlloc(). In the normal case,
> BufferAlloc() won't find the necessary buffer, so it will call
> StrategyGetBuffer(), which will end up getting the freelist lock. Currently
> the free list is normally empty, which means we now need to run the clock
> sweep to find a victim buffer. The clock sweep will keep running until it
> finds a buffer that is not pinned and has usage_count = 0. If shared buffers
> are under heavy pressure, you can have a huge number of them with usage_count
> = 5; with 100GB of shared buffers and an 8K BLCKSZ, that means you could have
> to check buffers *52 million* times (assuming you finally find a victim at
> the start of the 5th loop).
>
> Keep in mind that's all happening while you're holding both the extension
> lock *and the freelist lock*, which basically means no one else in the entire
> system can allocate a new buffer.
I’ll try the same workload with the recent patch from Andres Freund [0].
>
> This is one reason why a large shared_buffers setting is usually
> counter-productive. Experience with older versions is that setting it higher
> than about 8GB is more likely to hurt than to help. Newer versions are
> probably better, but I think you'll be hard-pressed to find a workload where
> 100GB makes sense. It might if your entire database fits in shared_buffers
> (though even then there are probably a number of O(n) or worse operations
> that will hurt you), but if your database is > shared_buffers you're probably
> in trouble.
>
> I suggest cutting shared_buffers *way* down. Old-school advice for this
> machine would be 8G (since 25% of 128G would be too big). You might be able
> to do better than 8G, but I recommend not even trying unless you've got a
> good way to test your performance.
>
> If you can test performance and find an optimal setting for shared_buffers,
> please do share your test data and findings. :)
Of course, it works well with shared_buffers <= 8GB. But we have seen that
under read-only load, when the data set fits in RAM but not in <= 8GB of
shared_buffers, we hit contention on the BufFreelistLock LWLock while moving
pages between shared buffers and the page cache. Increasing shared_buffers to
the size of the data set makes this read-only load up to 2.5x faster. So we
started testing a configuration with huge shared_buffers under write load, and
that’s why I started this thread.
Since StrategyGetBuffer() no longer takes the BufFreelistLock LWLock [1], I’ll
also re-run the read-only tests with small shared_buffers.
[0]
http://git.postgresql.org/pg/commitdiff/d72731a70450b5e7084991b9caa15cb58a2820df
[1]
http://git.postgresql.org/pg/commitdiff/1dcfb8da09c47d2a7502d1dfab06c8be4b6cf323
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Vladimir