Re: Lock contention high

2021-11-29 Thread arjun shetty
1. How can I check which NUMA node a PostgreSQL process is fetching its
memory from?

2. Is this NUMA configuration better for PostgreSQL?
  vm.zone_reclaim_mode = 0
  numactl --interleave=all /init.d/ PostgreSQL start
  kernel.numa_balancing = 0





On Wednesday, November 17, 2021, arjun shetty 
wrote:

> Hi Askhil
>
> PostgreSQL uses lightweight locks (LWLocks) to synchronize and control
> access to buffer contents. A process acquires an LWLock in shared mode to
> read from a buffer and in exclusive mode to write to it. While holding an
> exclusive lock, a process prevents other processes from acquiring either a
> shared or an exclusive lock, whereas a shared lock can be held concurrently
> by several processes. The trouble starts when many processes try to acquire
> an exclusive lock on the same buffer content; as a result, LWLockAcquire
> shows up as the top hot function in profiling.
> The question here is whether LWLockAcquire is hot because of lock
> contention or because of CPU time spent inside the function itself (the
> top function in the profile).
>
> This can be analysed from the LWLock statistics log (“LwStatus”) with
> parameters like ex-acquire-count (exclusive mode), sh-acquire-count,
> block-count and spin-delay-count:
>
> Total lock acquisition requests = ex-acquire-count + sh-acquire-count
> Lock contention time % = block-count / total lock acquisition requests
>
> The contention percentage indicates whether most of the CPU time is spent
> inside the function itself rather than spinning/waiting for the lock.
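>
> As a rough cross-check (just a sketch of my own, not part of the LWLock
> statistics output), repeatedly sampling pg_stat_activity shows whether the
> active backends are actually waiting on LWLocks or are mostly on CPU
> (wait_event is NULL while a backend is running on CPU):
>
>   -- sample the wait events of currently active backends
>   SELECT wait_event_type, wait_event, count(*)
>   FROM pg_stat_activity
>   WHERE state = 'active'
>   GROUP BY wait_event_type, wait_event
>   ORDER BY count(*) DESC;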
>
> On Friday, November 12, 2021, Ashkil Dighin 
> wrote:
>
>> Hi
>> I suspect lock contention and performance issues with __int128, and I
>> would like to check the performance by forcibly disabling int128
>> (MAXALIGN 16 bytes) and using long long instead (MAXALIGN 8 bytes).
>> Is it possible to disable int128 in PostgreSQL?
>>
>> On Thursday, October 28, 2021, Andres Freund  wrote:
>>
>>> Hi,
>>>
>>> On October 27, 2021 2:44:56 PM PDT, Ashkil Dighin <
>>> ashkildighi...@gmail.com> wrote:
>>> >Hi,
>>> >Yes, lock contention is reduced with PostgreSQL v14.
>>> >Lock acquire reduced from 18% to 10%:
>>> >10.49%  postgres  postgres  [.] LWLockAcquire
>>> > 5.09%  postgres  postgres  [.] _bt_compare
>>> >
>>> >Can lock contention be reduced to 0-3%?
>>>
>>> Probably not, or at least not easily. Because of the atomic instructions
>>> the locking also includes  some other costs (e.g. cache misses, serializing
>>> store buffers,...).
>>>
>>> There's a good bit we can do to increase the cache efficiency around
>>> buffer headers, but it won't get us quite that low I'd guess.
>>>
>>>
>>> >pg_stat_activity showed LWLock waits as “BufferCounter” and “WalInsert”.
>>>
>>> Without knowing what proportion they have relative to each other and to
>>> non-waiting backends, that unfortunately doesn't help that much.
>>>
>>> Andres
>>>
>>> --
>>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>>
>>


Re: Need help identifying a periodic performance issue.

2021-11-29 Thread Robert Creager


> On Nov 24, 2021, at 4:15 PM, Justin Pryzby  wrote:
>
> This message originated outside your organization.
>
> On Wed, Nov 24, 2021 at 10:44:12PM +, Robert Creager wrote:
>> I forgot, I had reloaded postgres, but had not re-started our app, so the 
>> connections wouldn’t have that plan setting on them. Re-doing now.
>
> Are you sure?  GUC changes should be applied for existing sessions, right ?
>
> Would you send the logs surrounding the slow COPY ?
> Specifically including the autovacuum logs.

Here are the log lines for the 5 minutes leading up to the 2-minute COPY 
operation.  There is no vacuum activity; the previous autovacuum happened 20 
minutes earlier, on a different table.



>
>> We are at it again.  I have a DELETE operation that’s taking 48 minutes so 
>> far.
>
> Before, you had slow COPY due to FKs.  Now you have a slow DELETE, which you
> only alluded to before.

Yeah, I had not been able to reproduce it previously with logging/dtrace 
enabled, and this time I was able to look at the query plan as I saw it 
happening.

And we’ve run across another problem query, which is also hitting that ds3.blob 
table.

INFO Nov 25 05:30:05,787 [WorkLogger] | Still in progress after 30 minutes: 
[IomDriverWorker] SQL: SELECT * FROM ds3.s3_object_property WHERE (key = 
'x-amz-meta-o-spectra-backup-start-date' AND EXISTS (SELECT * FROM 
ds3.s3_object WHERE id = ds3.s3_object_property.object_id AND ((EXISTS (SELECT 
* FROM ds3.bucket WHERE id = ds3.s3_object.bucket_id AND (name LIKE 
'Spectra%')) AND NOT EXISTS (SELECT * FROM ds3.blob WHERE object_id = 
ds3.s3_object.id AND (EXISTS (SELECT * FROM ds3.job...  
(MonitoredWorkManager$WorkLogger.run:84)
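
The log line truncates the statement; while it is still running, its text and 
elapsed time can also be pulled from pg_stat_activity (the query column there 
is itself limited by track_activity_query_size), e.g. with a sketch like:

  -- show active statements that have been running for more than 30 minutes
  SELECT pid, now() - query_start AS runtime, query
  FROM pg_stat_activity
  WHERE state = 'active'
    AND now() - query_start > interval '30 minutes'
  ORDER BY runtime DESC;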

>
>> So how do we avoid this query plan? Do we need to start doing explicit 
>> analyzes after every delete?
>
> If your DELETE is deleting the entire table, then I think you should VACUUM
> anyway (or else the next inserts will bloat the table).

We’re not necessarily deleting the entire table; we don’t know, it’s a 
customer-driven thing.  In general, the table used for the COPY will not see a 
lot of deletes; this one is from the test group, which is deleting a lot of 
data.
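
If we do end up adding an explicit step after large deletes, I assume it would 
be something like this on the affected table (sketch only; using ds3.blob as 
the example since that is the table being hit):

  -- reclaim dead rows and refresh planner statistics after a large delete
  VACUUM (ANALYZE) ds3.blob;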

>
> But first, I believe Thomas was suggesting to put plan_cache_mode back to its
> default and (for testing purposes) try issuing DISCARD PLANS.

Ok, I’ll do that now and see what happens with the COPY.
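
In other words, something like this (just a sketch):

  -- put plan_cache_mode back to its default and reload the configuration
  ALTER SYSTEM RESET plan_cache_mode;
  SELECT pg_reload_conf();

  -- and in the application’s sessions, before the COPY batches (testing only):
  DISCARD PLANS;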

>
> On Fri, Nov 19, 2021 at 10:08:02AM +1300, Thomas Munro wrote:
>> Just to understand what's going on, it'd be interesting to know if the
>> problem goes away if you *just* inject the DISCARD PLANS statement
>> before running your COPYs, but if that doesn't help it'd also be
>> interesting to know what happens if you ANALYZE each table after each
>> COPY.  Are you running any explicit ANALYZE commands?  How long do
>> your sessions/connections live for?
>
> --
> Justin
>



Attachment: lines.log.bz2