On Oct 4, 2010, at 1:23 PM, Heikki Linnakangas wrote: > On 04.10.2010 14:02, Hans-Jürgen Schönig wrote: >> it seems we have found a fairly nasty problem. >> imagine a long transaction which piles up XX.XXX of locks (count on >> pg_locks) inside the same transaction by doing some tasty savepoints, with >> hold cursors and so on. >> in this case we see that a normal count issued in a second database >> connection will take ages. in a practical case we did a plain seq_scan in >> connection 2. instead of 262 ms (cached case) it started to head north >> linearily with the number of locks taken by connection 1. in an extreme case >> it took around 1.5 hours or so (on XXX.XXX pg_locks entries). >> >> i tracked down the issue quickly and make the following profile (in 10k >> locks or so): >> >> Flat profile: >> >> Each sample counts as 0.01 seconds. >> % cumulative self self total >> time seconds seconds calls s/call s/call name >> 32.49 6.01 6.01 98809118 0.00 0.00 >> SimpleLruReadPage_ReadOnly >> 26.97 11.00 4.99 98837761 0.00 0.00 LWLockAcquire >> 21.89 15.05 4.05 98837761 0.00 0.00 LWLockRelease >> 8.70 16.66 1.61 98789370 0.00 0.00 SubTransGetParent >> 4.38 17.47 0.81 19748 0.00 0.00 >> SubTransGetTopmostTransaction >> 2.41 17.92 0.45 98851951 0.00 0.00 TransactionIdPrecedes >> 0.59 18.03 0.11 LWLockAssign >> 0.54 18.13 0.10 >> LWLockConditionalAcquire >> 0.46 18.21 0.09 19748 0.00 0.00 TransactionLogFetch >> 0.38 18.28 0.07 SimpleLruReadPage >> 0.27 18.33 0.05 SubTransSetParent >> 0.05 18.34 0.01 136778 0.00 0.00 AllocSetAlloc >> 0.05 18.35 0.01 42996 0.00 0.00 slot_deform_tuple >> 0.05 18.36 0.01 42660 0.00 0.00 >> TransactionIdIsCurrentTransactionId >> >> it seems we are running into a nice shared buffer / locking contention here >> and the number of calls explodes (this profiling infos is coming from a seq >> scan on a 500.000 rows table - 400 mb or so). > > That doesn't seem related to the lock manager. Is the long-running > transaction inserting a lot of tuples (by INSERT or UPDATE) to the same table > that the seqscan scans? With a lot of different subtransaction xids. That > profile looks like the seqscan is spending a lot of time swapping pg_subtrans > pages in and out of the slru buffers. > > Increasing NUM_SUBTRANS_BUFFERS should help. A more sophisticated solution > would be to allocate slru buffers (for clog and other slru caches as well) > dynamically from shared_buffers. That's been discussed before but no-one has > gotten around to it. > > -- > Heikki Linnakangas > EnterpriseDB http://www.enterprisedb.com
hello ... yeah, it seems this solves the problem. i had a closer look at the SQL trace and did some more profiling. this was the case. many thanks for the quick hint. hans -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers