Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
On Wed, 2008-03-12 at 20:13 -0400, Bruce Momjian wrote:
> Is this a TODO? Tom's reply was:

The general topic, yes. The caveats still apply.

> > Nonsense. Main transaction exit also takes an exclusive lock, and is
> > far more likely to be exercised in typical workloads than a
> > subtransaction abort.
> >
> > In any case: there has still not been any evidence presented by anyone
> > that optimizing XidCacheRemoveRunningXids will help one bit. Given the
> > difficulty of measuring any benefit from the last couple of
> > optimizations in this general area, I'm thinking that such evidence
> > will be hard to come by. And we have got way more than enough on our
> > plates already. Can we let go of this for 8.3, please?
>
> ---
>
> Simon Riggs wrote:
> > On Wed, 2006-09-13 at 21:45 -0400, Tom Lane wrote:
> >
> > > Anyway, given that there's this one nonobvious gotcha, there might be
> > > others. My recommendation is that we take this off the open-items list
> > > for 8.2 and revisit it in the 8.3 cycle when there's more time.
> >
> > Well, its still 8.3 just...
> >
> > As discussed in the other thread "Final Thoughts for 8.3 on LWLocking
> > and Scalability", XidCacheRemoveRunningXids() is now the only holder of
> > an X lock during normal processing, so I would like to remove it.
> > Here's how:
> >
> > Currently, we take the lock, remove the subxact and then shuffle down
> > all the other subxactIds so that the subxact cache is contiguous.
> >
> > I propose that we simply zero out the subxact entry without re-arranging
> > the cache; this will be atomic, so we need not acquire an X lock. We
> > then increment ndeletedxids. When we enter a new subxact into the cache,
> > if ndeletedxids > 0 we scan the cache to find an InvalidTransactionId
> > that we can use, then decrement ndeletedxids. So ndeletedxids is just a
> > hint, not an absolute requirement. nxids then becomes the number of
> > cache entries and never goes down until EOXact. The subxact cache is no
> > longer in order, but then it doesn't need to be either.
> >
> > When we take a snapshot we will end up taking a copy of zeroed cache
> > entries, so the snapshots will be slightly larger than previously.
> > Though still no larger than the max. The size reduction was not so large
> > as to make a significant difference across the whole array, so
> > scalability is the main issue to resolve.
> >
> > The snapshots will be valid with no change, since InvalidTransactionId
> > will never match against any recorded Xid.
> >
> > I would also like to make the size of the subxact cache configurable
> > with a parameter such as subtransaction_cache_size = 64 (default), valid
> > range 4-256.
> >
> > --
> > Simon Riggs
> > 2ndQuadrant http://www.2ndQuadrant.com

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Is this a TODO? Tom's reply was:

> Nonsense. Main transaction exit also takes an exclusive lock, and is
> far more likely to be exercised in typical workloads than a
> subtransaction abort.
>
> In any case: there has still not been any evidence presented by anyone
> that optimizing XidCacheRemoveRunningXids will help one bit. Given the
> difficulty of measuring any benefit from the last couple of
> optimizations in this general area, I'm thinking that such evidence
> will be hard to come by. And we have got way more than enough on our
> plates already. Can we let go of this for 8.3, please?

---

Simon Riggs wrote:
> On Wed, 2006-09-13 at 21:45 -0400, Tom Lane wrote:
>
> > Anyway, given that there's this one nonobvious gotcha, there might be
> > others. My recommendation is that we take this off the open-items list
> > for 8.2 and revisit it in the 8.3 cycle when there's more time.
>
> Well, its still 8.3 just...
>
> As discussed in the other thread "Final Thoughts for 8.3 on LWLocking
> and Scalability", XidCacheRemoveRunningXids() is now the only holder of
> an X lock during normal processing, so I would like to remove it.
> Here's how:
>
> Currently, we take the lock, remove the subxact and then shuffle down
> all the other subxactIds so that the subxact cache is contiguous.
>
> I propose that we simply zero out the subxact entry without re-arranging
> the cache; this will be atomic, so we need not acquire an X lock. We
> then increment ndeletedxids. When we enter a new subxact into the cache,
> if ndeletedxids > 0 we scan the cache to find an InvalidTransactionId
> that we can use, then decrement ndeletedxids. So ndeletedxids is just a
> hint, not an absolute requirement. nxids then becomes the number of
> cache entries and never goes down until EOXact. The subxact cache is no
> longer in order, but then it doesn't need to be either.
>
> When we take a snapshot we will end up taking a copy of zeroed cache
> entries, so the snapshots will be slightly larger than previously.
> Though still no larger than the max. The size reduction was not so large
> as to make a significant difference across the whole array, so
> scalability is the main issue to resolve.
>
> The snapshots will be valid with no change, since InvalidTransactionId
> will never match against any recorded Xid.
>
> I would also like to make the size of the subxact cache configurable
> with a parameter such as subtransaction_cache_size = 64 (default), valid
> range 4-256.
>
> --
> Simon Riggs
> 2ndQuadrant http://www.2ndQuadrant.com

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
On Tue, 2007-09-11 at 09:58 -0400, Tom Lane wrote:
> Can we let go of this for 8.3, please?

OK, we've moved forward, so it's a good place to break.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Simon Riggs <[EMAIL PROTECTED]> writes: > As discussed in the other thread "Final Thoughts for 8.3 on LWLocking > and Scalability", XidCacheRemoveRunningXids() is now the only holder of > an X lock during normal processing, Nonsense. Main transaction exit also takes an exclusive lock, and is far more likely to be exercised in typical workloads than a subtransaction abort. In any case: there has still not been any evidence presented by anyone that optimizing XidCacheRemoveRunningXids will help one bit. Given the difficulty of measuring any benefit from the last couple of optimizations in this general area, I'm thinking that such evidence will be hard to come by. And we have got way more than enough on our plates already. Can we let go of this for 8.3, please? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
On Wed, 2006-09-13 at 21:45 -0400, Tom Lane wrote: > Anyway, given that there's this one nonobvious gotcha, there might be > others. My recommendation is that we take this off the open-items list > for 8.2 and revisit it in the 8.3 cycle when there's more time. Well, its still 8.3 just... As discussed in the other thread "Final Thoughts for 8.3 on LWLocking and Scalability", XidCacheRemoveRunningXids() is now the only holder of an X lock during normal processing, so I would like to remove it. Here's how: Currently, we take the lock, remove the subxact and then shuffle down all the other subxactIds so that the subxact cache is contiguous. I propose that we simply zero out the subxact entry without re-arranging the cache; this will be atomic, so we need not acquire an X lock. We then increment ndeletedxids. When we enter a new subxact into the cache, if ndeletedxids > 0 we scan the cache to find an InvalidTransactionId that we can use, then decrement ndeletedxids. So ndeletedxids is just a hint, not an absolute requirement. nxids then becomes the number of cache entries and never goes down until EOXact. The subxact cache is no longer in order, but then it doesn't need to be either. When we take a snapshot we will end up taking a copy of zeroed cache entries, so the snapshots will be slightly larger than previously. Though still no larger than the max. The size reduction was not so large as to make a significant difference across the whole array, so scalability is the main issue to resolve. The snapshots will be valid with no change, since InvalidTransactionId will never match against any recorded Xid. I would also like to make the size of the subxact cache configurable with a parameter such as subtransaction_cache_size = 64 (default), valid range 4-256. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
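A minimal sketch of the zero-in-place scheme proposed above, for readers who want to see the bookkeeping spelled out. It is not the PostgreSQL implementation: the struct and function names are hypothetical, the cache size is fixed at the proposed default of 64, and the sketch is single-threaded, so it says nothing about the memory-ordering questions that a lock-free version in shared memory would have to answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define SUBXID_CACHE_SIZE 64    /* proposed default; hypothetical constant name */

/* Hypothetical stand-in for the per-backend subxid cache. */
typedef struct SubXidCache
{
    int           nxids;        /* slots handed out; never shrinks before end of xact */
    int           ndeletedxids; /* hint only: zeroed slots that may be reusable */
    TransactionId xids[SUBXID_CACHE_SIZE];
} SubXidCache;

/* Remove an entry by zeroing its slot; no entries are shuffled down. */
bool
subxid_cache_remove(SubXidCache *cache, TransactionId xid)
{
    assert(xid != InvalidTransactionId);
    for (int i = 0; i < cache->nxids; i++)
    {
        if (cache->xids[i] == xid)
        {
            cache->xids[i] = InvalidTransactionId;  /* single-word store */
            cache->ndeletedxids++;
            return true;
        }
    }
    return false;               /* not cached */
}

/* Add an entry, reusing a zeroed slot when the hint says one may exist. */
bool
subxid_cache_add(SubXidCache *cache, TransactionId xid)
{
    assert(xid != InvalidTransactionId);
    if (cache->ndeletedxids > 0)
    {
        for (int i = 0; i < cache->nxids; i++)
        {
            if (cache->xids[i] == InvalidTransactionId)
            {
                cache->xids[i] = xid;
                cache->ndeletedxids--;
                return true;
            }
        }
        cache->ndeletedxids = 0;    /* the hint was stale; fall through and append */
    }
    if (cache->nxids >= SUBXID_CACHE_SIZE)
        return false;               /* cache is full: the overflow case */
    cache->xids[cache->nxids++] = xid;
    return true;
}
```

A reader copying this array sees zeroed slots as InvalidTransactionId, which never equals a real XID, so lookups stay correct while the copied array is slightly larger than a compacted one.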
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Gregory Stark <[EMAIL PROTECTED]> writes: > Tom Lane <[EMAIL PROTECTED]> writes: >> --- and because the entries are surely added in increasing XID order, >> such an array could be binary-searched. > If they're only added if they write to disk then isn't it possible to add them > out of order? Start a child transaction, start a child of that one and write > to disk, then exit the grandchild and write to disk in the first > child? No, because we enforce child XID > parent XID. In the case above, the child xact would be given an XID when the grandchild needs one --- see recursion in AssignSubTransactionId(). The actually slightly shaky assumption above is that children of the same parent xact must subcommit in numerical order ... but as long as we have strict nesting of subxacts I think this must be so. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
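The ordering argument above can be illustrated with a small sketch of recursive XID assignment. This is only a model of the idea attributed to AssignSubTransactionId(): the struct, the function name, and the global counter are hypothetical, and XID wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical per-(sub)transaction state: a link to the parent and an XID. */
typedef struct XactState
{
    struct XactState *parent;   /* NULL for the top-level transaction */
    TransactionId     xid;      /* InvalidTransactionId until assigned */
} XactState;

/* Hypothetical XID counter standing in for the shared nextXid. */
TransactionId next_xid = 100;

/*
 * Assign an XID to s, first making sure every ancestor has one.
 * Because ancestors draw from the counter first, child XID > parent XID.
 */
TransactionId
assign_xid(XactState *s)
{
    assert(s != NULL);
    if (s->xid != InvalidTransactionId)
        return s->xid;              /* already assigned */
    if (s->parent != NULL)
        (void) assign_xid(s->parent);   /* recurse up the nesting chain */
    s->xid = next_xid++;
    return s->xid;
}
```

In Greg's scenario the grandchild asks for an XID first, but the recursion hands one to the child before the grandchild, so the child still gets the smaller value.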
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Tom Lane <[EMAIL PROTECTED]> writes: > --- and because the entries are surely added in increasing XID order, > such an array could be binary-searched. If they're only added if they write to disk then isn't it possible to add them out of order? Start a child transaction, start a child of that one and write to disk, then exit the grandchild and write to disk in the first child? I'm just going on your description, I'm not familiar with this part of the code at all. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> I think Theo's problem is probably somewhere else, too --- apparently >> it's not so much that TransactionIdIsCurrentTransactionId takes a long >> time as that something is calling it lots of times with no check for >> interrupt. > Could it be something like heap_lock_tuple? It calls MultiXactIdWait, > which calls GetMultXactIdMembers and TransactionIdIsCurrentTransactionId > on each member. (heap_update and heap_delete do the same thing). I > must admit I didn't read Theo's description on his scenario though. He shows HeapTupleSatisfiesSnapshot as the next thing down the call stack, so those scenarios don't seem quite right. I'm wondering about a CHECK_FOR_INTERRUPTS-free loop in either plperl or trigger handling, myself. Anyway, I was thinking some more about Theo's original suggestion that the linked-list representation of childXids was too inefficient. I'm disinclined to use a hash as he suggests, but it strikes me that we could very easily change the list into a dynamically extended array --- and because the entries are surely added in increasing XID order, such an array could be binary-searched. This wouldn't be a win for very small numbers of child XIDs, but for large numbers it would. OTOH, there are probably enough other inefficiencies in handling large numbers of subxact XIDs that speeding up TransactionIdIsCurrentTransactionId might be a useless exercise. It would be good to profile a test case before spending much effort here. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
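As an illustration of the array idea in the message above, here is a sketch of a dynamically extended, append-only XID array with a binary-search lookup. The type and function names are hypothetical. It also assumes plain unsigned comparison of XIDs, which ignores wraparound; PostgreSQL itself compares XIDs modulo 2^32, so a real version would need that comparison instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/* Hypothetical growable array of child XIDs, kept in increasing order. */
typedef struct ChildXidArray
{
    TransactionId *xids;
    size_t         n;       /* entries in use */
    size_t         cap;     /* entries allocated */
} ChildXidArray;

/* Append xid; the caller must add XIDs in strictly increasing order. */
bool
child_xids_append(ChildXidArray *a, TransactionId xid)
{
    assert(a->n == 0 || a->xids[a->n - 1] < xid);
    if (a->n == a->cap)
    {
        size_t         newcap = a->cap ? a->cap * 2 : 8;
        TransactionId *p = realloc(a->xids, newcap * sizeof(TransactionId));

        if (p == NULL)
            return false;   /* out of memory; array is unchanged */
        a->xids = p;
        a->cap = newcap;
    }
    a->xids[a->n++] = xid;
    return true;
}

/* Binary search: O(log n) instead of walking a linked list. */
bool
child_xids_contains(const ChildXidArray *a, TransactionId xid)
{
    size_t lo = 0;
    size_t hi = a->n;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (a->xids[mid] == xid)
            return true;
        if (a->xids[mid] < xid)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}
```

With a few million child XIDs, each lookup costs about 22 comparisons rather than a scan of the whole list, which is the case where the message expects a win.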
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Tom Lane wrote: > I wrote: > > I see a bug though, which is that RecordSubTransactionAbort() calls > > GetCurrentTransactionId() before having verified that it needs to do > > anything. This means that we'll generate and then discard an XID > > uselessly in a failed subxact that didn't touch disk. > > Well, it would be a bug except that RecordSubTransactionAbort isn't > called unless the current subxact has an XID. Perhaps a comment would > be appropriate but there's nothing to fix here. > > I think Theo's problem is probably somewhere else, too --- apparently > it's not so much that TransactionIdIsCurrentTransactionId takes a long > time as that something is calling it lots of times with no check for > interrupt. Could it be something like heap_lock_tuple? It calls MultiXactIdWait, which calls GetMultXactIdMembers and TransactionIdIsCurrentTransactionId on each member. (heap_update and heap_delete do the same thing). I must admit I didn't read Theo's description on his scenario though. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
I wrote: > I see a bug though, which is that RecordSubTransactionAbort() calls > GetCurrentTransactionId() before having verified that it needs to do > anything. This means that we'll generate and then discard an XID > uselessly in a failed subxact that didn't touch disk. Well, it would be a bug except that RecordSubTransactionAbort isn't called unless the current subxact has an XID. Perhaps a comment would be appropriate but there's nothing to fix here. I think Theo's problem is probably somewhere else, too --- apparently it's not so much that TransactionIdIsCurrentTransactionId takes a long time as that something is calling it lots of times with no check for interrupt. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
I wrote: > Yeah, I was just looking at that. Removing useless entries from the > child-xid list would presumably help him. Considering we're not even > formally in beta yet, I'm probably being too conservative to recommend > we not touch it. Actually ... wait a minute. We do not assign an XID to a subtransaction at all unless it writes a tuple to disk (see GetCurrentTransactionId and its callers). So this whole "optimization" idea is redundant. I see a bug though, which is that RecordSubTransactionAbort() calls GetCurrentTransactionId() before having verified that it needs to do anything. This means that we'll generate and then discard an XID uselessly in a failed subxact that didn't touch disk. Worth fixing, but it doesn't look like this is Theo's problem. Unless I'm missing something, Theo's problem must involve having done tuple updates in 4.6M different subtransactions. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Gregory Stark <[EMAIL PROTECTED]> writes: > Tom Lane <[EMAIL PROTECTED]> writes: >> Anyway, given that there's this one nonobvious gotcha, there might be >> others. My recommendation is that we take this off the open-items list >> for 8.2 and revisit it in the 8.3 cycle when there's more time. > I wonder if Theo's recent reported problem with 4.3M child xids changes the > calculus on this. Yeah, I was just looking at that. Removing useless entries from the child-xid list would presumably help him. Considering we're not even formally in beta yet, I'm probably being too conservative to recommend we not touch it. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Tom Lane <[EMAIL PROTECTED]> writes:
> Anyway, given that there's this one nonobvious gotcha, there might be
> others. My recommendation is that we take this off the open-items list
> for 8.2 and revisit it in the 8.3 cycle when there's more time.

I wonder if Theo's recently reported problem with 4.3M child xids changes the calculus on this.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
I wrote: > ... it seems like in the > case where RecordSubTransactionCommit detects that the subxact has not > stored its XID anywhere, we could immediately remove the XID from > the PGPROC array, just as if it had aborted. This would avoid chewing > subxid slots for cases such as exception blocks in plpgsql that are > not modifying the database, but just catching computational errors. (and later realized that Alvaro had had the same idea awhile back, but I don't have his message at hand). I looked into this a bit more; it seems like basically it should only take addition of else XidCacheRemoveRunningXids(xid, 0, NULL); to the bottom of RecordSubTransactionCommit(), plus suitable adjustment of the comments in both routines. However, there's a problem: if we delete a second-level subxact's XID from PGPROC, and later its upper subtransaction aborts, XidCacheRemoveRunningXids will emit scary warnings when it doesn't find the sub-subxact in PGPROC. This could doubtless be fixed with sufficient jiggery-pokery --- simply removing the debug warnings would be a brute-force answer, but I'd like to find something a bit less brute-force. Maybe drop the sub-subxact from its parent's list immediately, instead of carrying it forward? Anyway, given that there's this one nonobvious gotcha, there might be others. My recommendation is that we take this off the open-items list for 8.2 and revisit it in the 8.3 cycle when there's more time. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> OTOH I think we only need to store live Xids and those finished that
> wrote a WAL record; we can drop subaborted and subcommitted if they
> didn't.

While reviewing this thread, I see Alvaro already had the idea I just came to...

regards, tom lane
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes: > I added a subxid array to Snapshot and running subxids are gathered from > PGPROC->subxids cache. There are two overflowed case; any of PGPROC->subxids > are overflowed or the number of total subxids exceeds pre-allocated buffers. > If overflowed, we cannot avoid to call SubTransGetTopmostTransaction. Applied after some editorialization (you really need to pay more attention to keeping comments in sync with code ;-)) I cannot measure any consistent speed difference in plain pgbench scenarios with and without the patch, so at least as a rough approximation the extra cycles in GetSnapshotData aren't hurting. And I confirm that the test case you posted before no longer exhibits a context-swap storm. This change makes it even more obvious than before that we really want to stay out of the subxids-overflowed regime. I don't especially want to make those arrays even bigger, but I wonder if there isn't more we can do to use them efficiently. In particular, it seems like in the case where RecordSubTransactionCommit detects that the subxact has not stored its XID anywhere, we could immediately remove the XID from the PGPROC array, just as if it had aborted. This would avoid chewing subxid slots for cases such as exception blocks in plpgsql that are not modifying the database, but just catching computational errors. Comments? regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Bruce Momjian <[EMAIL PROTECTED]> wrote: > Is there anything to do for 8.2 here? I'm working on Tom's idea. It is not a feature and does not change the on-disk-structures, so I hope it meet the 8.2 deadline... Tom Lane <[EMAIL PROTECTED]> wrote: > I'm wondering about doing something similar to what > TransactionIdIsInProgress does, ie, make use of the PGPROC lists > of live subtransactions. Suppose that GetSnapshotData copies not > only top xids but live subxids into the snapshot, and adds a flag > indicating whether the subxids are complete (ie, none of the subxid > lists have overflowed). Then if the flag is set, tqual.c doesn't > need to do SubTransGetTopmostTransaction() before searching the > list. I added a subxid array to Snapshot and running subxids are gathered from PGPROC->subxids cache. There are two overflowed case; any of PGPROC->subxids are overflowed or the number of total subxids exceeds pre-allocated buffers. If overflowed, we cannot avoid to call SubTransGetTopmostTransaction. I tested the patch and did not see any context switch storm which comes from pg_subtrans, but there may be some bugs in the visibility checking. It would be very nice if you could review or test the patch. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center snapshot_subtrans.patch Description: Binary data ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
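The snapshot change described above can be sketched roughly as follows. This is an illustration under assumptions, not the contents of snapshot_subtrans.patch: the struct layout, the fixed capacity, and all function names are hypothetical, and the get_topmost callback merely stands in for SubTransGetTopmostTransaction().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define MAX_SNAPSHOT_SUBXIDS 8  /* hypothetical pre-allocated capacity */

/* Hypothetical snapshot carrying running subxids plus an overflow flag. */
typedef struct SnapshotSketch
{
    const TransactionId *xip;       /* running top-level XIDs */
    int                  xcnt;
    TransactionId        subxip[MAX_SNAPSHOT_SUBXIDS];  /* running subxids */
    int                  subxcnt;
    bool                 suboverflowed; /* true if subxip may be incomplete */
} SnapshotSketch;

/*
 * Copy one backend's cached subxids into the snapshot.  Either overflow
 * case (the backend's own cache overflowed, or the snapshot buffer is too
 * small) marks the snapshot as overflowed.
 */
void
snapshot_collect_subxids(SnapshotSketch *snap, const TransactionId *subxids,
                         int n, bool backend_overflowed)
{
    if (snap->suboverflowed)
        return;
    if (backend_overflowed || snap->subxcnt + n > MAX_SNAPSHOT_SUBXIDS)
    {
        snap->suboverflowed = true;
        return;
    }
    for (int i = 0; i < n; i++)
        snap->subxip[snap->subxcnt++] = subxids[i];
}

static bool
xid_in_array(const TransactionId *a, int n, TransactionId xid)
{
    for (int i = 0; i < n; i++)
        if (a[i] == xid)
            return true;
    return false;
}

/*
 * Is xid running according to the snapshot?  Only the overflowed case
 * needs the (expensive) topmost-parent lookup.
 */
bool
xid_in_snapshot(const SnapshotSketch *snap, TransactionId xid,
                TransactionId (*get_topmost)(TransactionId))
{
    if (!snap->suboverflowed)
        return xid_in_array(snap->xip, snap->xcnt, xid) ||
               xid_in_array(snap->subxip, snap->subxcnt, xid);
    assert(get_topmost != NULL);
    return xid_in_array(snap->xip, snap->xcnt, get_topmost(xid));
}
```

As long as no cache has overflowed, the lookup never calls get_topmost, which is the path that avoids touching pg_subtrans.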
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Is there anything to do for 8.2 here? --- ITAGAKI Takahiro wrote: > This is an additional information. > > I wrote: > > If we want to resolve the probmen fundamentally, we might have to > > improve SubTrans using a better buffer management algorithm or so. > > The above is maybe wrong. I checked each lwlock of pg_subtrans's buffers. > All lwlocks are uniformly acquired and I could not see any differences > among buffers. So the cause seems not to be a buffer management algorithm, > but just a lack of SLRU buffer pages. > > NUM_SUBTRANS_BUFFERS is defined as 32 in HEAD. If we increase it, > we can avoid the old transaction problem for a certain time. > However, it doesn't help much on high-load -- for example, on a workload > with 2000 tps, we will use up 1000 pg_subtrans pages in 15 minites. > I suppose it is not enough for online and batch/maintenance mix. > > Also, the simple scanning way in SLRU will likely cause another performance > issue when we highly increase the number of buffers. A sequential scanning > is used in SLRU, so it will not work well against many buffers. > > > I hope some cares in upper layer, snapshot, hitbits or something, > being discussed in the recent thread. > > Regards, > --- > ITAGAKI Takahiro > NTT Open Source Software Center > > > > ---(end of broadcast)--- > TIP 1: if posting/reading through Usenet, please send an appropriate >subscribe-nomail command to [EMAIL PROTECTED] so that your >message can get through to the mailing list cleanly -- Bruce Momjian [EMAIL PROTECTED] EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Some additional information.

I wrote:
> If we want to resolve the problem fundamentally, we might have to
> improve SubTrans using a better buffer management algorithm or so.

The above may be wrong. I checked each lwlock of pg_subtrans's buffers. All lwlocks are acquired uniformly, and I could not see any differences among buffers. So the cause seems to be not the buffer management algorithm, but simply a shortage of SLRU buffer pages.

NUM_SUBTRANS_BUFFERS is defined as 32 in HEAD. If we increase it, we can avoid the old transaction problem for a certain time. However, it doesn't help much under high load -- for example, on a workload with 2000 tps, we will use up 1000 pg_subtrans pages in 15 minutes. I suppose that is not enough for a mix of online and batch/maintenance work.

Also, the simple scanning in SLRU will likely cause another performance issue if we greatly increase the number of buffers. SLRU uses a sequential scan, so it will not work well with many buffers.

I hope for some care in an upper layer (snapshots, hint bits, or something similar), as is being discussed in the recent thread.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
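The page-consumption figure above can be checked with a back-of-the-envelope calculation. The sketch assumes the default 8192-byte block size, one 4-byte pg_subtrans entry per XID (so 2048 XIDs per page), and one XID consumed per transaction; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ 8192                                 /* assumed default block size */
#define XACTS_PER_PAGE (BLCKSZ / sizeof(uint32_t))  /* 2048 entries per page */

/* Pages of pg_subtrans covered by `seconds` of work at `tps`, rounded up. */
uint64_t
subtrans_pages_needed(uint64_t tps, uint64_t seconds)
{
    assert(seconds == 0 || tps <= UINT64_MAX / seconds);    /* no overflow */
    uint64_t xids = tps * seconds;

    return (xids + XACTS_PER_PAGE - 1) / XACTS_PER_PAGE;
}
```

At 2000 tps for 15 minutes (900 seconds) this gives 1,800,000 XIDs, or 879 pages, which is in line with the "1000 pages" estimate and far more than 32 buffers can hold.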
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Alvaro Herrera <[EMAIL PROTECTED]> writes: > I was thinking at what time was the most appropiate to insert or remove > an Xid from the cache. We can do without any excl-locking because 1) we > already assume the storing of an Xid to be atomic, and 2) no one can be > interested in querying for an Xid before the originating transaction has > had the chance to write a tuple with that Xid anyway. Actually ... that fails if GetSnapshotData is going to copy subtrans XIDs. So this area needs more thought. > On the third hand, are we going to sh-acquire the ProcArray lock while a > GetSnapshotData copies all subxact Xids of all running transactions? > ProcArrayLock will become more of a contention point than it already is. Yeah, but sharelock is better than exclusive lock ... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Tom Lane wrote: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > Tom Lane wrote: > >> I'm wondering about doing something similar to what > >> TransactionIdIsInProgress does, ie, make use of the PGPROC lists > >> of live subtransactions. > > > Well, that sounds awfully more expensive than setting > > local-to-my-database Xmins as well as global (all databases) Xmins :-) > > Only when you've got a lot of subtransactions. The point of this > discussion is to optimize for the few-or-none case. In any case, > the problem with the local/global bit was that you'd be expending > foreground-process cycles without any foreground-process return. > Being able to use a snapshot without consulting pg_subtrans will > certainly buy back some cycles... I can buy that. Some idle thoughts: I was thinking at what time was the most appropiate to insert or remove an Xid from the cache. We can do without any excl-locking because 1) we already assume the storing of an Xid to be atomic, and 2) no one can be interested in querying for an Xid before the originating transaction has had the chance to write a tuple with that Xid anyway. OTOH I think we only need to store live Xids and those finished that wrote a WAL record; we can drop subaborted and subcommitted if they didn't. On the third hand, are we going to sh-acquire the ProcArray lock while a GetSnapshotData copies all subxact Xids of all running transactions? ProcArrayLock will become more of a contention point than it already is. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> I'm wondering about doing something similar to what >> TransactionIdIsInProgress does, ie, make use of the PGPROC lists >> of live subtransactions. > Well, that sounds awfully more expensive than setting > local-to-my-database Xmins as well as global (all databases) Xmins :-) Only when you've got a lot of subtransactions. The point of this discussion is to optimize for the few-or-none case. In any case, the problem with the local/global bit was that you'd be expending foreground-process cycles without any foreground-process return. Being able to use a snapshot without consulting pg_subtrans will certainly buy back some cycles... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Tom Lane wrote: > I'm wondering about doing something similar to what > TransactionIdIsInProgress does, ie, make use of the PGPROC lists > of live subtransactions. Suppose that GetSnapshotData copies not > only top xids but live subxids into the snapshot, and adds a flag > indicating whether the subxids are complete (ie, none of the subxid > lists have overflowed). Then if the flag is set, tqual.c doesn't > need to do SubTransGetTopmostTransaction() before searching the > list. Well, that sounds awfully more expensive than setting local-to-my-database Xmins as well as global (all databases) Xmins :-) On the other hand, ISTM as soon as one cache overflows, you have to go check pg_subtrans which means the entire optimization buys nothing in that case. It would be nice if the optimization degraded more gracefully. I don't have any concrete suggestion though. The changes proposed in the other CS-storm thread by the NTT person may help. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes: > The invokers of SubTrans module are two SubTransGetTopmostTransaction() > in HeapTupleSatisfiesSnapshot(). When I disabled the calls, CSStorm did > not occur. SubTransGetTopmostTransaction returns the argument without > change when we don't use SAVEPOINTs. > If we optimize for non-subtransactions, we can avoid to lock SubTrans > for check visiblities of tuples inserted by top transactions. Only for top transactions still in progress, so I doubt that would help much. I'm wondering about doing something similar to what TransactionIdIsInProgress does, ie, make use of the PGPROC lists of live subtransactions. Suppose that GetSnapshotData copies not only top xids but live subxids into the snapshot, and adds a flag indicating whether the subxids are complete (ie, none of the subxid lists have overflowed). Then if the flag is set, tqual.c doesn't need to do SubTransGetTopmostTransaction() before searching the list. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] CSStorm occurred again by postgreSQL8.2
Tom Lane <[EMAIL PROTECTED]> wrote:
> > It does not solve, even if it increases the number of NUM_SUBTRANS_BUFFERS.
> > The problem was only postponed.
>
> Can you provide a reproducible test case for this?

This is the reproducible test case:

- Occurs on both 8.1.4 and HEAD.
- On an SMP machine. I used dual Opterons. The CSStorm becomes worse on dual Xeons with hyper-threading.
- Tuning parameters are default. All data are cached in shared buffers. (shared_buffers=32MB; the pgbench data (scale=1) are less than 15MB.)
- Using custom pgbench. One client doing UPDATE with indexscan and multiple clients doing SELECT with seqscan/indexscan.

$ pgbench -i
$ pgbench -n -c 1 -t 10 -f cs_update.sql &
$ pgbench -n -c 50 -t 10 -f cs_indexscan.sql &
$ pgbench -n -c 35 -t 10 -f cs_seqscan.sql &
(The scripts are attached at the end of this message.)

In the above workload, context switches are 2000-1/sec and cpu usage is user=100%. Then, start a long open transaction on another connection.

$ psql
# begin; -- Long open transaction

After a lapse of 30-60 seconds, context switches become 5/sec over (12 over on xeons) and cpu usage is user=66% / sys=21% / idle=13%. If we increase the frequency of UPDATE, the duration becomes shorter.

This is a human-induced workload, but I can see the same condition in TPC-W, even though it is a benchmark. TPC-W requires full-text search, and it is implemented using "LIKE %foo%" in my implementation (DBT-1, too). It also requires periodic aggregations. They might behave as long transactions.

The cause seems to be lock contention. The numbers of lock acquisitions on SubtransControlLock and SubTransBuffer increase significantly in comparison with BufMappingLock.

# Before starting a long transaction.
 kind |       lwlock        | sh_call  | sh_wait | ex_call | ex_wait
------+---------------------+----------+---------+---------+---------
   13 | SubtransControlLock |    28716 |       2 |      54 |       0
   22 | BufMappingLock      | 11637884 |       0 |    2492 |       0
   27 | SubTransBuffer      |        0 |       0 |      11 |       0

# After
 kind |       lwlock        | sh_call  | sh_wait | ex_call | ex_wait
------+---------------------+----------+---------+---------+---------
   13 | SubtransControlLock |  4139111 |   65059 | 3926691 |  390838
   22 | BufMappingLock      | 32348073 |       0 |    2509 |       0
   27 | SubTransBuffer      |   939646 |  960341 | 1419152 |      61

The SubTrans module is invoked from the two SubTransGetTopmostTransaction() calls in HeapTupleSatisfiesSnapshot(). When I disabled the calls, the CSStorm did not occur. SubTransGetTopmostTransaction returns its argument unchanged when we don't use SAVEPOINTs. If we optimize for the case without subtransactions, we can avoid locking SubTrans when checking the visibility of tuples inserted by top transactions.

If we want to resolve the problem fundamentally, we might have to improve SubTrans using a better buffer management algorithm or so. Do you have any idea how to avoid such a problem?

-- cs_update.sql
\set naccounts 10 * :tps
\setrandom aid 1 :naccounts
\setrandom delta -5000 5000
UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT pg_sleep(0.1);

-- cs_seqscan.sql
\set naccounts 10 * :tps
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid::int8 = :aid; -- cast to force seqscan

-- cs_indexscan.sql
\set naccounts 10 * :tps
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid = :aid;

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center