Re: [HACKERS] Reducing Transaction Start/End Contention
Added to TODO:

* Consider transaction start/end performance improvements

  http://archives.postgresql.org/pgsql-hackers/2007-07/msg00948.php
  http://archives.postgresql.org/pgsql-hackers/2008-03/msg00361.php

---

Simon Riggs wrote:

Jignesh Shah's scalability testing on Solaris has revealed further tuning opportunities surrounding the start and end of a transaction. Tuning that should be especially important since async commit is likely to allow much higher transaction rates than were previously possible.

There is strong contention on the ProcArrayLock in Exclusive mode, with the top path being CommitTransaction(). This becomes clear as the number of connections increases, but it seems likely that the contention can be caused in a range of other circumstances. My thoughts on the causes of this contention are that the following three tasks contend with each other in the following way:

CommitTransaction():
    takes ProcArrayLock Exclusive, but only needs access to one
    ProcArray element

waits for

GetSnapshotData():
    takes ProcArrayLock Shared
    ReadNewTransactionId(): takes XidGenLock Shared

which waits for

GetNextTransactionId():
    takes XidGenLock Exclusive
    ExtendCLOG(): takes ClogControlLock Exclusive, WALInsertLock
        Exclusive; two possible places where I/O is required
    ExtendSubtrans(): takes SubtransControlLock; one possible place
        where I/O is required
    Avoids lock on ProcArrayLock: atomically updates one ProcArray
        element

or more simply:

CommitTransaction()       -- i.e. once per transaction
waits for
GetSnapshotData()         -- i.e. once per SQL statement
which waits for
GetNextTransactionId()    -- i.e. once per transaction

This gives some goals for scalability improvements, and some proposals. (1) and (2) are proposals for 8.3 tuning; the others are directions for further research.

Goal: Reduce total time that GetSnapshotData() waits for
GetNextTransactionId()

1. Increase size of Clog-specific BLCKSZ

Clog currently uses BLCKSZ to define the size of clog buffers. This can be changed to use CLOG_BLCKSZ, which would then be set to 32768. This will naturally increase the amount of memory allocated to the clog, so we need not alter CLOG_BUFFERS above 8 if we do this (as previously suggested, with successful results). This will also reduce the number of ExtendClog() calls, which will probably reduce the overall contention also.

2. Perform ExtendClog() as a background activity

A background process can look at the next transaction id once each cycle without holding any lock. If the xid is almost at the point where a new clog page would be allocated, then it will allocate one prior to the new page being absolutely required. Doing this as a background task would mean that we do not need to hold the XidGenLock in exclusive mode while we do this, which means that GetSnapshotData() and CommitTransaction() would also be less likely to block. Also, if any clog writes need to be performed when the page is moved forwards, this would also be performed in the background.

3. Consider whether ProcArrayLock should use a new queued-shared lock mode that puts a maximum wait time on ExclusiveLock requests. It would be fairly hard to implement this well as a timer, but it might be possible to place a limit on queue length, i.e. allow Shared locks to be granted immediately if a Shared holder already exists, but only if there is a queue of no more than N exclusive-mode requests queued. This might prevent the worst cases of exclusive lock starvation.

4. Since shared locks are currently queued behind exclusive requests when they cannot be immediately satisfied, it might be worth reconsidering the way LWLockRelease works also. When we wake up the queue, we only wake the Shared requests that are adjacent to the head of the queue. Instead we could wake *all* waiting Shared requestors, e.g. with a lock queue like this:

(HEAD) S-S-X-S-X-S-X-S

Currently we would wake the 1st and 2nd waiters only. If we were to wake the 3rd, 5th and 7th waiters also, then the queue would reduce in length very quickly, if we assume generally uniform service times. (If the head of the queue is X, then we wake only that one process, and I'm not proposing we change that.) That would mean queue jumping, right? Well, that's what already happens in other circumstances, so there cannot be anything intrinsically wrong with allowing it; the only question is: would it help?

We need not wake the whole queue; there may be some generally more beneficial heuristic. The reason for considering this is not to speed up Shared requests but to reduce the queue length, and thus the waiting time for the Exclusive requestors. Each time a Shared request is dequeued, we effectively re-enable queue jumping, so a Shared request arriving during that point will actually jump ahead of Shared requests that were unlucky enough to arrive while an Exclusive lock was held.
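For illustration, the two wake-up policies discussed in (4) can be compared with a small simulation. This is a toy Python model, not PostgreSQL's LWLock code; a wait queue is represented simply as a list of 'S' (Shared) and 'X' (Exclusive) requests, head first.

```python
# Toy comparison of the two LWLockRelease wake-up policies discussed
# above. Not PostgreSQL code: the queue is just a list of 'S'/'X'.

def wake_adjacent_shared(queue):
    """Current policy: wake the run of Shared waiters at the head of
    the queue. If the head is Exclusive, wake only that one waiter."""
    if not queue:
        return []
    if queue[0] == 'X':
        return [0]
    woken = []
    for i, req in enumerate(queue):
        if req == 'S':
            woken.append(i)
        else:
            break                  # stop at the first Exclusive waiter
    return woken

def wake_all_shared(queue):
    """Proposed policy: wake every Shared waiter in the queue.
    An Exclusive waiter at the head is still woken alone."""
    if not queue:
        return []
    if queue[0] == 'X':
        return [0]
    return [i for i, req in enumerate(queue) if req == 'S']

queue = list("SSXSXSXS")            # (HEAD) S-S-X-S-X-S-X-S
print(wake_adjacent_shared(queue))  # → [0, 1]          (1st and 2nd)
print(wake_all_shared(queue))       # → [0, 1, 3, 5, 7] (all Shared)
```

With the example queue from the message, the current policy wakes two waiters while the proposed one wakes all five Shared waiters, shrinking the queue ahead of the remaining Exclusive requests.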
Re: [HACKERS] Reducing Transaction Start/End Contention
Thread URL added to TODO:

* SMP scalability improvements

---

Paul van den Bogaard wrote:

Just started a blog session on my findings running Postgres 8.3 (beta) on a mid-range Sun Fire server. The second entry is about the time lost on LWLock handling. When concurrency increases, you can see the ProcArrayLock wait queue start to explode:

http://blogs.sun.com/paulvandenbogaard/entry/leight_weight_lock_contention

I will add more posts on all the other LWLock findings and the instrumentation method being used. Unfortunately a high-priority project popped up that I need to focus on, so please be patient. I hope to finish this in the first week of April.

Thanks,
Paul

[...]

--
Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
EnterpriseDB  http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reducing Transaction Start/End Contention
Just started a blog session on my findings running Postgres 8.3 (beta) on a mid-range Sun Fire server. The second entry is about the time lost on LWLock handling. When concurrency increases, you can see the ProcArrayLock wait queue start to explode:

http://blogs.sun.com/paulvandenbogaard/entry/leight_weight_lock_contention

I will add more posts on all the other LWLock findings and the instrumentation method being used. Unfortunately a high-priority project popped up that I need to focus on, so please be patient. I hope to finish this in the first week of April.

Thanks,
Paul

On 13-mrt-2008, at 16:56, Tom Lane wrote:
> Mark Mielke [EMAIL PROTECTED] writes:
> > What sort of evidence is usually compelling? It seems to me that
> > this sort of change only benefits configurations with dozens or
> > more CPUs/cores?
>
> The main point in my mind was that that analysis was based on the
> code as it then stood. Florian's work to reduce ProcArrayLock
> contention might have invalidated some or all of the ideas. So it
> needs a fresh look.
>
> regards, tom lane

------------------------------------------------------------
Paul van den Bogaard                  [EMAIL PROTECTED]
ISV-E -- ISV Engineering, Opensource Engineering group
Sun Microsystems, Inc.             phone: +31 334 515 918
Saturnus 1                     extension: x(70)15918
3824 ME Amersfoort                mobile: +31 651 913 354
The Netherlands                      fax: +31 334 515 001
Re: [HACKERS] Reducing Transaction Start/End Contention
On Tue, 2008-03-11 at 20:23 -0400, Bruce Momjian wrote:
> Is this still a TODO?

I think so.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com

PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk
Re: [HACKERS] Reducing Transaction Start/End Contention
Simon Riggs wrote:
> On Tue, 2008-03-11 at 20:23 -0400, Bruce Momjian wrote:
> > Is this still a TODO?
>
> I think so.

How about this wording:

    Review Simon's claims to improve performance

;-)

--
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Reducing Transaction Start/End Contention
Alvaro Herrera wrote:
> Simon Riggs wrote:
> > On Tue, 2008-03-11 at 20:23 -0400, Bruce Momjian wrote:
> > > Is this still a TODO?
> >
> > I think so.
>
> How about this wording:
>
>     Review Simon's claims to improve performance

What sort of evidence is usually compelling? It seems to me that this sort of change only benefits configurations with dozens or more CPUs/cores? I ask because I saw a few references to "I see no performance change" -- but then, I don't have the right hardware. It seems to me that it should be obvious that contention will only show up under very high concurrency? :-)

Cheers,
mark

--
Mark Mielke [EMAIL PROTECTED]
Re: [HACKERS] Reducing Transaction Start/End Contention
Mark Mielke [EMAIL PROTECTED] writes:
> Alvaro Herrera wrote:
> > How about this wording:
> >
> >     Review Simon's claims to improve performance
>
> What sort of evidence is usually compelling? It seems to me that
> this sort of change only benefits configurations with dozens or more
> CPUs/cores?

The main point in my mind was that that analysis was based on the code as it then stood. Florian's work to reduce ProcArrayLock contention might have invalidated some or all of the ideas. So it needs a fresh look.

regards, tom lane
Re: [HACKERS] Reducing Transaction Start/End Contention
Is this still a TODO?

---

Simon Riggs wrote:

On Mon, 2007-07-30 at 20:20 +0100, Simon Riggs wrote:
> Jignesh Shah's scalability testing on Solaris has revealed further
> tuning opportunities surrounding the start and end of a transaction.
> [...]
> Goal: Reduce total time that GetSnapshotData() waits for
> GetNextTransactionId()

The latest patch for lazy XID allocation reduces the number of times GetNextTransactionId() is called, by eliminating the call entirely for read-only transactions. That will reduce the number of waits, and so for most real-world cases will increase the scalability of Postgres. Write-mostly workloads will be slightly less scalable, so we should expect our TPC-C numbers to be slightly worse than our TPC-E numbers. We should retest to see whether the bottleneck has been moved sufficiently to allow us to avoid doing techniques (1), (2), (3), (5) or (6) at all.

> 1. Increase size of Clog-specific BLCKSZ
> [...]
> 2. Perform ExtendClog() as a background activity
> [...]
> 3. Consider whether ProcArrayLock should use a new queued-shared
>    lock mode that puts a maximum wait time on ExclusiveLock requests.
> [...]

(4) is a general concern that remains valid.

> 4. Since shared locks are currently queued behind exclusive requests
>    when they cannot be immediately satisfied, it might be worth
>    reconsidering the way LWLockRelease works also.
> [...]
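The effect of lazy XID allocation described above can be shown with a toy model. This is an illustrative Python sketch, not PostgreSQL's implementation; the class, counter, and workload are all invented for the example.

```python
# Toy model of lazy XID assignment: an XID is allocated only when a
# transaction first writes, so read-only transactions never need
# GetNextTransactionId() and never take XidGenLock exclusively.

class XidGen:
    def __init__(self):
        self.next_xid = 3          # hypothetical first normal XID
        self.calls = 0             # times get_next_xid() was invoked

    def get_next_xid(self):
        self.calls += 1
        xid = self.next_xid
        self.next_xid += 1
        return xid

def run_workload(gen, transactions, lazy):
    """transactions: list of booleans, True = transaction writes."""
    for writes in transactions:
        if lazy:
            if writes:             # assign an XID only on first write
                gen.get_next_xid()
        else:
            gen.get_next_xid()     # eager: every transaction gets one

workload = [True, False, False, True, False]   # 2 writers, 3 readers

eager_gen = XidGen()
run_workload(eager_gen, workload, lazy=False)
lazy_gen = XidGen()
run_workload(lazy_gen, workload, lazy=True)
print(eager_gen.calls, lazy_gen.calls)   # → 5 2
```

In this mixed workload the lazy scheme performs XID allocation only twice instead of five times; the more read-heavy the workload, the larger the reduction in XidGenLock traffic.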
Re: [HACKERS] Reducing Transaction Start/End Contention
This has been saved for the 8.4 release:

    http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Simon Riggs wrote:

> Jignesh Shah's scalability testing on Solaris has revealed further
> tuning opportunities surrounding the start and end of a transaction.
> Tuning that should be especially important since async commit is
> likely to allow much higher transaction rates than were previously
> possible.
>
> There is strong contention on the ProcArrayLock in Exclusive mode,
> with the top path being CommitTransaction().
> [...]
> The reason for considering this is not to speed up Shared requests
> but to reduce the queue length, and thus the waiting time for the
> Exclusive requestors. Each time a Shared request is dequeued, we
> effectively re-enable queue jumping, so a Shared request arriving
> during that point will actually jump ahead of Shared requests that
> were unlucky enough to arrive while an Exclusive lock was held.
> Worse than that, the new incoming Shared requests exacerbate the
> starvation, so [...]
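Proposal (3) earlier in the thread addresses exactly this starvation, by letting Shared requests jump the queue only while the number of queued Exclusive requests stays below some limit N. A toy version of that admission rule follows; this is a Python sketch of the stated semantics, not an LWLock implementation, and the limit value is arbitrary.

```python
# Sketch of the bounded queue-jump rule from proposal (3): an incoming
# Shared request is granted immediately iff the lock is already held
# Shared AND no more than MAX_X_QUEUED Exclusive requests are waiting.
# Otherwise it must join the wait queue. Illustrative only.

MAX_X_QUEUED = 2   # hypothetical limit N

def grant_shared_now(shared_holders, wait_queue):
    """shared_holders: current count of Shared lock holders.
    wait_queue: list of queued requests, each 'S' or 'X'."""
    if shared_holders == 0:
        return False               # nothing to share with: must queue
    x_waiting = sum(1 for req in wait_queue if req == 'X')
    return x_waiting <= MAX_X_QUEUED

print(grant_shared_now(3, ['X', 'S']))            # → True  (1 X queued)
print(grant_shared_now(3, ['X', 'X', 'X', 'S']))  # → False (3 X queued)
```

Once more than N Exclusive waiters have accumulated, new Shared requests stop jumping ahead and the Exclusive requests drain, which is the anti-starvation property the proposal is after.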
Re: [HACKERS] Reducing Transaction Start/End Contention
Bruce Momjian wrote:
> This has been saved for the 8.4 release:
>
>     http://momjian.postgresql.org/cgi-bin/pgpatches_hold

I think the work on VXIDs and latestCompletedXid makes this completely obsolete.

--
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Reducing Transaction Start/End Contention
Bruce Momjian [EMAIL PROTECTED] writes:
> Alvaro Herrera wrote:
> > I think the work on VXIDs and latestCompletedXid makes this
> > completely obsolete.
>
> Please confirm, all of Simon's issues?

Not sure --- the area is certainly still worth looking at, but the recent patches have changed things enough that no older patches should be applied without study.

regards, tom lane
Re: [HACKERS] Reducing Transaction Start/End Contention
Bruce Momjian wrote:
> Alvaro Herrera wrote:
> > I think the work on VXIDs and latestCompletedXid makes this
> > completely obsolete.
>
> Please confirm, all of Simon's issues?
>
> http://archives.postgresql.org/pgsql-hackers/2007-07/msg00948.php

Hmm, in looking closer, it seems there are some things that still seem worthy of more discussion.

--
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Reducing Transaction Start/End Contention
Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > This has been saved for the 8.4 release:
> >
> >     http://momjian.postgresql.org/cgi-bin/pgpatches_hold
>
> I think the work on VXIDs and latestCompletedXid makes this
> completely obsolete.

Please confirm, all of Simon's issues?

http://archives.postgresql.org/pgsql-hackers/2007-07/msg00948.php

---

Simon Riggs wrote:

> Jignesh Shah's scalability testing on Solaris has revealed further
> tuning opportunities surrounding the start and end of a transaction.
> [...]

--
Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
EnterpriseDB  http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Reducing Transaction Start/End Contention
On Mon, 2007-07-30 at 20:20 +0100, Simon Riggs wrote:
> Jignesh Shah's scalability testing on Solaris has revealed further
> tuning opportunities surrounding the start and end of a transaction.
> [...]
> Goal: Reduce total time that GetSnapshotData() waits for
> GetNextTransactionId()

The latest patch for lazy XID allocation reduces the number of times GetNextTransactionId() is called, by eliminating the call entirely for read-only transactions. That will reduce the number of waits, and so for most real-world cases will increase the scalability of Postgres. Write-mostly workloads will be slightly less scalable, so we should expect our TPC-C numbers to be slightly worse than our TPC-E numbers. We should retest to see whether the bottleneck has been moved sufficiently to allow us to avoid doing techniques (1), (2), (3), (5) or (6) at all.

> 1. Increase size of Clog-specific BLCKSZ
> [...]
> 2. Perform ExtendClog() as a background activity
> [...]
> 3. Consider whether ProcArrayLock should use a new queued-shared
>    lock mode that puts a maximum wait time on ExclusiveLock requests.
> [...]

(4) is a general concern that remains valid.

> 4. Since shared locks are currently queued behind exclusive requests
>    when they cannot be immediately satisfied, it might be worth
>    reconsidering the way LWLockRelease works also.
> [...]
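Proposal (2), pre-extending the clog from a background process, amounts to a watermark check: read the next XID without any lock and allocate the next page whenever the XID is within some slack distance of the current page boundary. The Python sketch below is illustrative only; the constants are assumptions for the example (an 8 kB clog page holds 32768 transactions at two status bits each, but the slack value is invented).

```python
# Sketch of the background clog pre-extension idea: one cycle of the
# background task decides whether to allocate the next clog page ahead
# of demand. Not PostgreSQL code; constants are assumed values.

XACTS_PER_PAGE = 32768   # 2 status bits/xact on an 8 kB page (assumed)
SLACK = 1024             # pre-extend when this close to the boundary

def page_of(xid):
    """Clog page number that holds the status bits for this xid."""
    return xid // XACTS_PER_PAGE

def maybe_extend(next_xid, highest_allocated_page):
    """One background cycle: return the new highest allocated page."""
    if page_of(next_xid + SLACK) > highest_allocated_page:
        return highest_allocated_page + 1   # allocate ahead of demand
    return highest_allocated_page

print(maybe_extend(10_000, 0))   # → 0 (far from the page boundary)
print(maybe_extend(32_000, 0))   # → 1 (within SLACK of page 1)
```

Because the check only reads the next XID and writes a page the foreground backends do not yet touch, no foreground path has to hold XidGenLock exclusively while the extension I/O happens, which is the point of the proposal.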
Re: [HACKERS] Reducing Transaction Start/End Contention
Simon Riggs wrote:
> > 1. Increase size of Clog-specific BLCKSZ
> > 2. Perform ExtendClog() as a background activity
>
> (1) and (2) can be patched fairly easily for 8.3. I have a prototype
> patch for (1) on the shelf already from 6 months ago.

Hmm, I think (1) may be 8.3 material but all the rest are complex enough that being left for 8.4 is called for.

--
Alvaro Herrera                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
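Some rough arithmetic behind proposal (1): the clog stores two status bits per transaction (four transactions per byte), so the block size fixes how many transactions fit on one clog page and hence how often a new page, and therefore ExtendClog(), is needed. A quick Python check of the numbers:

```python
# Back-of-envelope numbers for proposal (1). Clog holds 2 status bits
# per transaction, i.e. 4 transactions per byte, so quadrupling the
# clog block size quarters the frequency of ExtendClog() calls.

XACTS_PER_BYTE = 4

def xacts_per_page(blcksz):
    """Transactions whose status fits on one clog page of blcksz bytes."""
    return blcksz * XACTS_PER_BYTE

print(xacts_per_page(8192))    # → 32768  (default BLCKSZ)
print(xacts_per_page(32768))   # → 131072 (proposed CLOG_BLCKSZ)
```

With CLOG_BLCKSZ at 32768, each page covers four times as many transactions, which is why the proposal expects fewer ExtendClog() calls even without raising CLOG_BUFFERS.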
Re: [HACKERS] Reducing Transaction Start/End Contention
Tom Lane wrote:
> Alvaro Herrera [EMAIL PROTECTED] writes:
> > Simon Riggs wrote:
> > > 1. Increase size of Clog-specific BLCKSZ
> > > 2. Perform ExtendClog() as a background activity
> > >
> > > (1) and (2) can be patched fairly easily for 8.3. I have a
> > > prototype patch for (1) on the shelf already from 6 months ago.
> >
> > Hmm, I think (1) may be 8.3 material but all the rest are complex
> > enough that being left for 8.4 is called for.
>
> NONE of this is 8.3 material. Full stop. Try to keep your eyes on
> the ball, people --- 8.3 is already months past feature freeze.

Yeah -- we still have 12(!) open items on the PatchStatus board:

http://developer.postgresql.org/index.php/Todo:PatchStatus

and at least half of them are in need of reviewer capacity (and some of them have been there for nearly half a year).

Stefan
[HACKERS] Reducing Transaction Start/End Contention
Jignesh Shah's scalability testing on Solaris has revealed further tuning opportunities surrounding the start and end of a transaction. Tuning there should be especially important, since async commit is likely to allow much higher transaction rates than were previously possible.

There is strong contention on the ProcArrayLock in Exclusive mode, with the top path being CommitTransaction(). This becomes clear as the number of connections increases, but it seems likely that the contention can be caused in a range of other circumstances. My thoughts on the causes of this contention are that the following three tasks contend with each other in the following way:

CommitTransaction():
    takes ProcArrayLock Exclusive,
    but only needs access to one ProcArray element
waits for
GetSnapshotData():
    takes ProcArrayLock Shared
    ReadNewTransactionId(): takes XidGenLock Shared
which waits for
GetNextTransactionId():
    takes XidGenLock Exclusive
    ExtendCLOG(): takes ClogControlLock Exclusive, WALInsertLock Exclusive
        (two possible places where I/O is required)
    ExtendSubtrans(): takes SubtransControlLock
        (one possible place where I/O is required)
    avoids lock on ProcArrayLock: atomically updates one ProcArray element

or, more simply:

CommitTransaction()        -- i.e. once per transaction
waits for
GetSnapshotData()          -- i.e. once per SQL statement
which waits for
GetNextTransactionId()     -- i.e. once per transaction

This gives some goals for scalability improvements and some proposals. (1) and (2) are proposals for 8.3 tuning; the others are directions for further research.

Goal: Reduce the total time that GetSnapshotData() waits for GetNextTransactionId().

1. Increase size of Clog-specific BLCKSZ

Clog currently uses BLCKSZ to define the size of clog buffers. This can be changed to use CLOG_BLCKSZ, which would then be set to 32768. This will naturally increase the amount of memory allocated to the clog, so we need not alter CLOG_BUFFERS above 8 if we do this (as previously suggested, with successful results).
This will also reduce the number of ExtendClog() calls, which will probably reduce the overall contention as well.

2. Perform ExtendClog() as a background activity

A background process can look at the next transactionid once each cycle without holding any lock. If the xid is almost at the point where a new clog page would be allocated, then it will allocate one prior to the new page being absolutely required. Doing this as a background task would mean that we do not need to hold the XidGenLock in exclusive mode while we do this, which means that GetSnapshotData() and CommitTransaction() would also be less likely to block. Also, if any clog writes need to be performed when the page is moved forwards, this would also be done in the background.

3. Consider whether ProcArrayLock should use a new queued-shared lock mode that puts a maximum wait time on Exclusive lock requests. It would be fairly hard to implement this well as a timer, but it might be possible to place a limit on queue length instead, i.e. allow Shared locks to be granted immediately if a Shared holder already exists, but only if there is a queue of no more than N exclusive mode requests. This might prevent the worst cases of exclusive lock starvation.

4. Since shared locks are currently queued behind exclusive requests when they cannot be immediately satisfied, it might be worth reconsidering the way LWLockRelease() works as well. When we wake up the queue, we only wake the Shared requests that are adjacent to the head of the queue. Instead we could wake *all* waiting Shared requestors, e.g. with a lock queue like this:

(HEAD) S-S-X-S-X-S-X-S

Currently we would wake the 1st and 2nd waiters only. If we were to wake the 3rd, 5th and 7th waiters also, then the queue would reduce in length very quickly, if we assume generally uniform service times. (If the head of the queue is X, then we wake only that one process, and I'm not proposing we change that.) That would mean queue jumping, right?
Well, that's what already happens in other circumstances, so there cannot be anything intrinsically wrong with allowing it; the only question is: would it help? We need not wake the whole queue; there may be some generally more beneficial heuristic.

The reason for considering this is not to speed up Shared requests but to reduce the queue length, and thus the waiting time for the Exclusive requestors. Each time a Shared request is dequeued, we effectively re-enable queue jumping, so a Shared request arriving during that point will actually jump ahead of Shared requests that were unlucky enough to arrive while an Exclusive lock was held. Worse than that, the new incoming Shared requests exacerbate the starvation, so the more non-adjacent groups of Shared lock requests there are in the queue, the worse the starvation of the Exclusive requestors becomes. We are effectively randomly starving some Shared locks as well as Exclusive locks in the current scheme, based upon the state of the lock at the time they make their request.