Re: [HACKERS] Nested Wait Events?
On Mon, Dec 12, 2016 at 2:42 PM, Simon Riggs wrote:
> There's too many "I"s in that para. I've not presented this as a
> defect, nor is there any reason to believe this post is aimed at you
> personally.

Well, actually, there is. You said in your original post that something was "not correct" and something else was "not handled". That sounds like a description of a defect to me. If that wasn't how you meant it, fine.

> I'm letting Hackers know that I've come across two problems and I see
> more. I'm good with accepting reduced scope in return for performance,
> but we should be allowed to discuss what limitations that imposes
> without rancour.

I'm not mad. I thought you were.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Nested Wait Events?
On 12 December 2016 at 18:05, Robert Haas wrote:
> On Mon, Dec 12, 2016 at 12:16 PM, Simon Riggs wrote:
>> On 12 December 2016 at 16:52, Robert Haas wrote:
>>> On Mon, Dec 12, 2016 at 11:33 AM, Simon Riggs wrote:
>>>> Last week I noticed that the Wait Event/Locks system doesn't correctly
>>>> describe waits for tuple locks because in some cases that happens in
>>>> two stages.
>>>
>>> Well, I replied to that email to say that I didn't agree with your
>>> analysis. I think if something happens in two stages, those wait
>>> events should be distinguished. The whole point here is to get
>>> clarity on what the system is waiting for, and we lose that if we
>>> start trying to merge together things which are at the code level
>>> separate.
>>
>> Clarity is what we are both looking for then.
>
> Granted.
>
>> I know I am waiting for a tuple lock. You want information about all
>> the lower levels. I'm good with that as long as the lower information
>> is somehow recorded against the higher level task, which it wouldn't
>> be in either of the cases I mention, hence why I bring it up again.
>
> So, I think that this may be a case where I built an apple and you are
> complaining that it's not an orange. I had very clearly in mind from
> the beginning of the wait event work that we were trying to expose
> low-level information about what the system was doing, and I advocated
> for this design as a way of doing that, I think, reasonably well. The
> statement that you want information about what is going on at a higher
> level is fair, but IMHO it's NOT fair to present that as a defect in
> what's been committed. It was never intended to do that, at least not
> by me, and I committed all of the relevant patches and had a fair
> amount of involvement with the design. You may think I should have
> been trying to solve a different problem and you may even be right,
> but that is a separate issue from how well I did at solving the
> problem I was attempting to solve.

There's too many "I"s in that para. I've not presented this as a defect, nor is there any reason to believe this post is aimed at you personally.

I'm letting Hackers know that I've come across two problems and I see more. I'm good with accepting reduced scope in return for performance, but we should be allowed to discuss what limitations that imposes without rancour.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Nested Wait Events?
On Mon, Dec 12, 2016 at 12:16 PM, Simon Riggs wrote:
> On 12 December 2016 at 16:52, Robert Haas wrote:
>> On Mon, Dec 12, 2016 at 11:33 AM, Simon Riggs wrote:
>>> Last week I noticed that the Wait Event/Locks system doesn't correctly
>>> describe waits for tuple locks because in some cases that happens in
>>> two stages.
>>
>> Well, I replied to that email to say that I didn't agree with your
>> analysis. I think if something happens in two stages, those wait
>> events should be distinguished. The whole point here is to get
>> clarity on what the system is waiting for, and we lose that if we
>> start trying to merge together things which are at the code level
>> separate.
>
> Clarity is what we are both looking for then.

Granted.

> I know I am waiting for a tuple lock. You want information about all
> the lower levels. I'm good with that as long as the lower information
> is somehow recorded against the higher level task, which it wouldn't
> be in either of the cases I mention, hence why I bring it up again.

So, I think that this may be a case where I built an apple and you are complaining that it's not an orange. I had very clearly in mind from the beginning of the wait event work that we were trying to expose low-level information about what the system was doing, and I advocated for this design as a way of doing that, I think, reasonably well. The statement that you want information about what is going on at a higher level is fair, but IMHO it's NOT fair to present that as a defect in what's been committed. It was never intended to do that, at least not by me, and I committed all of the relevant patches and had a fair amount of involvement with the design. You may think I should have been trying to solve a different problem and you may even be right, but that is a separate issue from how well I did at solving the problem I was attempting to solve.
There was quite a lot of discussion 9-12 months ago (IIRC) about wanting additional detail to be associated with wait events. From what I understand, Oracle will not only report that it waited for a block to be read but also tells you for which block it was waiting, and some of the folks at Postgres Pro were advocating for the wait event facility to do something similar. I strongly resisted that kind of additional detail, because what makes the current system fast and low-impact, and therefore able to be on by default, is that all it does is one unsynchronized 4-byte write into shared memory. If we do anything more than that -- say 8 bytes, let alone the extra 20 bytes we'd need to store a relfilenode -- we're going to need to insert memory barriers in the path that updates the data in order to make sure that it can be read without tearing, and I'm afraid that's going to have a noticeable performance impact. Certainly, we'd need to check into that very carefully before doing it. Operations like reading a block or blocking on an LWLock are heavier than a couple of memory barriers, but they're not necessarily so much heavier that we can afford to throw extra memory barriers in those paths without any impact. Now, some of what you want to do here may be able to be done without making wait_event_info any wider than uint32, and to the extent that's possible without too much contortion I am fine with it. If you want to know that a tuple lock was being sought for an update rather than a delete, that could probably be exposed. But if you want to know WHICH tuple or even WHICH relation was affected, this mechanism isn't well-suited to that task. I think we may well want to add some new mechanism that reports those sorts of things, but THIS mechanism doesn't have the bit-space for it and isn't designed to do it. It's designed to give basic information and be so cheap that we can use it practically everywhere. 
For more detailed reporting, we should probably have facilities that are not turned on by default, or else facilities that are limited to cases where the volume can never be very high. You don't have to add a lot of overhead to cause a problem in a code path that executes tens of thousands of times per second per backend.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
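To make the bit-space argument in the message above concrete, here is a minimal sketch of how a wait class and an event id might share one 32-bit value. The layout (8 bits of class, 24 bits of event id), the constants, and the function names are all assumptions for illustration, not PostgreSQL's actual definitions. It shows why a distinction such as "tuple lock for update" versus "for delete" can fit without widening the value, while something the size of a relfilenode cannot.

```python
from typing import Tuple

# Sketch only: the bit layout and every name below are assumptions made
# for illustration, not PostgreSQL's actual definitions.
CLASS_SHIFT = 24          # assumed: top 8 bits hold the wait class
EVENT_MASK = 0x00FFFFFF   # assumed: low 24 bits hold the event id
UINT32_MAX = 0xFFFFFFFF

WAIT_CLASS_LWLOCK = 0x01  # hypothetical class ids
WAIT_CLASS_LOCK = 0x03

# Hypothetical event ids within the LOCK class. The reason for the wait
# (update vs. delete) fits in spare bits, so the value stays 32 bits wide.
LOCK_TUPLE_FOR_UPDATE = 0x000001
LOCK_TUPLE_FOR_DELETE = 0x000002


def pack_wait_event(wait_class: int, event_id: int) -> int:
    """Combine class and event id into one value that fits in a uint32."""
    if not 0 <= wait_class <= 0xFF:
        raise ValueError("wait class must fit in 8 bits")
    if not 0 <= event_id <= EVENT_MASK:
        raise ValueError("event id must fit in 24 bits")
    return (wait_class << CLASS_SHIFT) | event_id


def unpack_wait_event(info: int) -> Tuple[int, int]:
    """Split a packed value back into (wait_class, event_id)."""
    return info >> CLASS_SHIFT, info & EVENT_MASK
```

Under these assumptions, anything that does not fit in the 24 spare bits is rejected by pack_wait_event, which mirrors the point that identifying WHICH tuple or relation would need a different, wider mechanism.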
Re: [HACKERS] Nested Wait Events?
On 12 December 2016 at 16:52, Robert Haas wrote:
> On Mon, Dec 12, 2016 at 11:33 AM, Simon Riggs wrote:
>> Last week I noticed that the Wait Event/Locks system doesn't correctly
>> describe waits for tuple locks because in some cases that happens in
>> two stages.
>
> Well, I replied to that email to say that I didn't agree with your
> analysis. I think if something happens in two stages, those wait
> events should be distinguished. The whole point here is to get
> clarity on what the system is waiting for, and we lose that if we
> start trying to merge together things which are at the code level
> separate.

Clarity is what we are both looking for then.

I know I am waiting for a tuple lock. You want information about all the lower levels. I'm good with that as long as the lower information is somehow recorded against the higher level task, which it wouldn't be in either of the cases I mention, hence why I bring it up again. Same thing occurs in any case where we wait for multiple lwlocks.

"I had to buy a mop so I could clean the toilets" is potentially important information, but I would prefer to start at the intention side. So that "cleaning the toilets" shows up as the intent, which might consist of multiple sub-tasks. We can then investigate why sometimes cleaning the toilet takes one flush and other times it involves a shopping trip to get a mop. If "mop purchase" is not correctly associated with cleaning then we don't notice what is going on and cannot do anything useful with the info.

Regrettably, it's an accounting problem not a database problem and we need a chart of accounts hierarchy to solve it. (e.g. bill of materials).

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
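The "chart of accounts" idea in the message above can be sketched as a small ledger that charges every low-level wait to whatever intent was active when it happened. This is only an illustration of the accounting model, not a proposal for the real implementation: the WaitLedger class, its methods, and the event labels used with it are hypothetical, and bookkeeping of this kind is far heavier than a single unsynchronized 4-byte write.

```python
from collections import defaultdict
from contextlib import contextmanager


class WaitLedger:
    """Attributes each low-level wait to the intent active at the time.

    Hypothetical illustration only; not part of PostgreSQL.
    """

    def __init__(self):
        self._intents = []                # stack of active intents
        self.totals = defaultdict(float)  # (intent, event) -> seconds

    @contextmanager
    def intent(self, name):
        """Declare the higher-level purpose of the enclosed work."""
        self._intents.append(name)
        try:
            yield
        finally:
            self._intents.pop()

    def record_wait(self, event, seconds):
        """Charge a low-level wait to the outermost active intent."""
        top = self._intents[0] if self._intents else None
        self.totals[(top, event)] += seconds

    def by_intent(self):
        """Roll the low-level waits up to one total per intent."""
        rollup = defaultdict(float)
        for (intent, _event), seconds in self.totals.items():
            rollup[intent] += seconds
        return dict(rollup)
```

With such a ledger, a wait recorded inside `with ledger.intent("TupleLock"):` is reported both under its own event name and under the tuple-lock total, while a wait recorded outside any intent is charged to None. That is the association between "mop purchase" and "cleaning" that the message asks for.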
Re: [HACKERS] Nested Wait Events?
On Mon, Dec 12, 2016 at 11:33 AM, Simon Riggs wrote:
> Last week I noticed that the Wait Event/Locks system doesn't correctly
> describe waits for tuple locks because in some cases that happens in
> two stages.

Well, I replied to that email to say that I didn't agree with your analysis. I think if something happens in two stages, those wait events should be distinguished. The whole point here is to get clarity on what the system is waiting for, and we lose that if we start trying to merge together things which are at the code level separate.

> Now I notice that the Wait Event system doesn't handle waiting for
> recovery conflicts at all, though it does access ProcArrayLock
> multiple times.

This isn't a very clear statement. Every place in the system that can provoke a wait on a latch or a process semaphore displays some kind of wait event in pg_stat_activity. Some of those displays may not be as clear or detailed as you would like and that's fine, but saying they are not handled is not exactly true.

> Don't have a concrete proposal, but I think we need a more complex
> model for how we record wait event data. Something that separates
> intention (e.g. "Travelling to St.Ives") from current event (e.g.
> "Waiting for LWLock")

That's not a bad thought. We need to be careful to keep this very lightweight so that it doesn't affect performance, but the general concept of separating intention from current event might have some legs. We just need to be careful that it doesn't evolve into something that involves a lot of complicated bookkeeping, because these wait events can occur very frequently and in hot code-paths.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] Nested Wait Events?
Last week I noticed that the Wait Event/Locks system doesn't correctly describe waits for tuple locks because in some cases that happens in two stages.

Now I notice that the Wait Event system doesn't handle waiting for recovery conflicts at all, though it does access ProcArrayLock multiple times.

In both cases I tried to fix the problem before mentioning it here. We can't add waits for either of those in a simple way because the current system doesn't allow us to report multiple levels of wait. In both these cases there is a single "top level wait" i.e. tuple locking or recovery conflicts, even if there are other waits that form part of the total wait. I'm guessing that there are other situations like this also.

Don't have a concrete proposal, but I think we need a more complex model for how we record wait event data. Something that separates intention (e.g. "Travelling to St.Ives") from current event (e.g. "Waiting for LWLock")

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services