Re: [HACKERS] Planning counters in pg_stat_statements
I am not actively working on this now, but I'll come back to it for PG12 if you or Lukas don't beat me to it, and I'll help/test/review if I you do. It seems there is plenty of demand for the feature and I'll be very happy to see it. Good to know, thanks! I'd argue that it might be better to add a new argument to pg_plan_query() and pg_plan_queries() and a new field to QueryDesc, i.e.: PlannedStmt * pg_plan_query(Query *querytree, int cursorOptions, ParamListInfo boundParams, double *planningTime) List * pg_plan_queries(List *querytrees, int cursorOptions, ParamListInfo boundParams, double *planningTime) /* total time as in BuildCachedPlan() */ The measured time can later be passed to QueryDesc via PortalDefineQuery(). Of course, this requires more changes, but the result might be worth it. What do you think? So who does the actual timing work in this model? Does the core code do the timing, but it'd be disabled by default because NULL is usually passed in, and you need to register an extension that provides a place to stick the result in order to turn on the time-measuring code? If you mean that it's still the extension that does the timing, it seems strange to have struct members and parameter arguments for something specific that the core code doesn't understand. Right, I guess my proposal was too brief. Yes, my idea's that the core code should do the timing, which might be disabled by passing NULL to those functions. However, given that there's lots of people out there who use pg_stat_statements on a regular basis, it might be meaningful to just turn it on unconditionally (my assumptions might be wrong, of course). Alternatively, we could introduce a new GUC variable to disable this feature. The extensions would no longer have to do the accounting. As a more general version of that, I wondered about having some kind of associative data structure (hash table, assoc list, etc) that would somehow travel everywhere with the PlannedStmt, but not inside it, just for the use of extensions. Then planning time could be stored in there by a planner hook, and the fished out by any other hook that knows the same key (not sure how you manage that key space but I'm sure you could come up with something). That could have uses for other extensions too, and could also be used for the "query ID", which is, after all, the model that the abandoned planning_time member was following. Just a thought. Perhaps we could change planner() so that it would return pointer to some struct holding PlannedStmt and a List of some nodes or structs for accounting. -- Dmitry Ivanov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Planning counters in pg_stat_statements
Hi Dmitry, On Tue, May 15, 2018 at 11:10 PM, Dmitry Ivanovwrote: > Is anybody still working on this? Are there any plans to add this to > commitfest? I am not actively working on this now, but I'll come back to it for PG12 if you or Lukas don't beat me to it, and I'll help/test/review if I you do. It seems there is plenty of demand for the feature and I'll be very happy to see it. > I'd like to add planning time to auto_explain, and it turns out that this > patch is somewhat relevant to that feature. Sounds very useful. > The current approach here is to set planning_time in PlannedStmt via > planner_hook, which (in my opinion) has several flaws: > > 1. Depending on the order of extensions in shared_preload_libraries, it > might not measure time spent on preceding planner hooks. > > 2. Provided that there are multiple users of this metric, it might become a > little too costy to register several hooks with identical purpose. > > 3. [Bikeshedding] Although planning time is stored in PlannedStmt, it's > definitely not an inherent property of a plan. You could have two machines > with identical settings but quite different planning times due to various > circumstances (raw CPU power, I/O etc). Yeah. Putting that inside the PlannedStmt wasn't very nice. Another thing I tried was to have some opaque extension workspace, but that runs into a number of technical problem. That idea is definitely dead. > I'd argue that it might be better to add a new argument to pg_plan_query() > and pg_plan_queries() and a new field to QueryDesc, i.e.: > > PlannedStmt * > pg_plan_query(Query *querytree, > int cursorOptions, > ParamListInfo boundParams, > double *planningTime) > > List * > pg_plan_queries(List *querytrees, > int cursorOptions, > ParamListInfo boundParams, > double *planningTime) /* total time as in > BuildCachedPlan() */ > > The measured time can later be passed to QueryDesc via PortalDefineQuery(). > Of course, this requires more changes, but the result might be worth it. > > What do you think? So who does the actual timing work in this model? Does the core code do the timing, but it'd be disabled by default because NULL is usually passed in, and you need to register an extension that provides a place to stick the result in order to turn on the time-measuring code? If you mean that it's still the extension that does the timing, it seems strange to have struct members and parameter arguments for something specific that the core code doesn't understand. As a more general version of that, I wondered about having some kind of associative data structure (hash table, assoc list, etc) that would somehow travel everywhere with the PlannedStmt, but not inside it, just for the use of extensions. Then planning time could be stored in there by a planner hook, and the fished out by any other hook that knows the same key (not sure how you manage that key space but I'm sure you could come up with something). That could have uses for other extensions too, and could also be used for the "query ID", which is, after all, the model that the abandoned planning_time member was following. Just a thought. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Planning counters in pg_stat_statements
Hi, Is anybody still working on this? Are there any plans to add this to commitfest? I'd like to add planning time to auto_explain, and it turns out that this patch is somewhat relevant to that feature. The current approach here is to set planning_time in PlannedStmt via planner_hook, which (in my opinion) has several flaws: 1. Depending on the order of extensions in shared_preload_libraries, it might not measure time spent on preceding planner hooks. 2. Provided that there are multiple users of this metric, it might become a little too costy to register several hooks with identical purpose. 3. [Bikeshedding] Although planning time is stored in PlannedStmt, it's definitely not an inherent property of a plan. You could have two machines with identical settings but quite different planning times due to various circumstances (raw CPU power, I/O etc). I'd argue that it might be better to add a new argument to pg_plan_query() and pg_plan_queries() and a new field to QueryDesc, i.e.: PlannedStmt * pg_plan_query(Query *querytree, int cursorOptions, ParamListInfo boundParams, double *planningTime) List * pg_plan_queries(List *querytrees, int cursorOptions, ParamListInfo boundParams, double *planningTime) /* total time as in BuildCachedPlan() */ The measured time can later be passed to QueryDesc via PortalDefineQuery(). Of course, this requires more changes, but the result might be worth it. What do you think? -- Dmitry Ivanov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Planning counters in pg_stat_statements
On Sat, Mar 31, 2018 at 8:04 AM, Lukas Fittlwrote: > On Tue, Jan 23, 2018 at 3:31 PM, Tom Lane wrote: >> >> > Unfortunately I'm not going to have bandwidth to figure this out >> > during this commitfest due to other priorities so I'm going to call >> > this "returned with feedback". >> >> OK. There's still time to get it done in the March 'fest. > > > I've had an interesting incident earlier this week where seeing planning > time in pg_stat_statements would have been very helpful to determine the > root cause more quickly. > > Essentially the situation was a statement that executed < 1ms but took more > than 250ms to plan, running often - once we found the statement (using > pg_stat_activity sampling) we were able to fix quickly by rewriting the > query and reduce system load from ~8 to ~1. Having this patch in > pg_stat_statements would have been critical to get the full picture of what > was going on earlier. > > Thomas: I'm not as familiar with planner internals as you are, but happy to > try and contribute here. Would it be useful for me to try to adjust the > patch to Tom's proposal? Hi Lukas, Please feel free! I'm sure this will be a useful feature. Unfortunately I wasn't able to get back to it in time for PG11. It did make an attempt though, and got stuck on some confusion about whether you want the PREPARE or the "real" statement when using prepared plans; I don't recall the details now. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Planning counters in pg_stat_statements
On Tue, Jan 23, 2018 at 3:31 PM, Tom Lanewrote: > > > Unfortunately I'm not going to have bandwidth to figure this out > > during this commitfest due to other priorities so I'm going to call > > this "returned with feedback". > > OK. There's still time to get it done in the March 'fest. > I've had an interesting incident earlier this week where seeing planning time in pg_stat_statements would have been very helpful to determine the root cause more quickly. Essentially the situation was a statement that executed < 1ms but took more than 250ms to plan, running often - once we found the statement (using pg_stat_activity sampling) we were able to fix quickly by rewriting the query and reduce system load from ~8 to ~1. Having this patch in pg_stat_statements would have been critical to get the full picture of what was going on earlier. Thomas: I'm not as familiar with planner internals as you are, but happy to try and contribute here. Would it be useful for me to try to adjust the patch to Tom's proposal? Best, Lukas -- Lukas Fittl
Re: [HACKERS] Planning counters in pg_stat_statements
Thomas Munrowrites: > One problem is that pgss_planner_hook doesn't have the source query > text. That problem could be solved by adding a query_string argument > to the planner hook function type and also planner(), > standard_planner(), pg_plan_query(), pg_plan_queries(). I don't know > if that change would be acceptable or if there is a better way that > doesn't change extern functions that will annoy extension owners. Within the planner, I'd be inclined to store the query string pointer in PlannerGlobal (which is accessible from PlannerInfo "root"), but I'm not sure how many of the functions you mention would still need an explicit signature change. Anyway that doesn't particularly bother me --- it's something that might need to happen anyway, if we ever hope to throw errors with location pointers from inside the planner. > Something I wondered about but didn't try: we could skip the above > problem AND the extra pgss_store() by instead pushing (query ID, > planning time) into a backend-private buffer and then later pushing it > into shmem when we eventually call pgss_store() for the execution > counters. Meh. Part of the reason I don't like what you submitted before is that it supposes there's a mostly one-to-one relationship between planner calls and executor calls; which there is not, when you start considering edge cases like prepared statements. A global would make that worse. > Unfortunately I'm not going to have bandwidth to figure this out > during this commitfest due to other priorities so I'm going to call > this "returned with feedback". OK. There's still time to get it done in the March 'fest. regards, tom lane
Re: [HACKERS] Planning counters in pg_stat_statements
On Fri, Jan 19, 2018 at 12:14 AM, Thomas Munrowrote: > On Sat, Jan 13, 2018 at 2:16 PM, Tom Lane wrote: >> What we could/should do instead, I think, is have pgss_planner_hook >> make its own pgss_store() call to log the planning time results >> (which would mean we don't need the added PlannedStmt field at all). >> That would increase the overhead of this feature, which might mean >> that it'd be worth having a pg_stat_statements GUC to enable it. >> But it'd be a whole lot cleaner. > > Ok. It seems a bit strange to have to go through the locking twice, > but I don't have a better idea. First attempt seems to be working. One problem is that pgss_planner_hook doesn't have the source query text. That problem could be solved by adding a query_string argument to the planner hook function type and also planner(), standard_planner(), pg_plan_query(), pg_plan_queries(). I don't know if that change would be acceptable or if there is a better way that doesn't change extern functions that will annoy extension owners. When I tried that it basically worked except for a problem with "PREPARE ... " (what we want to show) vs "" (what my experimental patch now shows -- oops). Something I wondered about but didn't try: we could skip the above problem AND the extra pgss_store() by instead pushing (query ID, planning time) into a backend-private buffer and then later pushing it into shmem when we eventually call pgss_store() for the execution counters. If you think of that as using global variables to pass state between functions just because we didn't structure our program right it sounds terrible, but if you think of it as a dtrace-style per-thread tracing event buffer that improves performance by batching up collected data, it sounds great :-D Unfortunately I'm not going to have bandwidth to figure this out during this commitfest due to other priorities so I'm going to call this "returned with feedback". -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Planning counters in pg_stat_statements
On Sat, Jan 13, 2018 at 2:16 PM, Tom Lanewrote: > I wrote: >> Haribabu Kommi writes: >>> I checked the latest patch and it is working fine and I don't have any >>> further comments. Marked the patch as "ready for committer". > >> I started to look at this patch, Thanks! > ... looking further, I'm really seriously unhappy about this bit: > > @@ -943,9 +1006,16 @@ pgss_ExecutorEnd(QueryDesc *queryDesc) > ... > + > + /* > +* Clear planning_time, so that we only count it once for each > +* replanning of a prepared statement. > +*/ > + queryDesc->plannedstmt->planning_time = 0; > } > > What we have here is pgss_ExecutorEnd stomping on the plan cache. > I do *not* find that acceptable. At the very least, it ruins the > theory that this field is a shared resource. The idea was that planning_time is work space for extensions to do whatever they like with, but objection noted. > What we could/should do instead, I think, is have pgss_planner_hook > make its own pgss_store() call to log the planning time results > (which would mean we don't need the added PlannedStmt field at all). > That would increase the overhead of this feature, which might mean > that it'd be worth having a pg_stat_statements GUC to enable it. > But it'd be a whole lot cleaner. Ok. It seems a bit strange to have to go through the locking twice, but I don't have a better idea. First attempt seems to be working. I'll post an updated patch in a couple of days when I'm back from travelling and have a tidier version with a new GUC and have thought about the nested_level question. -- Thomas Munro http://www.enterprisedb.com
Re: [HACKERS] Planning counters in pg_stat_statements
I wrote: > Haribabu Kommiwrites: >> I checked the latest patch and it is working fine and I don't have any >> further comments. Marked the patch as "ready for committer". > I started to look at this patch, ... looking further, I'm really seriously unhappy about this bit: @@ -943,9 +1006,16 @@ pgss_ExecutorEnd(QueryDesc *queryDesc) ... + + /* +* Clear planning_time, so that we only count it once for each +* replanning of a prepared statement. +*/ + queryDesc->plannedstmt->planning_time = 0; } What we have here is pgss_ExecutorEnd stomping on the plan cache. I do *not* find that acceptable. At the very least, it ruins the theory that this field is a shared resource. What we could/should do instead, I think, is have pgss_planner_hook make its own pgss_store() call to log the planning time results (which would mean we don't need the added PlannedStmt field at all). That would increase the overhead of this feature, which might mean that it'd be worth having a pg_stat_statements GUC to enable it. But it'd be a whole lot cleaner. regards, tom lane
Re: [HACKERS] Planning counters in pg_stat_statements
Haribabu Kommiwrites: > I checked the latest patch and it is working fine and I don't have any > further comments. Marked the patch as "ready for committer". I started to look at this patch, and I'm not entirely convinced whether it is a good thing for the planner_hook to bump nested_level. ISTM that will change behavior in some ways, in particular when the planner chooses to evaluate an immutable or stable function that runs the executor (probably via SPI). Before, that execution would have been regarded as a top-level call, now it will not be. Maybe that's fine, but did anyone think hard about it? A possible alternative behavior is for planner_hook to maintain its own nesting depth counter, separate from the one for execution nesting depth. I'm not sure if that's better or not. Discuss ... regards, tom lane
Re: [HACKERS] Planning counters in pg_stat_statements
On Thu, Jan 11, 2018 at 10:00 PM, Thomas Munro < thomas.mu...@enterprisedb.com> wrote: > On Thu, Jan 11, 2018 at 7:36 PM, Haribabu Kommi >wrote: > > > +OUT plans int8, > > > > Addition of this column is good to find out how many time the plan is > > generated > > for the same query. But I am thinking "plans" column name may confuse? > > What else would you call that? It's the number of plans that have > been created. It sits nicely beside "calls" don't you think? I could > change it to "plan_count" but that's longer and not like "calls". > OK. I checked the latest patch and it is working fine and I don't have any further comments. Marked the patch as "ready for committer". Regards, Hari Babu Fujitsu Australia
Re: [HACKERS] Planning counters in pg_stat_statements
On Tue, Nov 7, 2017 at 4:10 PM, Thomas Munrowrote: > Hi hackers, > > I have often wanted $SUBJECT and was happy to find that Fujii-san had > posted a patch five years ago[1]. The reception then seemed positive. > So here is a refurbished and (hopefully) improved version of his patch > with a new column for the replan count. Thoughts? > Thanks for the patch. -DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \ +DATA = pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \ The current version of the pg_stat_statements is already 1.5, this patch should increase it to 1.6. +/* contrib/pg_stat_statements/pg_stat_statements--1.5.sql */ Add the contents of the pg_stat_statements--1.4--1.5.sql file here, otherwise when the user installs this as a new extension, he will loose those changes. +OUT plans int8, Addition of this column is good to find out how many time the plan is generated for the same query. But I am thinking "plans" column name may confuse? +OUT plan_time float8, I am fine with the current name also, just checking how about "planning_time" similar like EXPLAIN? + + plan_time + double precision + + Total time spent planning, in milliseconds + How about "Total time spent in the statement planning"? And also it needs an update for the total_time column as "Total time spent in the statement execution". + /* * PlannedStmt node * Extra change. Regards, Hari Babu Fujitsu Australia