Re: [HACKERS] Planning counters in pg_stat_statements

2018-05-16 Thread Dmitry Ivanov

I am not actively working on this now, but I'll come back to it for
PG12 if you or Lukas don't beat me to it, and I'll help/test/review if
I you do.  It seems there is plenty of demand for the feature and I'll
be very happy to see it.


Good to know, thanks!

I'd argue that it might be better to add a new argument to 
pg_plan_query()

and pg_plan_queries() and a new field to QueryDesc, i.e.:

PlannedStmt *
pg_plan_query(Query *querytree,
  int cursorOptions,
  ParamListInfo boundParams,
  double *planningTime)

List *
pg_plan_queries(List *querytrees,
int cursorOptions,
ParamListInfo boundParams,
double *planningTime) /* total time as 
in

BuildCachedPlan() */

The measured time can later be passed to QueryDesc via 
PortalDefineQuery().
Of course, this requires more changes, but the result might be worth 
it.


What do you think?


So who does the actual timing work in this model?  Does the core code
do the timing, but it'd be disabled by default because NULL is usually
passed in, and you need to register an extension that provides a place
to stick the result in order to turn on the time-measuring code?  If
you mean that it's still the extension that does the timing, it seems
strange to have struct members and parameter arguments for something
specific that the core code doesn't understand.


Right, I guess my proposal was too brief. Yes, my idea's that the core 
code

should do the timing, which might be disabled by passing NULL to those
functions. However, given that there's lots of people out there who use
pg_stat_statements on a regular basis, it might be meaningful to just
turn it on unconditionally (my assumptions might be wrong, of course).
Alternatively, we could introduce a new GUC variable to disable this 
feature.


The extensions would no longer have to do the accounting.


As a more general version of that, I wondered about having some kind
of associative data structure (hash table, assoc list, etc) that would
somehow travel everywhere with the PlannedStmt, but not inside it,
just for the use of extensions.  Then planning time could be stored in
there by a planner hook, and the fished out by any other hook that
knows the same key (not sure how you manage that key space but I'm
sure you could come up with something).  That could have uses for
other extensions too, and could also be used for the "query ID", which
is, after all, the model that the abandoned planning_time member was
following.  Just a thought.


Perhaps we could change planner() so that it would return pointer to 
some
struct holding PlannedStmt and a List of some nodes or structs for 
accounting.


--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Planning counters in pg_stat_statements

2018-05-15 Thread Thomas Munro
Hi Dmitry,

On Tue, May 15, 2018 at 11:10 PM, Dmitry Ivanov  wrote:
> Is anybody still working on this? Are there any plans to add this to
> commitfest?

I am not actively working on this now, but I'll come back to it for
PG12 if you or Lukas don't beat me to it, and I'll help/test/review if
I you do.  It seems there is plenty of demand for the feature and I'll
be very happy to see it.

> I'd like to add planning time to auto_explain, and it turns out that this
> patch is somewhat relevant to that feature.

Sounds very useful.

> The current approach here is to set planning_time in PlannedStmt via
> planner_hook, which (in my opinion) has several flaws:
>
> 1. Depending on the order of extensions in shared_preload_libraries, it
> might not measure time spent on preceding planner hooks.
>
> 2. Provided that there are multiple users of this metric, it might become a
> little too costy to register several hooks with identical purpose.
>
> 3. [Bikeshedding] Although planning time is stored in PlannedStmt, it's
> definitely not an inherent property of a plan. You could have two machines
> with identical settings but quite different planning times due to various
> circumstances (raw CPU power, I/O etc).

Yeah.  Putting that inside the PlannedStmt wasn't very nice.  Another
thing I tried was to have some opaque extension workspace, but that
runs into a number of technical problem.  That idea is definitely
dead.

> I'd argue that it might be better to add a new argument to pg_plan_query()
> and pg_plan_queries() and a new field to QueryDesc, i.e.:
>
> PlannedStmt *
> pg_plan_query(Query *querytree,
>   int cursorOptions,
>   ParamListInfo boundParams,
>   double *planningTime)
>
> List *
> pg_plan_queries(List *querytrees,
> int cursorOptions,
> ParamListInfo boundParams,
> double *planningTime) /* total time as in
> BuildCachedPlan() */
>
> The measured time can later be passed to QueryDesc via PortalDefineQuery().
> Of course, this requires more changes, but the result might be worth it.
>
> What do you think?

So who does the actual timing work in this model?  Does the core code
do the timing, but it'd be disabled by default because NULL is usually
passed in, and you need to register an extension that provides a place
to stick the result in order to turn on the time-measuring code?  If
you mean that it's still the extension that does the timing, it seems
strange to have struct members and parameter arguments for something
specific that the core code doesn't understand.

As a more general version of that, I wondered about having some kind
of associative data structure (hash table, assoc list, etc) that would
somehow travel everywhere with the PlannedStmt, but not inside it,
just for the use of extensions.  Then planning time could be stored in
there by a planner hook, and the fished out by any other hook that
knows the same key (not sure how you manage that key space but I'm
sure you could come up with something).  That could have uses for
other extensions too, and could also be used for the "query ID", which
is, after all, the model that the abandoned planning_time member was
following.  Just a thought.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Planning counters in pg_stat_statements

2018-05-15 Thread Dmitry Ivanov

Hi,

Is anybody still working on this? Are there any plans to add this to 
commitfest?


I'd like to add planning time to auto_explain, and it turns out that 
this patch is somewhat relevant to that feature.


The current approach here is to set planning_time in PlannedStmt via 
planner_hook, which (in my opinion) has several flaws:


1. Depending on the order of extensions in shared_preload_libraries, it 
might not measure time spent on preceding planner hooks.


2. Provided that there are multiple users of this metric, it might 
become a little too costy to register several hooks with identical 
purpose.


3. [Bikeshedding] Although planning time is stored in PlannedStmt, it's 
definitely not an inherent property of a plan. You could have two 
machines with identical settings but quite different planning times due 
to various circumstances (raw CPU power, I/O etc).


I'd argue that it might be better to add a new argument to 
pg_plan_query() and pg_plan_queries() and a new field to QueryDesc, 
i.e.:


PlannedStmt *
pg_plan_query(Query *querytree,
  int cursorOptions,
  ParamListInfo boundParams,
  double *planningTime)

List *
pg_plan_queries(List *querytrees,
int cursorOptions,
ParamListInfo boundParams,
double *planningTime) /* total time as in 
BuildCachedPlan() */

The measured time can later be passed to QueryDesc via 
PortalDefineQuery(). Of course, this requires more changes, but the 
result might be worth it.


What do you think?

--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Planning counters in pg_stat_statements

2018-03-30 Thread Thomas Munro
On Sat, Mar 31, 2018 at 8:04 AM, Lukas Fittl  wrote:
> On Tue, Jan 23, 2018 at 3:31 PM, Tom Lane  wrote:
>>
>> > Unfortunately I'm not going to have bandwidth to figure this out
>> > during this commitfest due to other priorities so I'm going to call
>> > this "returned with feedback".
>>
>> OK.  There's still time to get it done in the March 'fest.
>
>
> I've had an interesting incident earlier this week where seeing planning
> time in pg_stat_statements would have been very helpful to determine the
> root cause more quickly.
>
> Essentially the situation was a statement that executed < 1ms but took more
> than 250ms to plan, running often - once we found the statement (using
> pg_stat_activity sampling) we were able to fix quickly by rewriting the
> query and reduce system load from ~8 to ~1. Having this patch in
> pg_stat_statements would have been critical to get the full picture of what
> was going on earlier.
>
> Thomas: I'm not as familiar with planner internals as you are, but happy to
> try and contribute here. Would it be useful for me to try to adjust the
> patch to Tom's proposal?

Hi Lukas,

Please feel free!  I'm sure this will be a useful feature.
Unfortunately I wasn't able to get back to it in time for PG11.  It
did make an attempt though, and got stuck on some confusion about
whether you want the PREPARE or the "real" statement when using
prepared plans;  I don't recall the details now.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Planning counters in pg_stat_statements

2018-03-30 Thread Lukas Fittl
On Tue, Jan 23, 2018 at 3:31 PM, Tom Lane  wrote:
>
> > Unfortunately I'm not going to have bandwidth to figure this out
> > during this commitfest due to other priorities so I'm going to call
> > this "returned with feedback".
>
> OK.  There's still time to get it done in the March 'fest.
>

I've had an interesting incident earlier this week where seeing planning
time in pg_stat_statements would have been very helpful to determine the
root cause more quickly.

Essentially the situation was a statement that executed < 1ms but took more
than 250ms to plan, running often - once we found the statement (using
pg_stat_activity sampling) we were able to fix quickly by rewriting the
query and reduce system load from ~8 to ~1. Having this patch in
pg_stat_statements would have been critical to get the full picture of what
was going on earlier.

Thomas: I'm not as familiar with planner internals as you are, but happy to
try and contribute here. Would it be useful for me to try to adjust the
patch to Tom's proposal?

Best,
Lukas

-- 
Lukas Fittl


Re: [HACKERS] Planning counters in pg_stat_statements

2018-01-23 Thread Tom Lane
Thomas Munro  writes:
> One problem is that pgss_planner_hook doesn't have the source query
> text.  That problem could be solved by adding a query_string argument
> to the planner hook function type and also planner(),
> standard_planner(), pg_plan_query(), pg_plan_queries().  I don't know
> if that change would be acceptable or if there is a better way that
> doesn't change extern functions that will annoy extension owners.

Within the planner, I'd be inclined to store the query string pointer in
PlannerGlobal (which is accessible from PlannerInfo "root"), but I'm
not sure how many of the functions you mention would still need an
explicit signature change.  Anyway that doesn't particularly bother
me --- it's something that might need to happen anyway, if we ever
hope to throw errors with location pointers from inside the planner.

> Something I wondered about but didn't try: we could skip the above
> problem AND the extra pgss_store() by instead pushing (query ID,
> planning time) into a backend-private buffer and then later pushing it
> into shmem when we eventually call pgss_store() for the execution
> counters.

Meh.  Part of the reason I don't like what you submitted before is that
it supposes there's a mostly one-to-one relationship between planner calls
and executor calls; which there is not, when you start considering edge
cases like prepared statements.  A global would make that worse.

> Unfortunately I'm not going to have bandwidth to figure this out
> during this commitfest due to other priorities so I'm going to call
> this "returned with feedback".

OK.  There's still time to get it done in the March 'fest.

regards, tom lane



Re: [HACKERS] Planning counters in pg_stat_statements

2018-01-23 Thread Thomas Munro
On Fri, Jan 19, 2018 at 12:14 AM, Thomas Munro
 wrote:
> On Sat, Jan 13, 2018 at 2:16 PM, Tom Lane  wrote:
>> What we could/should do instead, I think, is have pgss_planner_hook
>> make its own pgss_store() call to log the planning time results
>> (which would mean we don't need the added PlannedStmt field at all).
>> That would increase the overhead of this feature, which might mean
>> that it'd be worth having a pg_stat_statements GUC to enable it.
>> But it'd be a whole lot cleaner.
>
> Ok.  It seems a bit strange to have to go through the locking twice,
> but I don't have a better idea.  First attempt seems to be working.

One problem is that pgss_planner_hook doesn't have the source query
text.  That problem could be solved by adding a query_string argument
to the planner hook function type and also planner(),
standard_planner(), pg_plan_query(), pg_plan_queries().  I don't know
if that change would be acceptable or if there is a better way that
doesn't change extern functions that will annoy extension owners.
When I tried that it basically worked except for a problem with
"PREPARE ... " (what we want to show) vs "" (what my experimental patch now shows -- oops).

Something I wondered about but didn't try: we could skip the above
problem AND the extra pgss_store() by instead pushing (query ID,
planning time) into a backend-private buffer and then later pushing it
into shmem when we eventually call pgss_store() for the execution
counters.  If you think of that as using global variables to pass
state between functions just because we didn't structure our program
right it sounds terrible, but if you think of it as a dtrace-style
per-thread tracing event buffer that improves performance by batching
up collected data, it sounds great :-D

Unfortunately I'm not going to have bandwidth to figure this out
during this commitfest due to other priorities so I'm going to call
this "returned with feedback".

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Planning counters in pg_stat_statements

2018-01-18 Thread Thomas Munro
On Sat, Jan 13, 2018 at 2:16 PM, Tom Lane  wrote:
> I wrote:
>> Haribabu Kommi  writes:
>>> I checked the latest patch and it is working fine and I don't have any
>>> further comments. Marked the patch as "ready for committer".
>
>> I started to look at this patch,

Thanks!

> ... looking further, I'm really seriously unhappy about this bit:
>
> @@ -943,9 +1006,16 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
> ...
> +
> +   /*
> +* Clear planning_time, so that we only count it once for each
> +* replanning of a prepared statement.
> +*/
> +   queryDesc->plannedstmt->planning_time = 0;
> }
>
> What we have here is pgss_ExecutorEnd stomping on the plan cache.
> I do *not* find that acceptable.  At the very least, it ruins the
> theory that this field is a shared resource.

The idea was that planning_time is work space for extensions to do
whatever they like with, but objection noted.

> What we could/should do instead, I think, is have pgss_planner_hook
> make its own pgss_store() call to log the planning time results
> (which would mean we don't need the added PlannedStmt field at all).
> That would increase the overhead of this feature, which might mean
> that it'd be worth having a pg_stat_statements GUC to enable it.
> But it'd be a whole lot cleaner.

Ok.  It seems a bit strange to have to go through the locking twice,
but I don't have a better idea.  First attempt seems to be working.
I'll post an updated patch in a couple of days when I'm back from
travelling and have a tidier version with a new GUC and have thought
about the nested_level question.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Planning counters in pg_stat_statements

2018-01-12 Thread Tom Lane
I wrote:
> Haribabu Kommi  writes:
>> I checked the latest patch and it is working fine and I don't have any
>> further comments. Marked the patch as "ready for committer".

> I started to look at this patch,

... looking further, I'm really seriously unhappy about this bit:

@@ -943,9 +1006,16 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
...
+
+   /*
+* Clear planning_time, so that we only count it once for each
+* replanning of a prepared statement.
+*/
+   queryDesc->plannedstmt->planning_time = 0;
}

What we have here is pgss_ExecutorEnd stomping on the plan cache.
I do *not* find that acceptable.  At the very least, it ruins the
theory that this field is a shared resource.

What we could/should do instead, I think, is have pgss_planner_hook
make its own pgss_store() call to log the planning time results
(which would mean we don't need the added PlannedStmt field at all).
That would increase the overhead of this feature, which might mean
that it'd be worth having a pg_stat_statements GUC to enable it.
But it'd be a whole lot cleaner.

regards, tom lane



Re: [HACKERS] Planning counters in pg_stat_statements

2018-01-12 Thread Tom Lane
Haribabu Kommi  writes:
> I checked the latest patch and it is working fine and I don't have any
> further comments. Marked the patch as "ready for committer".

I started to look at this patch, and I'm not entirely convinced whether
it is a good thing for the planner_hook to bump nested_level.  ISTM
that will change behavior in some ways, in particular when the planner
chooses to evaluate an immutable or stable function that runs the
executor (probably via SPI).  Before, that execution would have been
regarded as a top-level call, now it will not be.  Maybe that's fine,
but did anyone think hard about it?

A possible alternative behavior is for planner_hook to maintain its
own nesting depth counter, separate from the one for execution
nesting depth.  I'm not sure if that's better or not.

Discuss ...

regards, tom lane



Re: [HACKERS] Planning counters in pg_stat_statements

2018-01-11 Thread Haribabu Kommi
On Thu, Jan 11, 2018 at 10:00 PM, Thomas Munro <
thomas.mu...@enterprisedb.com> wrote:

> On Thu, Jan 11, 2018 at 7:36 PM, Haribabu Kommi
>  wrote:
>
> > +OUT plans int8,
> >
> > Addition of this column is good to find out how many time the plan is
> > generated
> > for the same query. But I am thinking "plans" column name may confuse?
>
> What else would you call that?  It's the number of plans that have
> been created.  It sits nicely beside "calls" don't you think?  I could
> change it to "plan_count" but that's longer and not like "calls".
>

OK.

I checked the latest patch and it is working fine and I don't have any
further comments. Marked the patch as "ready for committer".

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Planning counters in pg_stat_statements

2018-01-10 Thread Haribabu Kommi
On Tue, Nov 7, 2017 at 4:10 PM, Thomas Munro 
wrote:

> Hi hackers,
>
> I have often wanted $SUBJECT and was happy to find that Fujii-san had
> posted a patch five years ago[1].  The reception then seemed positive.
> So here is a refurbished and (hopefully) improved version of his patch
> with a new column for the replan count.  Thoughts?
>

Thanks for the patch.

-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
+DATA = pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \

The current version of the pg_stat_statements is already 1.5, this patch
should increase it to 1.6.

+/* contrib/pg_stat_statements/pg_stat_statements--1.5.sql */

Add the contents of the pg_stat_statements--1.4--1.5.sql file here,
otherwise
when the user installs this as a new extension, he will loose those changes.

+OUT plans int8,

Addition of this column is good to find out how many time the plan is
generated
for the same query. But I am thinking "plans" column name may confuse?

+OUT plan_time float8,

I am fine with the current name also, just checking how about
"planning_time"
similar like EXPLAIN?


+ 
+  plan_time
+  double precision
+  
+  Total time spent planning, in milliseconds
+ 

How about "Total time spent in the statement planning"?

And also it needs an update for the total_time column as
"Total time spent in the statement execution".

+
 /* 
  * PlannedStmt node
  *

Extra change.


Regards,
Hari Babu
Fujitsu Australia