Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

2017-03-04 Thread Julian Markwort
Alright, for the next version of this patch I'll look into standard 
deviation (an implementation of Welfords' algorithm already exists in 
pg_stat_statements).


On 3/4/17 14:18, Peter Eisentraut wrote:


The other problem is that this measures execution time, which can vary
for reasons other than plan.  I would have expected that the cost
numbers are tracked somehow.
I've already thought of tracking specific parts of the explanation, like 
the cost numbers, instead of the whole string, I'll think of something, 
but if anybody has any bright ideas in the meantime, I'd gladly listen 
to them.



There is also the issue of generic vs specific plans, which this
approach might be papering over.
Would you be so kind and elaborate a little bit on this? I'm not sure if 
I understand this correctly. This patch only tracks specific plans, yes. 
The inital idea was that there might be some edge-cases that are not 
apparent when looking at generalized plans or queries.


kind regards
Julian


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

2017-03-04 Thread David Steele
Hi Julian,

On 3/4/17 8:41 AM, Julian Markwort wrote:
>> Any improvements would be greatly welcome by the admin
>> community, I'm sure.
> That's good to hear - on the other hand, I really appreciate the opinion
> of admins on this idea!
>> However, this is a rather large change which could be destabilizing to
>> the many users of this extension.
> I'm fully aware of that, which is why there are already several switches
> in place so you can keep using the existing functionality without
> compromises or added complexity.
> At the same time, I'm always open to suggestions regarding the reduction
> of complexity and probably more importantly the reduction of disk IO.
>> I recommend moving this patch to the 2017-07 CF.
> Since I do not have very much time for this at the moment I'll be
> looking forward to discussions on this patch in the next commitfest!

Since some concerns were raised about the implementation, I have instead
marked this "Returned with Feedback".  I encourage you to continue
developing the patch with the community and resubmit into the
appropriate CF when it is ready.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

2017-03-04 Thread Julian Markwort

Any improvements would be greatly welcome by the admin
community, I'm sure.
That's good to hear - on the other hand, I really appreciate the opinion 
of admins on this idea!

However, this is a rather large change which could be destabilizing to
the many users of this extension.
I'm fully aware of that, which is why there are already several switches 
in place so you can keep using the existing functionality without 
compromises or added complexity.
At the same time, I'm always open to suggestions regarding the reduction 
of complexity and probably more importantly the reduction of disk IO.

I recommend moving this patch to the 2017-07 CF.
Since I do not have very much time for this at the moment I'll be 
looking forward to discussions on this patch in the next commitfest!


kind regards
Julian


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

2017-03-04 Thread Peter Eisentraut
On 1/25/17 12:43, Simon Riggs wrote:
> On 25 January 2017 at 17:34, Julian Markwort
>  wrote:
> 
>> Analogous to this, a bad_plan is saved, when the time has been exceeded by a
>> factor greater than 1.1 .
> ...and the plan differs?
> 
> Probably best to use some stat math to calculate deviation, rather than fixed 
> %.

Yeah, it seems to me too that this needs a bit more deeper analysis.  I
don't see offhand why a 10% deviation in execution time would be a
reasonable threshold for "good" or "bad".  A deviation approach like you
allude to would be better.

The other problem is that this measures execution time, which can vary
for reasons other than plan.  I would have expected that the cost
numbers are tracked somehow.

There is also the issue of generic vs specific plans, which this
approach might be papering over.

Needs more thought.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

2017-03-03 Thread David Steele
Hi Julian,

On 1/25/17 12:34 PM, Julian Markwort wrote:

> TL:DR;
> We extended the functionality of pg_stat_statements so it can track
> worst and best case execution plans.

pg_stat_statements is an important tool and perhaps one of the most used
core extensions.  Any improvements would be greatly welcome by the admin
community, I'm sure.

However, this is a rather large change which could be destabilizing to
the many users of this extension.  Even though the patch was posted well
in advance of the last CF it has received little discussion so is
essentially new.

I recommend moving this patch to the 2017-07 CF.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans

2017-01-25 Thread Simon Riggs
On 25 January 2017 at 17:34, Julian Markwort
 wrote:

> Analogous to this, a bad_plan is saved, when the time has been exceeded by a
> factor greater than 1.1 .

...and the plan differs?

Probably best to use some stat math to calculate deviation, rather than fixed %.

Sounds good.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [FEATURE PATCH] pg_stat_statements with plans

2017-01-25 Thread Julian Markwort

Hello psql-hackers!

TL:DR;
We extended the functionality of pg_stat_statements so it can track 
worst and best case execution plans.


Based on a suggestion of my colleague Arne Scheffer, Marius Timmer and I 
extended pg_stat_statements so it can also record execution plans, 
whenever the execution time is exceeded (or deceeded) by a definable 
factor.
We were largely inspired by the pg_stat_plans extension by Peter 
Geoghegan and Simon Riggs - we don't claim any originality on this part 
- which is unfortunately not available on newer postgresql versions. 
There are a few differences which will become apparent in the following 
lines.


By default, the modified pg_stat_statements extension will now track 
good plans and bad plans for each entry in pg_stat_statements.
The plans are not normalized or hashed (as opposed to pg_stat_plans), 
they represent discreet statements.
A good plan is saved, whenever this sort of query has been used for the 
first time or the time of the previously recorded good plan has been 
deceeded by a smaller factor than 0.9 .
Analogous to this, a bad_plan is saved, when the time has been exceeded 
by a factor greater than 1.1 .
There are GUCs available so these parameters can be tuned to your 
liking. Tracking can be disabled for both plans individually.
A plan_format can be defined to enable better readability or 
processability through other tools.


You can reset your good and bad plans by using a
select on pg_stat_statements_good_plan_reset([queryid]);
resetting bad plans uses pg_stat_statements_bad_plan_reset, obviously.
In case of a reset, the execution time, timestamp and plan itself are 
just set to 0 respective NULL.


The pg_stat_statements view now provides six extra columns:
good_plan, good_plan_time, good_plan_timestamp, bad_plan, bad_plan_time 
and bad_plan_timestamp.


Plans are only displayed if the showtext argument is true and the user 
is the superuser or the user who has been associated with that entry.


Furthermore, we implemented a GUC that allows you to control the maximum 
refresh frequency to avoid performance impacts on restarts or resets.
A plan is only updated when tracking is enabled and more time than 
"plan_min_interval" has passed (default: 5 seconds) and the previously 
mentioned conditions for the execution time have been met.


The major selling point of this feature?
Beeing able to find plans that need optimization (e.g. by creating 
indexes). As pg_stat_statements tracks normalized queries, there might 
be certain values or even daytimes that result in very bad plans, while 
others result in perfectly fine plans.
Of course, the GUC log_min_duration_statement can also detect long 
runners, but the advantage of pg_stat_statements is that we count the 
total calls of normalized queries, which enables us to find plans, that 
don't count as long runners, while their aggregated time might show 
shortcomings regarding their plans.


We've found this sort of tool really useful when dealing with queries 
produced by ORM libraries, where optimization is not intuitive.


Various tests using pg_bench suggest that this extension does not worsen 
the performance of the database.


We're really looking forward to your opinions and feedback on this 
feature patch

Julian, Marius and Arne
diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 298951a..2a22eb5 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -4,7 +4,7 @@ MODULE_big = pg_stat_statements
 OBJS = pg_stat_statements.o $(WIN32RES)
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.3--1.4.sql \
+DATA = pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql pg_stat_statements--1.3--1.4.sql \
 	pg_stat_statements--1.2--1.3.sql pg_stat_statements--1.1--1.2.sql \
 	pg_stat_statements--1.0--1.1.sql pg_stat_statements--unpackaged--1.0.sql
 PGFILEDESC = "pg_stat_statements - execution statistics of SQL statements"
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 6edc3d9..a3cfe6d 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -61,7 +61,9 @@
 #include 
 #include 
 
+#include "utils/timestamp.h"
 #include "access/hash.h"
+#include "commands/explain.h"
 #include "executor/instrument.h"
 #include "funcapi.h"
 #include "mb/pg_wchar.h"
@@ -118,7 +120,8 @@ typedef enum pgssVersion
 	PGSS_V1_0 = 0,
 	PGSS_V1_1,
 	PGSS_V1_2,
-	PGSS_V1_3
+	PGSS_V1_3,
+	PGSS_V1_5
 } pgssVersion;
 
 /*
@@ -159,6 +162,14 @@ typedef struct Counters
 	double		usage;			/* usage factor */
 } Counters;
 
+typedef struct pgssPlan
+{
+	Size offset;
+	int len;
+	double		time;	/* execution time in msec when the latest plan was updated */
+	TimestampTz timestamp;
+} pgssPlan;
+
 /*
  * Statistics per statement
  *
@@ -172,6 +183,8 @@ typedef struct pgssEntry
 	Counters	counters;