Re: [HACKERS] pg_stat_statements and planning time

2012-03-08 Thread Robert Haas
On Wed, Mar 7, 2012 at 9:59 PM, Fujii Masao masao.fu...@gmail.com wrote:
 I'd like to have the planning time in a number of other places as
 well, such as EXPLAIN, and maybe statistics views.

 +1 for EXPLAIN. But which statistics views are in your mind?

I don't know.  I'm not sure if it's interesting to be able to count
planning time vs. execution time on a server-wide basis, or even a
per-database basis, but it seems like it might be, if we can do it
cheaply.  Then again, considering that gettimeofday is kinda
expensive, I suppose that would have to be optional if we were to have
it at all.  Just thinking out loud, mostly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements and planning time

2012-03-08 Thread Peter Geoghegan
On 8 March 2012 13:09, Robert Haas robertmh...@gmail.com wrote:
 Then again, considering that gettimeofday is kinda
 expensive, I suppose that would have to be optional if we were to have
 it at all.

+1. I'm not opposed to having such a mechanism, but it really ought to
impose exactly no overhead on the common case where we don't
particularly care about plan time.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements and planning time

2012-03-08 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 On 8 March 2012 13:09, Robert Haas robertmh...@gmail.com wrote:
  Then again, considering that gettimeofday is kinda
 expensive, I suppose that would have to be optional if we were to have
 it at all.

 +1. I'm not opposed to having such a mechanism, but it really ought to
 impose exactly no overhead on the common case where we don't
 particularly care about plan time.

I thought the proposal was to add it to (1) pg_stat_statement and (2)
EXPLAIN, both of which are not in the normal code execution path.
pg_stat_statement is already a drag on a machine with slow gettimeofday,
but it's not clear why users of it would think that two gettimeofday's
per query are acceptable and four are not.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements and planning time

2012-03-08 Thread Peter Geoghegan
On 8 March 2012 14:44, Tom Lane t...@sss.pgh.pa.us wrote:
 I thought the proposal was to add it to (1) pg_stat_statement and (2)
 EXPLAIN, both of which are not in the normal code execution path.
 pg_stat_statement is already a drag on a machine with slow gettimeofday,
 but it's not clear why users of it would think that two gettimeofday's
 per query are acceptable and four are not.

To be clear, I don't see any reasons to not just have it within
EXPLAIN output under all circumstances.

pg_stat_statements will slow down query execution, but I see no reason
to force users to pay for something that they may well not want by not
including an 'off' switch for this additional instrumentation, given
that it doubles the number of gettimeofdays. I'm not particularly
concerned about platforms with slow gettimeofdays. I'm concerned with
keeping the overhead of running pg_stat_statements as low as possible
generally.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_stat_statements and planning time

2012-03-07 Thread Fujii Masao
Hi,

pg_stat_statements is basically very helpful to find out slow queries.
But since it doesn't report the time spent in the planner, we cannot
find out slow queries which take most time to do query planning, from
pg_stat_statements. Is there any reason why pg_stat_statements doesn't
collect the planning time?

Attached patch extends pg_stat_statements so that it reports the
planning time. Thought?

In the patch, I didn't change the column name total_time meaning
the time spent in the executor because of the backward compatibility.
But once new column plan_time is added, total_time is confusing and
ISTM it should be renamed...

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
*** a/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql
--- b/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql
***
*** 17,22  CREATE FUNCTION pg_stat_statements(
--- 17,23 
  OUT dbid oid,
  OUT query text,
  OUT calls int8,
+ OUT plan_time float8,
  OUT total_time float8,
  OUT rows int8,
  OUT shared_blks_hit int8,
*** a/contrib/pg_stat_statements/pg_stat_statements--1.1.sql
--- b/contrib/pg_stat_statements/pg_stat_statements--1.1.sql
***
*** 14,19  CREATE FUNCTION pg_stat_statements(
--- 14,20 
  OUT dbid oid,
  OUT query text,
  OUT calls int8,
+ OUT plan_time float8,
  OUT total_time float8,
  OUT rows int8,
  OUT shared_blks_hit int8,
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
***
*** 1,7 
  /*-
   *
   * pg_stat_statements.c
!  *		Track statement execution times across a whole database cluster.
   *
   * Note about locking issues: to create or delete an entry in the shared
   * hashtable, one must hold pgss-lock exclusively.  Modifying any field
--- 1,7 
  /*-
   *
   * pg_stat_statements.c
!  *		Track statement planning and execution times across a whole database cluster.
   *
   * Note about locking issues: to create or delete an entry in the shared
   * hashtable, one must hold pgss-lock exclusively.  Modifying any field
***
*** 27,32 
--- 27,33 
  #include funcapi.h
  #include mb/pg_wchar.h
  #include miscadmin.h
+ #include optimizer/planner.h
  #include pgstat.h
  #include storage/fd.h
  #include storage/ipc.h
***
*** 73,78  typedef struct pgssHashKey
--- 74,80 
  typedef struct Counters
  {
  	int64		calls;			/* # of times executed */
+ 	double		plan_time;		/* total planning time in seconds */
  	double		total_time;		/* total execution time in seconds */
  	int64		rows;			/* total # of retrieved or affected rows */
  	int64		shared_blks_hit;	/* # of shared buffer hits */
***
*** 118,123  static int	nested_level = 0;
--- 120,126 
  
  /* Saved hook values in case of unload */
  static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+ static planner_hook_type prev_planner_hook = NULL;
  static ExecutorStart_hook_type prev_ExecutorStart = NULL;
  static ExecutorRun_hook_type prev_ExecutorRun = NULL;
  static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
***
*** 168,173  PG_FUNCTION_INFO_V1(pg_stat_statements);
--- 171,178 
  
  static void pgss_shmem_startup(void);
  static void pgss_shmem_shutdown(int code, Datum arg);
+ static PlannedStmt *pgss_planner(Query *parse, int cursorOptions,
+  ParamListInfo boundParams);
  static void pgss_ExecutorStart(QueryDesc *queryDesc, int eflags);
  static void pgss_ExecutorRun(QueryDesc *queryDesc,
   ScanDirection direction,
***
*** 179,186  static void pgss_ProcessUtility(Node *parsetree,
  	DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static void pgss_store(const char *query, double total_time, uint64 rows,
! 		   const BufferUsage *bufusage);
  static Size pgss_memsize(void);
  static pgssEntry *entry_alloc(pgssHashKey *key);
  static void entry_dealloc(void);
--- 184,191 
  	DestReceiver *dest, char *completionTag);
  static uint32 pgss_hash_fn(const void *key, Size keysize);
  static int	pgss_match_fn(const void *key1, const void *key2, Size keysize);
! static void pgss_store(const char *query, double plan_time, double total_time,
! 		   uint64 rows, const BufferUsage *bufusage);
  static Size pgss_memsize(void);
  static pgssEntry *entry_alloc(pgssHashKey *key);
  static void entry_dealloc(void);
***
*** 269,274  _PG_init(void)
--- 274,281 
  	 */
  	prev_shmem_startup_hook = shmem_startup_hook;
  	shmem_startup_hook = pgss_shmem_startup;
+ 	prev_planner_hook = planner_hook;
+ 	planner_hook = 

Re: [HACKERS] pg_stat_statements and planning time

2012-03-07 Thread Simon Riggs
On Wed, Mar 7, 2012 at 11:45 AM, Fujii Masao masao.fu...@gmail.com wrote:

 Attached patch extends pg_stat_statements so that it reports the
 planning time. Thought?

If we successfully aggregate SQL in the current patch then this might
be useful as well. Until we do that it's not much use.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements and planning time

2012-03-07 Thread Fujii Masao
On Wed, Mar 7, 2012 at 9:00 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Wed, Mar 7, 2012 at 11:45 AM, Fujii Masao masao.fu...@gmail.com wrote:

 Attached patch extends pg_stat_statements so that it reports the
 planning time. Thought?

 If we successfully aggregate SQL in the current patch then this might
 be useful as well. Until we do that it's not much use.

You mean pg_stat_statements normalization patch? Yes, if it will be applied,
this patch would be more useful. But without the normalization patch, this patch
would be useful for a user who use the extended protocol but not plan cache
mechanism, I think.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements and planning time

2012-03-07 Thread Robert Haas
On Wed, Mar 7, 2012 at 6:45 AM, Fujii Masao masao.fu...@gmail.com wrote:
 pg_stat_statements is basically very helpful to find out slow queries.
 But since it doesn't report the time spent in the planner, we cannot
 find out slow queries which take most time to do query planning, from
 pg_stat_statements. Is there any reason why pg_stat_statements doesn't
 collect the planning time?

 Attached patch extends pg_stat_statements so that it reports the
 planning time. Thought?

I think this is an interesting idea, but I think it's too late for 9.2.

I'd like to have the planning time in a number of other places as
well, such as EXPLAIN, and maybe statistics views.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements and planning time

2012-03-07 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 In the patch, I didn't change the column name total_time meaning
 the time spent in the executor because of the backward compatibility.
 But once new column plan_time is added, total_time is confusing and
 ISTM it should be renamed...

Well, if we were tracking planning time, what I would expect
total_time to mean is plan time plus execution time.  Should it be
redefined that way, instead of renaming it?

Another point here is that because of plan caching, the number of
planner invocations could be quite different from the number of executor
runs.  It's not clear to me whether this will confuse matters for
pg_stat_statements, but it's something to think about.  Will it be
possible to tell whether a particular statement is hugely expensive to
plan but we don't do that often, versus cheap to plan but we do that a
lot?  IOW I am wondering if we need to track the number of invocations
as well as total time.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements and planning time

2012-03-07 Thread Daniel Farina
On Wed, Mar 7, 2012 at 8:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 In the patch, I didn't change the column name total_time meaning
 the time spent in the executor because of the backward compatibility.
 But once new column plan_time is added, total_time is confusing and
 ISTM it should be renamed...

 Well, if we were tracking planning time, what I would expect
 total_time to mean is plan time plus execution time.  Should it be
 redefined that way, instead of renaming it?

I think you are right.

On first glance, a user would only be interested in one number, the
total time.  Most people are hunting queries that happen
(surprisingly) frequently and don't have good indexes in place to
serve them, and in those cases planning time is negligible.  However,
should planning time be the bottleneck then one has to investigate
fewer, smaller queries, a very different solution space.

 Another point here is that because of plan caching, the number of
 planner invocations could be quite different from the number of executor
 runs.  It's not clear to me whether this will confuse matters for
 pg_stat_statements, but it's something to think about.  Will it be
 possible to tell whether a particular statement is hugely expensive to
 plan but we don't do that often, versus cheap to plan but we do that a
 lot?  IOW I am wondering if we need to track the number of invocations
 as well as total time.

I don't think tracking a few more words will do much harm, and could
also do a lot of goodperhaps count, sum, sum(x**2) of planning and
execution?

(Few things to me are more annoying than an unadorned average when it
would have been feasible to take a square of a number).

Thoughts?

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements and planning time

2012-03-07 Thread Fujii Masao
On Thu, Mar 8, 2012 at 12:39 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Mar 7, 2012 at 6:45 AM, Fujii Masao masao.fu...@gmail.com wrote:
 pg_stat_statements is basically very helpful to find out slow queries.
 But since it doesn't report the time spent in the planner, we cannot
 find out slow queries which take most time to do query planning, from
 pg_stat_statements. Is there any reason why pg_stat_statements doesn't
 collect the planning time?

 Attached patch extends pg_stat_statements so that it reports the
 planning time. Thought?

 I think this is an interesting idea, but I think it's too late for 9.2.

Yes. I will add this to the next commitfest. And, in the patch, I changed
pg_stat_statements--1.1.sql, but, for 9.3 I will have to create 1.2.sql instead.

 I'd like to have the planning time in a number of other places as
 well, such as EXPLAIN, and maybe statistics views.

+1 for EXPLAIN. But which statistics views are in your mind?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements and planning time

2012-03-07 Thread Fujii Masao
On Thu, Mar 8, 2012 at 1:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 In the patch, I didn't change the column name total_time meaning
 the time spent in the executor because of the backward compatibility.
 But once new column plan_time is added, total_time is confusing and
 ISTM it should be renamed...

 Well, if we were tracking planning time, what I would expect
 total_time to mean is plan time plus execution time.  Should it be
 redefined that way, instead of renaming it?

Agreed, it's more intuitive for a user.

Along with total_time and plan_time, should we also define exec_time
reporting only the execution time for improvement of usability though it can
be calculated from total_time and plan_time?

 Another point here is that because of plan caching, the number of
 planner invocations could be quite different from the number of executor
 runs.  It's not clear to me whether this will confuse matters for
 pg_stat_statements, but it's something to think about.  Will it be
 possible to tell whether a particular statement is hugely expensive to
 plan but we don't do that often, versus cheap to plan but we do that a
 lot?  IOW I am wondering if we need to track the number of invocations
 as well as total time.

Agreed to add something like plan_count column. This also would be helpful
for e.g., tuning the prepareThreshold parameter in JDBC.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers