Re: [HACKERS] pg_stat_statements and planning time
On Wed, Mar 7, 2012 at 9:59 PM, Fujii Masao masao.fu...@gmail.com wrote: I'd like to have the planning time in a number of other places as well, such as EXPLAIN, and maybe statistics views. +1 for EXPLAIN. But which statistics views are in your mind? I don't know. I'm not sure if it's interesting to be able to count planning time vs. execution time on a server-wide basis, or even a per-database basis, but it seems like it might be, if we can do it cheaply. Then again, considering that gettimeofday is kinda expensive, I suppose that would have to be optional if we were to have it at all. Just thinking out loud, mostly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements and planning time
On 8 March 2012 13:09, Robert Haas robertmh...@gmail.com wrote: Then again, considering that gettimeofday is kinda expensive, I suppose that would have to be optional if we were to have it at all. +1. I'm not opposed to having such a mechanism, but it really ought to impose exactly no overhead on the common case where we don't particularly care about plan time. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements and planning time
Peter Geoghegan pe...@2ndquadrant.com writes: On 8 March 2012 13:09, Robert Haas robertmh...@gmail.com wrote: Then again, considering that gettimeofday is kinda expensive, I suppose that would have to be optional if we were to have it at all. +1. I'm not opposed to having such a mechanism, but it really ought to impose exactly no overhead on the common case where we don't particularly care about plan time. I thought the proposal was to add it to (1) pg_stat_statement and (2) EXPLAIN, both of which are not in the normal code execution path. pg_stat_statement is already a drag on a machine with slow gettimeofday, but it's not clear why users of it would think that two gettimeofday's per query are acceptable and four are not. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements and planning time
On 8 March 2012 14:44, Tom Lane t...@sss.pgh.pa.us wrote: I thought the proposal was to add it to (1) pg_stat_statement and (2) EXPLAIN, both of which are not in the normal code execution path. pg_stat_statement is already a drag on a machine with slow gettimeofday, but it's not clear why users of it would think that two gettimeofday's per query are acceptable and four are not. To be clear, I don't see any reasons to not just have it within EXPLAIN output under all circumstances. pg_stat_statements will slow down query execution, but I see no reason to force users to pay for something that they may well not want by not including an 'off' switch for this additional instrumentation, given that it doubles the number of gettimeofdays. I'm not particularly concerned about platforms with slow gettimeofdays. I'm concerned with keeping the overhead of running pg_stat_statements as low as possible generally. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_stat_statements and planning time
Hi, pg_stat_statements is basically very helpful to find out slow queries. But since it doesn't report the time spent in the planner, we cannot find out slow queries which take most time to do query planning, from pg_stat_statements. Is there any reason why pg_stat_statements doesn't collect the planning time? Attached patch extends pg_stat_statements so that it reports the planning time. Thought? In the patch, I didn't change the column name total_time meaning the time spent in the executor because of the backward compatibility. But once new column plan_time is added, total_time is confusing and ISTM it should be renamed... Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center *** a/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql --- b/contrib/pg_stat_statements/pg_stat_statements--1.0--1.1.sql *** *** 17,22 CREATE FUNCTION pg_stat_statements( --- 17,23 OUT dbid oid, OUT query text, OUT calls int8, + OUT plan_time float8, OUT total_time float8, OUT rows int8, OUT shared_blks_hit int8, *** a/contrib/pg_stat_statements/pg_stat_statements--1.1.sql --- b/contrib/pg_stat_statements/pg_stat_statements--1.1.sql *** *** 14,19 CREATE FUNCTION pg_stat_statements( --- 14,20 OUT dbid oid, OUT query text, OUT calls int8, + OUT plan_time float8, OUT total_time float8, OUT rows int8, OUT shared_blks_hit int8, *** a/contrib/pg_stat_statements/pg_stat_statements.c --- b/contrib/pg_stat_statements/pg_stat_statements.c *** *** 1,7 /*- * * pg_stat_statements.c ! * Track statement execution times across a whole database cluster. * * Note about locking issues: to create or delete an entry in the shared * hashtable, one must hold pgss-lock exclusively. Modifying any field --- 1,7 /*- * * pg_stat_statements.c ! * Track statement planning and execution times across a whole database cluster. * * Note about locking issues: to create or delete an entry in the shared * hashtable, one must hold pgss-lock exclusively. Modifying any field *** *** 27,32 --- 27,33 #include funcapi.h #include mb/pg_wchar.h #include miscadmin.h + #include optimizer/planner.h #include pgstat.h #include storage/fd.h #include storage/ipc.h *** *** 73,78 typedef struct pgssHashKey --- 74,80 typedef struct Counters { int64 calls; /* # of times executed */ + double plan_time; /* total planning time in seconds */ double total_time; /* total execution time in seconds */ int64 rows; /* total # of retrieved or affected rows */ int64 shared_blks_hit; /* # of shared buffer hits */ *** *** 118,123 static int nested_level = 0; --- 120,126 /* Saved hook values in case of unload */ static shmem_startup_hook_type prev_shmem_startup_hook = NULL; + static planner_hook_type prev_planner_hook = NULL; static ExecutorStart_hook_type prev_ExecutorStart = NULL; static ExecutorRun_hook_type prev_ExecutorRun = NULL; static ExecutorFinish_hook_type prev_ExecutorFinish = NULL; *** *** 168,173 PG_FUNCTION_INFO_V1(pg_stat_statements); --- 171,178 static void pgss_shmem_startup(void); static void pgss_shmem_shutdown(int code, Datum arg); + static PlannedStmt *pgss_planner(Query *parse, int cursorOptions, + ParamListInfo boundParams); static void pgss_ExecutorStart(QueryDesc *queryDesc, int eflags); static void pgss_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, *** *** 179,186 static void pgss_ProcessUtility(Node *parsetree, DestReceiver *dest, char *completionTag); static uint32 pgss_hash_fn(const void *key, Size keysize); static int pgss_match_fn(const void *key1, const void *key2, Size keysize); ! static void pgss_store(const char *query, double total_time, uint64 rows, ! const BufferUsage *bufusage); static Size pgss_memsize(void); static pgssEntry *entry_alloc(pgssHashKey *key); static void entry_dealloc(void); --- 184,191 DestReceiver *dest, char *completionTag); static uint32 pgss_hash_fn(const void *key, Size keysize); static int pgss_match_fn(const void *key1, const void *key2, Size keysize); ! static void pgss_store(const char *query, double plan_time, double total_time, ! uint64 rows, const BufferUsage *bufusage); static Size pgss_memsize(void); static pgssEntry *entry_alloc(pgssHashKey *key); static void entry_dealloc(void); *** *** 269,274 _PG_init(void) --- 274,281 */ prev_shmem_startup_hook = shmem_startup_hook; shmem_startup_hook = pgss_shmem_startup; + prev_planner_hook = planner_hook; + planner_hook =
Re: [HACKERS] pg_stat_statements and planning time
On Wed, Mar 7, 2012 at 11:45 AM, Fujii Masao masao.fu...@gmail.com wrote: Attached patch extends pg_stat_statements so that it reports the planning time. Thought? If we successfully aggregate SQL in the current patch then this might be useful as well. Until we do that it's not much use. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements and planning time
On Wed, Mar 7, 2012 at 9:00 PM, Simon Riggs si...@2ndquadrant.com wrote: On Wed, Mar 7, 2012 at 11:45 AM, Fujii Masao masao.fu...@gmail.com wrote: Attached patch extends pg_stat_statements so that it reports the planning time. Thought? If we successfully aggregate SQL in the current patch then this might be useful as well. Until we do that it's not much use. You mean pg_stat_statements normalization patch? Yes, if it will be applied, this patch would be more useful. But without the normalization patch, this patch would be useful for a user who use the extended protocol but not plan cache mechanism, I think. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements and planning time
On Wed, Mar 7, 2012 at 6:45 AM, Fujii Masao masao.fu...@gmail.com wrote: pg_stat_statements is basically very helpful to find out slow queries. But since it doesn't report the time spent in the planner, we cannot find out slow queries which take most time to do query planning, from pg_stat_statements. Is there any reason why pg_stat_statements doesn't collect the planning time? Attached patch extends pg_stat_statements so that it reports the planning time. Thought? I think this is an interesting idea, but I think it's too late for 9.2. I'd like to have the planning time in a number of other places as well, such as EXPLAIN, and maybe statistics views. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements and planning time
Fujii Masao masao.fu...@gmail.com writes: In the patch, I didn't change the column name total_time meaning the time spent in the executor because of the backward compatibility. But once new column plan_time is added, total_time is confusing and ISTM it should be renamed... Well, if we were tracking planning time, what I would expect total_time to mean is plan time plus execution time. Should it be redefined that way, instead of renaming it? Another point here is that because of plan caching, the number of planner invocations could be quite different from the number of executor runs. It's not clear to me whether this will confuse matters for pg_stat_statements, but it's something to think about. Will it be possible to tell whether a particular statement is hugely expensive to plan but we don't do that often, versus cheap to plan but we do that a lot? IOW I am wondering if we need to track the number of invocations as well as total time. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements and planning time
On Wed, Mar 7, 2012 at 8:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: In the patch, I didn't change the column name total_time meaning the time spent in the executor because of the backward compatibility. But once new column plan_time is added, total_time is confusing and ISTM it should be renamed... Well, if we were tracking planning time, what I would expect total_time to mean is plan time plus execution time. Should it be redefined that way, instead of renaming it? I think you are right. On first glance, a user would only be interested in one number, the total time. Most people are hunting queries that happen (surprisingly) frequently and don't have good indexes in place to serve them, and in those cases planning time is negligible. However, should planning time be the bottleneck then one has to investigate fewer, smaller queries, a very different solution space. Another point here is that because of plan caching, the number of planner invocations could be quite different from the number of executor runs. It's not clear to me whether this will confuse matters for pg_stat_statements, but it's something to think about. Will it be possible to tell whether a particular statement is hugely expensive to plan but we don't do that often, versus cheap to plan but we do that a lot? IOW I am wondering if we need to track the number of invocations as well as total time. I don't think tracking a few more words will do much harm, and could also do a lot of goodperhaps count, sum, sum(x**2) of planning and execution? (Few things to me are more annoying than an unadorned average when it would have been feasible to take a square of a number). Thoughts? -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements and planning time
On Thu, Mar 8, 2012 at 12:39 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Mar 7, 2012 at 6:45 AM, Fujii Masao masao.fu...@gmail.com wrote: pg_stat_statements is basically very helpful to find out slow queries. But since it doesn't report the time spent in the planner, we cannot find out slow queries which take most time to do query planning, from pg_stat_statements. Is there any reason why pg_stat_statements doesn't collect the planning time? Attached patch extends pg_stat_statements so that it reports the planning time. Thought? I think this is an interesting idea, but I think it's too late for 9.2. Yes. I will add this to the next commitfest. And, in the patch, I changed pg_stat_statements--1.1.sql, but, for 9.3 I will have to create 1.2.sql instead. I'd like to have the planning time in a number of other places as well, such as EXPLAIN, and maybe statistics views. +1 for EXPLAIN. But which statistics views are in your mind? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat_statements and planning time
On Thu, Mar 8, 2012 at 1:07 AM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: In the patch, I didn't change the column name total_time meaning the time spent in the executor because of the backward compatibility. But once new column plan_time is added, total_time is confusing and ISTM it should be renamed... Well, if we were tracking planning time, what I would expect total_time to mean is plan time plus execution time. Should it be redefined that way, instead of renaming it? Agreed, it's more intuitive for a user. Along with total_time and plan_time, should we also define exec_time reporting only the execution time for improvement of usability though it can be calculated from total_time and plan_time? Another point here is that because of plan caching, the number of planner invocations could be quite different from the number of executor runs. It's not clear to me whether this will confuse matters for pg_stat_statements, but it's something to think about. Will it be possible to tell whether a particular statement is hugely expensive to plan but we don't do that often, versus cheap to plan but we do that a lot? IOW I am wondering if we need to track the number of invocations as well as total time. Agreed to add something like plan_count column. This also would be helpful for e.g., tuning the prepareThreshold parameter in JDBC. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers