Re: [HACKERS] stored procedure stats in collector
This has been saved for the next commit-fest: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Martin Pihlak wrote: Howdy, Here's an updated version of the function stats patch. The biggest change is that by default only procedural language functions are counted. The GUC variable stats_function_level now takes 3 values: on, off and all. The latter also counts SQL and C language functions, on means only to count procedural language functions. The decision is now made in fmgr_info_cxt_security() Trigger functions are now also counted. Sample output: select procname, calls, total_time, total_cpu, self_time, self_cpu from pg_stat_user_functions order by self_cpu desc limit 5; procname | calls | total_time | total_cpu | self_time | self_cpu +---++---+---+-- next_batch | 32765 | 27139 | 8574 | 27139 | 8574 fetch_batch_events | 3636 | 9252 | 5622 | 3771 | 2717 batch_event_sql| 3636 | 5454 | 2888 | 3910 | 1962 finish_batch | 3420 | 3215 | 1475 | 3215 | 1475 batch_event_tables | 3636 | 1448 | 865 | 1434 | 858 (5 rows) This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production usage here at Skype (about a month on reasonably busy boxes). So far so good. Couple of issues: - sometimes self_cpu self_time - currently blaming it on Linux version of gettimeofday(). - dropped functions are not purged from stats, might bloat the stats file for some usage patterns. PS. Would something like this be a canditate for 8.4 inclusion (if polished up)? Regards, Martin [ application/x-gzip is not supported, skipping... ] ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedure stats in collector
Howdy, Did you post an updated patch to HEAD? No, but I guess its just about time. Hopefully I'll have something to show next week. regards, Martin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] stored procedure stats in collector
Hi, Good. I'll bring the patch up to date with HEAD. Did you post an updated patch to HEAD? -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stored procedure stats in collector
This has been saved for the 8.4 release: http://momjian.postgresql.org/cgi-bin/pgpatches_hold --- Martin Pihlak wrote: Howdy, Here's an updated version of the function stats patch. The biggest change is that by default only procedural language functions are counted. The GUC variable stats_function_level now takes 3 values: on, off and all. The latter also counts SQL and C language functions, on means only to count procedural language functions. The decision is now made in fmgr_info_cxt_security() Trigger functions are now also counted. Sample output: select procname, calls, total_time, total_cpu, self_time, self_cpu from pg_stat_user_functions order by self_cpu desc limit 5; procname | calls | total_time | total_cpu | self_time | self_cpu +---++---+---+-- next_batch | 32765 | 27139 | 8574 | 27139 | 8574 fetch_batch_events | 3636 | 9252 | 5622 | 3771 | 2717 batch_event_sql| 3636 | 5454 | 2888 | 3910 | 1962 finish_batch | 3420 | 3215 | 1475 | 3215 | 1475 batch_event_tables | 3636 | 1448 | 865 | 1434 | 858 (5 rows) This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production usage here at Skype (about a month on reasonably busy boxes). So far so good. Couple of issues: - sometimes self_cpu self_time - currently blaming it on Linux version of gettimeofday(). - dropped functions are not purged from stats, might bloat the stats file for some usage patterns. PS. Would something like this be a canditate for 8.4 inclusion (if polished up)? Regards, Martin [ application/x-gzip is not supported, skipping... ] ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] stored procedure stats in collector
On Thu, Sep 20, 2007 at 05:34:32PM -0700, Neil Conway wrote: That seems a confusing set of values. Perhaps off, pl, and all would be clearer? I'm curious if you've measured the performance overhead of enabling this functionality. i'm quite worried about all setting. all operators are functions as well, so tracking literally all function calls might be too heavy. on the other hand i would like to be able to track some non-pl functions like RI%. depesz -- quicksil1er: postgres is excellent, but like any DB it requires a highly paid DBA. here's my CV! :) http://www.depesz.com/ - blog dla ciebie (i moje CV) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] stored procedure stats in collector
hubert depesz lubaczewski wrote: On Thu, Sep 20, 2007 at 05:34:32PM -0700, Neil Conway wrote: That seems a confusing set of values. Perhaps off, pl, and all would be clearer? I'm curious if you've measured the performance overhead of enabling this functionality. i'm quite worried about all setting. all operators are functions as well, so tracking literally all function calls might be too heavy. all would additionally count only C and SQL functions, builtins (language internal) are skipped. Maybe there should be option to count these as well, perhaps with a really scare name (track_functions = all_plus_internal). on the other hand i would like to be able to track some non-pl functions like RI%. These are builtin. Regards, Martin ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] stored procedure stats in collector
Neil Conway wrote: On Thu, 2007-09-20 at 16:08 +0300, Martin Pihlak wrote: The GUC variable stats_function_level now takes 3 values: on, off and all. That seems a confusing set of values. Perhaps off, pl, and all would be clearer? Makes sense. It appears that the stats_ prefixed GUC names are deprecated now. Will rename to track_functions and change values to off, pl and all. Or should I use none instead of off? I'm curious if you've measured the performance overhead of enabling this functionality. Just finished a simple benchmark comparing unpatched, off and all: unpatched usrsystotal dummy 21.36 0.68 22.04 rfact 51.52 0.07 51.59 ifact 20.40 0.06 20.45 off usrsystotal diff x dummy 21.59 0.39 21.97 1.00 rfact 52.40 0.06 52.46 1.02 ifact 20.78 0.02 20.8 1.02 all usrsystotal diff x dummy 21.69 35.18 56.87 2.58 rfact 47.83 38.99 86.83 1.68 ifact 20.65 0.36 21.01 1.03 dummy is blank pl/pgsql function (ran 1000 times). rfact is recursive factorial, ifact is iterative (both ran 100 times). Comparing median of 3 runs. The usr and sys values are from log_executor_stats. It seems that the overhead is unnoticeable if disabled, very visible for lightweight functions and heavy callers. Almost unnoticeable for more compute intensive functions. PS. Would something like this be a canditate for 8.4 inclusion (if polished up)? It sounds like a useful feature to me. Good. I'll bring the patch up to date with HEAD. Regards, Martin ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] stored procedure stats in collector
Martin Pihlak [EMAIL PROTECTED] writes: Neil Conway wrote: That seems a confusing set of values. Perhaps off, pl, and all would be clearer? Makes sense. It appears that the stats_ prefixed GUC names are deprecated now. Will rename to track_functions and change values to off, pl and all. Or should I use none instead of off? None seems good, by analogy with log_statement's values. It seems that the overhead is unnoticeable if disabled, very visible for lightweight functions and heavy callers. Almost unnoticeable for more compute intensive functions. The very high system times seem odd. Maybe you have a machine with slow gettimeofday()? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] stored procedure stats in collector
It seems that the overhead is unnoticeable if disabled, very visible for lightweight functions and heavy callers. Almost unnoticeable for more compute intensive functions. The very high system times seem odd. Maybe you have a machine with slow gettimeofday()? So it seems (on Ubuntu 7.04). Reran the benchmark on another system and got more reasonable results: none usr systotal dummy 15.9 0.68 16.58 rfact 39.370.07 39.44 ifact 12.890.01 12.9 all usr systotaldiff x dummy 20.824.44 25.261.52 rfact 45.174.17 49.341.25 ifact 12.920.03 12.961.00 Regards, Martin ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] stored procedure stats in collector
Howdy, Here's an updated version of the function stats patch. The biggest change is that by default only procedural language functions are counted. The GUC variable stats_function_level now takes 3 values: on, off and all. The latter also counts SQL and C language functions, on means only to count procedural language functions. The decision is now made in fmgr_info_cxt_security() Trigger functions are now also counted. Sample output: select procname, calls, total_time, total_cpu, self_time, self_cpu from pg_stat_user_functions order by self_cpu desc limit 5; procname | calls | total_time | total_cpu | self_time | self_cpu +---++---+---+-- next_batch | 32765 | 27139 | 8574 | 27139 | 8574 fetch_batch_events | 3636 | 9252 | 5622 | 3771 | 2717 batch_event_sql| 3636 | 5454 | 2888 | 3910 | 1962 finish_batch | 3420 | 3215 | 1475 | 3215 | 1475 batch_event_tables | 3636 | 1448 | 865 | 1434 | 858 (5 rows) This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production usage here at Skype (about a month on reasonably busy boxes). So far so good. Couple of issues: - sometimes self_cpu self_time - currently blaming it on Linux version of gettimeofday(). - dropped functions are not purged from stats, might bloat the stats file for some usage patterns. PS. Would something like this be a canditate for 8.4 inclusion (if polished up)? Regards, Martin 8.2-procstat.patch.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] stored procedure stats in collector
On Thu, 2007-09-20 at 16:08 +0300, Martin Pihlak wrote: The GUC variable stats_function_level now takes 3 values: on, off and all. That seems a confusing set of values. Perhaps off, pl, and all would be clearer? I'm curious if you've measured the performance overhead of enabling this functionality. PS. Would something like this be a canditate for 8.4 inclusion (if polished up)? It sounds like a useful feature to me. -Neil ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] stored procedure stats in collector
Neil Conway wrote: (schemaname, procname, nargs) is still ambiguous in the face of function overloading. Although the presence of procid uniquely identifies each function anyway, if you're going to include the name and argument information, it might be worth including the argument types as well (as an array of regtype, perhaps). This is true. I was being a bit selfish here - in our environment we don't use overloading that much - schema, name and nargs are usually sufficient. It is also convinient to be able to do select * from pg... and have the output not to wrap around. From looking quickly at the patch, I don't think the current coding handles set-returning functions (ExecMakeTableFunctionResult). Hmm, should be handled. The number of tuples returned is not counted though, perhaps this is another thing to add ... Regards, Martin ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] stored procedure stats in collector
Tom Lane wrote: I really dislike that approach to deciding which functions to count. The main problem with it is that it will try to count C-language functions that are added after initdb, such as contrib stuff and third-party add-ons like postgis. The percentage overhead for a typical short C function will be large, and I'm not sure anything much is to be gained by counting these. Agreed, it is a bit ugly. In the initial version there was no such exclusion. This came only after I accidentally activated the collection on a busy server. Checked cpu usage some half an hour later and immediately disabled it. System mode consumption was unacceptably high, but the results indicated that something funny was going on inside application as well - some internal functions such as texteq() and now() had unreasonably high call counts. I'm still investigating those issues, but maybe it might be useful to have stats for internal and C language functions as well? Perhaps it is possible to make the collection configurable per language. So that for instance, default is to only collect pl languages, with the option to add C or internal. Not sure how to approach this though. I think a more reasonable approach would be to count PL-language functions; which in turn suggests that the instrumentation hooks should be in the PL call handlers, not in ExecMakeFunctionResult and friends where they will drag down performance of all functions. It would be great if all the procedural languages would go through a single entry point. Right now the more exotic PL-s would need separate patching. But indeed, this approach would be less intrusive. BTW, I dunno if you've thought about the implications of inlining of SQL functions ... it's hard to see how to count those reasonably. Yes, this creates some inconsistencies in what is collected and what not. Unless, of course, only PL functions are counted :) Regards, Martin ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] stored procedure stats in collector
Martin Pihlak [EMAIL PROTECTED] writes: I'm working on a patch to extend the stats collector to handle stored procedure statistics (call counts, duration etc). ... Only functions with oid = FirstNormalObjectId are accounted. I really dislike that approach to deciding which functions to count. The main problem with it is that it will try to count C-language functions that are added after initdb, such as contrib stuff and third-party add-ons like postgis. The percentage overhead for a typical short C function will be large, and I'm not sure anything much is to be gained by counting these. I think a more reasonable approach would be to count PL-language functions; which in turn suggests that the instrumentation hooks should be in the PL call handlers, not in ExecMakeFunctionResult and friends where they will drag down performance of all functions. BTW, I dunno if you've thought about the implications of inlining of SQL functions ... it's hard to see how to count those reasonably. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] stored procedure stats in collector
Howdy, I'm working on a patch to extend the stats collector to handle stored procedure statistics (call counts, duration etc). The goal is to make this information visible via pg_stat functions/views. The collection would be controllable via stats_function_level GUC and will have minimal overhead when turned off. At our company we depend heavily on stored procedures and such a tool would be of great help. Perhaps others could also find it somewhat useful. So far I have a working patch against 8.2.4 (attached) that places some instrumentation in the executor (ExecMakeFunctionResult, ExecMakeFunctionResultNoSets and ExecMakeTableFunctionResult) and send the stats to the collector (handled similarly to table stats). The results are visible through pg_stat_user_functions: martinp=# set stats_function_level = on; SET Time: 0.172 ms martinp=# select count(f3()) from generate_series(1, 1000); count --- 1000 (1 row) Time: 40059.713 ms martinp=# select * from pg_stat_user_functions ; procid | schemaname | procname | nargs | calls | total_time | total_cpu | self_time | self_cpu ++--+---+---++---+---+-- 16388 | public | f1 | 0 | 4000 | 14978 | 8352 | 14978 | 8352 16389 | public | f2 | 0 | 2000 | 40044 | 8364 | 25066 | 12 16390 | public | f3 | 0 | 1000 | 40054 | 8364 | 9 |0 (3 rows) Time units are in milliseconds. Only functions with oid = FirstNormalObjectId are accounted. This is of course still very experimental - some work is required to get this into production shape. I was hoping to get some feedback before I continue though. Maybe there are some obvious flaws that I'm not seeing, maybe something needs to be changed to make this more generally useful. Feedback is greatly appreciated. Regards, Martin 8.2.4-procstat.patch.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] stored procedure stats in collector
On Fri, 2007-07-13 at 14:11 +0300, Martin Pihlak wrote: I'm working on a patch to extend the stats collector to handle stored procedure statistics (call counts, duration etc). The goal is to make this information visible via pg_stat functions/views. The collection would be controllable via stats_function_level GUC and will have minimal overhead when turned off. At our company we depend heavily on stored procedures and such a tool would be of great help. Perhaps others could also find it somewhat useful. Very cool, certainly sounds like a useful feature to me. martinp=# select * from pg_stat_user_functions ; procid | schemaname | procname | nargs | calls | total_time | total_cpu | self_time | self_cpu ++--+---+---++---+---+-- 16388 | public | f1 | 0 | 4000 | 14978 | 8352 | 14978 | 8352 16389 | public | f2 | 0 | 2000 | 40044 | 8364 | 25066 | 12 16390 | public | f3 | 0 | 1000 | 40054 | 8364 | 9 |0 (3 rows) (schemaname, procname, nargs) is still ambiguous in the face of function overloading. Although the presence of procid uniquely identifies each function anyway, if you're going to include the name and argument information, it might be worth including the argument types as well (as an array of regtype, perhaps). Only functions with oid = FirstNormalObjectId are accounted. Sounds reasonable to me; adding profiling to every DirectFunctionCall invocation is likely to be too expensive anyway. From looking quickly at the patch, I don't think the current coding handles set-returning functions (ExecMakeTableFunctionResult). -Neil ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org