Re: [HACKERS] Generate call graphs in run-time
On 01/17/2012 11:13 AM, Joel Jacobson wrote: Since you only care about the parentfuncid in one level, it looks like you will only be able to get a total call graph of all possible function calls, and not each unique call graph per transaction. True, for my purposes (function dependencies and performance analysis) the global graph is sufficient. Also, why remove the parent oid when uploading the statistics to the collector? It would be nice to have the statistics for each function per parent, to see where you have a bottleneck which might only be occurring in a function when called from a specific parent. I guess I was just lazy at the time I wrote it. But it shouldn't be too much of an effort to store the global call graph in statistics collector. Unique call graphs would be somewhat more complicated I guess. There is a patch for this and we do use it in production for occasional troubleshooting and dependency analysis. Can't attach immediately though -- it has some extra cruft in it that needs to be cleaned up. I would highly appreciate a patch, don't worry about cleaning up, I can do that, unless it's some code you can't share for other reasons. Patch attached. It was developed against 9.1, but also applies to HEAD but gives some fuzz and offsets. It adds 2 GUC variables: log_function_calls and log_usage_stats. The first just output function statistics to log (with no parent info). With log_usage_stats enabled, it also outputs detailed function usage plus relation usage. So you basically get output such as: # select * from pgq.get_consumer_info(); LOG: duration: 11.153 ms statement: select * from pgq.get_consumer_info(); LOG: function call: pgq.get_consumer_info(0) calls=1 time=9726 self_time=536 LOG: USAGE STATISTICS DETAIL: ! object access stats: ! F 1892464226 0 pgq.get_consumer_info(0) calls=1 time=9726 self_time=536 ! F 1892464228 1892464226 pgq.get_consumer_info(2) calls=1 time=9190 self_time=9190 ! r 167558000 pgq.queue: blks_read=28 blks_hit=28 ! r 167557988 pgq.consumer: blks_read=56 blks_hit=56 ! r 167558021 pgq.subscription: blks_read=54 blks_hit=50 ! r 167558050 pgq.tick: blks_read=103 blks_hit=102 ! i 1892464189 pgq.queue_pkey: scans=1 tup_ret=37 tup_fetch=37 blks_read=2 blks_hit=2 ! i 167557995 pgq.consumer_pkey: scans=56 tup_ret=56 tup_fetch=56 blks_read=57 blks_hit=56 ! i 1892464195 pgq.subscription_pkey: scans=37 tup_ret=156 tup_fetch=56 blks_read=127 blks_hit=123 ! i 167558058 pgq.tick_pkey: scans=112 tup_ret=103 tup_fetch=103 blks_read=367 blks_hit=366 funcid-parentfuncid might be sufficient for performance optimizations, but to automatically generate directional graphs of all unique call graphs in run-time, you would need all the unique pairs of funcid-parentfuncid as a singel column, probably a sorted array of oids[][], example: [[1,2],[1,3],[2,4],[2,5]] if the call craph would be {1-2, 1-3, 2-4, 2-5}. Hmm, array would probably work, although I wonder if there is a more elegant solution ... regards, Martin diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 5ed6e83..2e66020 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -42,6 +42,7 @@ #include access/twophase_rmgr.h #include access/xact.h #include catalog/pg_database.h +#include catalog/pg_namespace.h #include catalog/pg_proc.h #include libpq/ip.h #include libpq/libpq.h @@ -63,7 +64,8 @@ #include utils/ps_status.h #include utils/rel.h #include utils/tqual.h - +#include utils/syscache.h +#include utils/lsyscache.h /* -- * Paths for the statistics files (relative to installation's $PGDATA). @@ -113,6 +115,12 @@ bool pgstat_track_counts = false; int pgstat_track_functions = TRACK_FUNC_OFF; int pgstat_track_activity_query_size = 1024; +/* + * Last function call parent. InvalidOid is used to indicate + * top-level functions. + */ +Oid current_function_parent = InvalidOid; + /* -- * Built from GUC parameter * -- @@ -258,8 +266,11 @@ static void pgstat_read_current_status(void); static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg); static void pgstat_send_funcstats(void); static HTAB *pgstat_collect_oids(Oid catalogid); +static void pgstat_report_functions(void); +static void pgstat_report_usage(void); static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared); +static PgStat_TableStatus *new_tsa_entry(TabStatusArray *entry, Oid rel_id, bool isshared); static void pgstat_setup_memcxt(void); @@ -697,6 +708,12 @@ pgstat_report_stat(bool force) last_report = now; /* + * First report function calls and object usage. + */ + pgstat_report_functions(); + pgstat_report_usage(); + + /* * Scan through the TabStatusArray struct(s) to find tables that actually * have counts, and build messages to send. We have to separate shared * relations from regular ones because the databaseid field in the message @@ -804,9 +821,6 @@
Re: [HACKERS] Generate call graphs in run-time
On Mon, Jan 16, 2012 at 2:23 PM, Martin Pihlak martin.pih...@gmail.com wrote: My approach was to add parent oid to the per-backend function stats structure - PgStat_BackendFunctionEntry. Also, I changed the hash key for that structure to (oid, parent) pair. This means that within the backend the function usage is always tracked with the context of calling function. This has the nice property that you get the per-parent usage stats as well. Also the additional lists for parent tracking are avoided. During pgstat_report_stat() the call graph (with stats) is output to logs and the statistics uploaded to collector -- with the parent oid removed. Since you only care about the parentfuncid in one level, it looks like you will only be able to get a total call graph of all possible function calls, and not each unique call graph per transaction. If you have two separate transactions: T1: a-b, b-c T2: b-d You would have two unique call graphs {a-b, b-c} and {b-d}. The global call graph, where you only keep track of all unique parentfuncid-funcid pairs, would be {a-b, b-c, b-d}, which lacks the information on what different call graphs are actually being executed per transaction. Also, why remove the parent oid when uploading the statistics to the collector? It would be nice to have the statistics for each function per parent, to see where you have a bottleneck which might only be occurring in a function when called from a specific parent. Even more fine-grained would be to have the statistics per unique call-graph, i.e. the entire tree of functions called in the transactions. There is a patch for this and we do use it in production for occasional troubleshooting and dependency analysis. Can't attach immediately though -- it has some extra cruft in it that needs to be cleaned up. I would highly appreciate a patch, don't worry about cleaning up, I can do that, unless it's some code you can't share for other reasons. Indeed. Something like a pg_stat_user_function_details view would be very useful. Something along the lines of: Column | Type | --++ funcid | oid | parentfuncid | oid | -- new schemaname | name | funcname | name | calls | bigint | total_time | bigint | self_time | bigint | funcid-parentfuncid might be sufficient for performance optimizations, but to automatically generate directional graphs of all unique call graphs in run-time, you would need all the unique pairs of funcid-parentfuncid as a singel column, probably a sorted array of oids[][], example: [[1,2],[1,3],[2,4],[2,5]] if the call craph would be {1-2, 1-3, 2-4, 2-5}. And then rewrite pg_stat_user_functions by aggregating the detailed view. That'd make the individual pg_stat_get_function* functions a bit slower, but that is probably a non-issue - at least not if the pg_stat_user_functions view is rewritten to use a SRF. regards, Martin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generate call graphs in run-time
On 01/09/2012 10:00 PM, Joel Jacobson wrote: Because of this I decided to sample data in run-time to get a real-life picture of the system. Any functions not being called in productions are not that important to include in the documentation anyway. FWIW I have a similar problem - with a similar solution. I'll try to describe it, in case there is some synergy to be leveraged :) In pgstat_init_function_usage(), each call to a function will add the caller-called oid pair unless already set in the hash. When pgstat_end_function_usage() is called, the current function oid is removed from the List of parent oids. In AtEOXact_PgStat(), called upon commit/rollback, the call graph is sorted and written to the log unless already seen in the session. The variables are resetted in pgstat_clear_snapshot(). My approach was to add parent oid to the per-backend function stats structure - PgStat_BackendFunctionEntry. Also, I changed the hash key for that structure to (oid, parent) pair. This means that within the backend the function usage is always tracked with the context of calling function. This has the nice property that you get the per-parent usage stats as well. Also the additional lists for parent tracking are avoided. During pgstat_report_stat() the call graph (with stats) is output to logs and the statistics uploaded to collector -- with the parent oid removed. This functionality is probably something one would like to enable only temporarily in the production environment. A new configuration parameter would therefore be good, just like track_functions. Perhaps track_callgraphs? I opted to always track the function parents -- this is very cheap. The logging on the other hand is quite heavy and needs to be explicitly configured (I used log_usage_stats GUC). There is a patch for this and we do use it in production for occasional troubleshooting and dependency analysis. Can't attach immediately though -- it has some extra cruft in it that needs to be cleaned up. Instead of writing the call graphs to the postgres log, it would ne more useful to let the statistics collector keep track of the call graphs, to allow easier access than having to parse through the log file. Indeed. Something like a pg_stat_user_function_details view would be very useful. Something along the lines of: Column | Type | --++ funcid | oid| parentfuncid | oid| -- new schemaname | name | funcname | name | calls| bigint | total_time | bigint | self_time| bigint | And then rewrite pg_stat_user_functions by aggregating the detailed view. That'd make the individual pg_stat_get_function* functions a bit slower, but that is probably a non-issue - at least not if the pg_stat_user_functions view is rewritten to use a SRF. regards, Martin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generate call graphs in run-time
Is this only to print out the stack upon errors? Looks like the stack is in the variable error_context_stack. Is it always available containing all the parent functions, even when there is no error? Can I reach it from within pgstat.c? 2012/1/10 Jim Nasby j...@nasby.net On Jan 9, 2012, at 2:08 PM, Joel Jacobson wrote: Generates call graphs of function calls within a transaction in run-time. Related to this... we had Command Prompt write a function for us that would spit out the complete call-graph of the current call stack whenever it was called. Alvaro didn't need to add any hooks to the backend to accomplish this, so it would seem that the call stack is already available. That might simplify what you're doing. I don't see this posted on pgFoundry yet, so I've attached it. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Joel Jacobson Trustly +46703603801 https://trustly.com
Re: [HACKERS] Generate call graphs in run-time
Joel Jacobson j...@gluefinance.com wrote: The perl script pg_callgraph.pl replaces the oids with actual function names before generating the call graphs using GraphVIz: Regardless of anything else, I think you need to allow for function overloading. You could cover that, I think, by replacing this literal in your perl script: SELECT oid, proname FROM pg_proc with this literal: SELECT oid, oid::regprocedure::text FROM pg_proc -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Generate call graphs in run-time
On Jan 9, 2012, at 2:08 PM, Joel Jacobson wrote: Generates call graphs of function calls within a transaction in run-time. Related to this... we had Command Prompt write a function for us that would spit out the complete call-graph of the current call stack whenever it was called. Alvaro didn't need to add any hooks to the backend to accomplish this, so it would seem that the call stack is already available. That might simplify what you're doing. I don't see this posted on pgFoundry yet, so I've attached it. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net stack.tgz Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers