Re: [HACKERS] Generate call graphs in run-time

2012-01-18 Thread Martin Pihlak
On 01/17/2012 11:13 AM, Joel Jacobson wrote:
 Since you only care about the parentfuncid in one level, it looks like
 you will only be able to get a total call graph of all possible
 function calls, and not each unique call graph per transaction.

True, for my purposes (function dependencies and performance analysis)
the global graph is sufficient.

 Also, why remove the parent oid when uploading the statistics to the 
 collector?
 It would be nice to have the statistics for each function per parent,
 to see where you have a bottleneck which might only be occurring in a
 function when called from a specific parent.

I guess I was just lazy at the time I wrote it. But it shouldn't be too
much of an effort to store the global call graph in statistics
collector. Unique call graphs would be somewhat more complicated I
guess.

 There is a patch for this and we do use it in production for occasional
 troubleshooting and dependency analysis. Can't attach immediately
 though -- it has some extra cruft in it that needs to be cleaned up.
 
 I would highly appreciate a patch, don't worry about cleaning up, I
 can do that, unless it's some code you can't share for other reasons.
 

Patch attached. It was developed against 9.1, but also applies to
HEAD but gives some fuzz and offsets.

It adds 2 GUC variables: log_function_calls and log_usage_stats. The
first just output function statistics to log (with no parent info).
With log_usage_stats enabled, it also outputs detailed function usage
plus relation usage. So you basically get output such as:

# select * from pgq.get_consumer_info();
LOG:  duration: 11.153 ms  statement: select * from pgq.get_consumer_info();
LOG:  function call: pgq.get_consumer_info(0) calls=1 time=9726
self_time=536
LOG:  USAGE STATISTICS
DETAIL:  ! object access stats:
! F 1892464226 0 pgq.get_consumer_info(0) calls=1 time=9726 self_time=536
! F 1892464228 1892464226 pgq.get_consumer_info(2) calls=1 time=9190
self_time=9190
! r 167558000 pgq.queue: blks_read=28 blks_hit=28
! r 167557988 pgq.consumer: blks_read=56 blks_hit=56
! r 167558021 pgq.subscription: blks_read=54 blks_hit=50
! r 167558050 pgq.tick: blks_read=103 blks_hit=102
! i 1892464189 pgq.queue_pkey: scans=1 tup_ret=37 tup_fetch=37
blks_read=2 blks_hit=2
! i 167557995 pgq.consumer_pkey: scans=56 tup_ret=56 tup_fetch=56
blks_read=57 blks_hit=56
! i 1892464195 pgq.subscription_pkey: scans=37 tup_ret=156 tup_fetch=56
blks_read=127 blks_hit=123
! i 167558058 pgq.tick_pkey: scans=112 tup_ret=103 tup_fetch=103
blks_read=367 blks_hit=366

 funcid-parentfuncid might be sufficient for performance
 optimizations, but to automatically generate directional graphs of all
 unique call graphs in run-time, you would need all the unique pairs of
 funcid-parentfuncid as a singel column, probably a sorted array of
 oids[][], example: [[1,2],[1,3],[2,4],[2,5]] if the call craph would
 be {1-2, 1-3, 2-4, 2-5}.
 

Hmm, array would probably work, although I wonder if there is a
more elegant solution ...

regards,
Martin
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 5ed6e83..2e66020 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -42,6 +42,7 @@
 #include access/twophase_rmgr.h
 #include access/xact.h
 #include catalog/pg_database.h
+#include catalog/pg_namespace.h
 #include catalog/pg_proc.h
 #include libpq/ip.h
 #include libpq/libpq.h
@@ -63,7 +64,8 @@
 #include utils/ps_status.h
 #include utils/rel.h
 #include utils/tqual.h
-
+#include utils/syscache.h
+#include utils/lsyscache.h
 
 /* --
  * Paths for the statistics files (relative to installation's $PGDATA).
@@ -113,6 +115,12 @@ bool		pgstat_track_counts = false;
 int			pgstat_track_functions = TRACK_FUNC_OFF;
 int			pgstat_track_activity_query_size = 1024;
 
+/*
+ * Last function call parent. InvalidOid is used to indicate
+ * top-level functions.
+ */
+Oid current_function_parent = InvalidOid;
+
 /* --
  * Built from GUC parameter
  * --
@@ -258,8 +266,11 @@ static void pgstat_read_current_status(void);
 static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
 static void pgstat_send_funcstats(void);
 static HTAB *pgstat_collect_oids(Oid catalogid);
+static void pgstat_report_functions(void);
+static void pgstat_report_usage(void);
 
 static PgStat_TableStatus *get_tabstat_entry(Oid rel_id, bool isshared);
+static PgStat_TableStatus *new_tsa_entry(TabStatusArray *entry, Oid rel_id, bool isshared);
 
 static void pgstat_setup_memcxt(void);
 
@@ -697,6 +708,12 @@ pgstat_report_stat(bool force)
 	last_report = now;
 
 	/*
+	 * First report function calls and object usage.
+	 */
+	pgstat_report_functions();
+	pgstat_report_usage();
+
+	/*
 	 * Scan through the TabStatusArray struct(s) to find tables that actually
 	 * have counts, and build messages to send.  We have to separate shared
 	 * relations from regular ones because the databaseid field in the message
@@ -804,9 +821,6 @@ 

Re: [HACKERS] Generate call graphs in run-time

2012-01-17 Thread Joel Jacobson
On Mon, Jan 16, 2012 at 2:23 PM, Martin Pihlak martin.pih...@gmail.com wrote:
 My approach was to add parent oid to the per-backend function stats
 structure - PgStat_BackendFunctionEntry. Also, I changed the hash key
 for that structure to (oid, parent) pair. This means that within the
 backend the function usage is always tracked with the context of
 calling function. This has the nice property that you get the per-parent
 usage stats as well. Also the additional lists for parent tracking are
 avoided.

 During pgstat_report_stat() the call graph (with stats) is output
 to logs and the statistics uploaded to collector -- with the parent oid
 removed.

Since you only care about the parentfuncid in one level, it looks like
you will only be able to get a total call graph of all possible
function calls, and not each unique call graph per transaction.
If you have two separate transactions:
T1: a-b, b-c
T2: b-d
You would have two unique call graphs {a-b, b-c} and {b-d}.
The global call graph, where you only keep track of all unique
parentfuncid-funcid pairs, would be {a-b, b-c, b-d}, which lacks
the information on what different call graphs are actually being
executed per transaction.

Also, why remove the parent oid when uploading the statistics to the collector?
It would be nice to have the statistics for each function per parent,
to see where you have a bottleneck which might only be occurring in a
function when called from a specific parent.
Even more fine-grained would be to have the statistics per unique
call-graph, i.e. the entire tree of functions called in the
transactions.

 There is a patch for this and we do use it in production for occasional
 troubleshooting and dependency analysis. Can't attach immediately
 though -- it has some extra cruft in it that needs to be cleaned up.

I would highly appreciate a patch, don't worry about cleaning up, I
can do that, unless it's some code you can't share for other reasons.

 Indeed. Something like a pg_stat_user_function_details view would be
 very useful. Something along the lines of:

   Column     |  Type  |
 --++
  funcid       | oid    |
  parentfuncid | oid    | -- new
  schemaname   | name   |
  funcname     | name   |
  calls        | bigint |
  total_time   | bigint |
  self_time    | bigint |

funcid-parentfuncid might be sufficient for performance
optimizations, but to automatically generate directional graphs of all
unique call graphs in run-time, you would need all the unique pairs of
funcid-parentfuncid as a singel column, probably a sorted array of
oids[][], example: [[1,2],[1,3],[2,4],[2,5]] if the call craph would
be {1-2, 1-3, 2-4, 2-5}.


 And then rewrite pg_stat_user_functions by aggregating the detailed
 view. That'd make the individual pg_stat_get_function* functions a
 bit slower, but that is probably a non-issue - at least not if the
 pg_stat_user_functions view is rewritten to use a SRF.

 regards,
 Martin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generate call graphs in run-time

2012-01-16 Thread Martin Pihlak
On 01/09/2012 10:00 PM, Joel Jacobson wrote:
 Because of this I decided to sample data in run-time to get a real-life
 picture of the system.
 Any functions not being called in productions are not that important to
 include in the documentation anyway.

FWIW I have a similar problem - with a similar solution. I'll try to
describe it, in case there is some synergy to be leveraged :)

 In pgstat_init_function_usage(), each call to a function will add the
 caller-called oid pair unless already set in the hash.
 When pgstat_end_function_usage() is called, the current function oid is
 removed from the List of parent oids.
 In AtEOXact_PgStat(), called upon commit/rollback, the call graph is
 sorted and written to the log unless already seen in the session.
 The variables are resetted in pgstat_clear_snapshot().
 

My approach was to add parent oid to the per-backend function stats
structure - PgStat_BackendFunctionEntry. Also, I changed the hash key
for that structure to (oid, parent) pair. This means that within the
backend the function usage is always tracked with the context of
calling function. This has the nice property that you get the per-parent
usage stats as well. Also the additional lists for parent tracking are
avoided.

During pgstat_report_stat() the call graph (with stats) is output
to logs and the statistics uploaded to collector -- with the parent oid
removed.

 This functionality is probably something one would like to enable only
 temporarily in the production environment.
 A new configuration parameter would therefore be good, just like
 track_functions. Perhaps track_callgraphs?
 

I opted to always track the function parents -- this is very cheap. The
logging on the other hand is quite heavy and needs to be explicitly
configured (I used log_usage_stats GUC).

There is a patch for this and we do use it in production for occasional
troubleshooting and dependency analysis. Can't attach immediately
though -- it has some extra cruft in it that needs to be cleaned up.

 Instead of writing the call graphs to the postgres log, it would ne more
 useful to let the statistics collector keep track of the call graphs, to
 allow easier access than having to parse through the log file.

Indeed. Something like a pg_stat_user_function_details view would be
very useful. Something along the lines of:

   Column |  Type  |
--++
 funcid   | oid|
 parentfuncid | oid| -- new
 schemaname   | name   |
 funcname | name   |
 calls| bigint |
 total_time   | bigint |
 self_time| bigint |

And then rewrite pg_stat_user_functions by aggregating the detailed
view. That'd make the individual pg_stat_get_function* functions a
bit slower, but that is probably a non-issue - at least not if the
pg_stat_user_functions view is rewritten to use a SRF.

regards,
Martin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generate call graphs in run-time

2012-01-10 Thread Joel Jacobson
Is this only to print out the stack upon errors? Looks like the stack is in
the variable error_context_stack. Is it always available containing all the
parent functions, even when there is no error? Can I reach it from within
pgstat.c?

2012/1/10 Jim Nasby j...@nasby.net

 On Jan 9, 2012, at 2:08 PM, Joel Jacobson wrote:
  Generates call graphs of function calls within a transaction in run-time.

 Related to this... we had Command Prompt write a function for us that
 would spit out the complete call-graph of the current call stack whenever
 it was called. Alvaro didn't need to add any hooks to the backend to
 accomplish this, so it would seem that the call stack is already available.
 That might simplify what you're doing.

 I don't see this posted on pgFoundry yet, so I've attached it.
 --
 Jim C. Nasby, Database Architect   j...@nasby.net
 512.569.9461 (cell) http://jim.nasby.net




-- 
Joel Jacobson
Trustly
+46703603801
https://trustly.com


Re: [HACKERS] Generate call graphs in run-time

2012-01-09 Thread Kevin Grittner
Joel Jacobson j...@gluefinance.com wrote:
 
 The perl script pg_callgraph.pl replaces the oids with actual
 function names before generating the call graphs using GraphVIz:
 
Regardless of anything else, I think you need to allow for function
overloading.  You could cover that, I think, by replacing this
literal in your perl script:
 
SELECT oid, proname FROM pg_proc
 
with this literal:
 
SELECT oid, oid::regprocedure::text FROM pg_proc
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Generate call graphs in run-time

2012-01-09 Thread Jim Nasby
On Jan 9, 2012, at 2:08 PM, Joel Jacobson wrote:
 Generates call graphs of function calls within a transaction in run-time.

Related to this... we had Command Prompt write a function for us that would 
spit out the complete call-graph of the current call stack whenever it was 
called. Alvaro didn't need to add any hooks to the backend to accomplish this, 
so it would seem that the call stack is already available. That might simplify 
what you're doing.

I don't see this posted on pgFoundry yet, so I've attached it.
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net



stack.tgz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers