Re: [HACKERS] stored procedure stats in collector

2008-03-25 Thread Bruce Momjian

This has been saved for the next commit-fest:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Martin Pihlak wrote:
 Howdy,
 
 Here's an updated version of the function stats patch. The biggest change is
 that by default only procedural language functions are counted. The GUC 
 variable
 stats_function_level now takes 3 values: on, off and all. The latter also
 counts SQL and C language functions, on means only to count procedural
 language functions. The decision is now made in fmgr_info_cxt_security()
 Trigger functions are now also counted.
 
 Sample output:
 
 select procname, calls, total_time, total_cpu, self_time, self_cpu
 from pg_stat_user_functions order by self_cpu desc limit 5;
 
procname  | calls | total_time | total_cpu | self_time | self_cpu
 +---++---+---+--
   next_batch | 32765 |  27139 |  8574 | 27139 | 8574
   fetch_batch_events |  3636 |   9252 |  5622 |  3771 | 2717
   batch_event_sql|  3636 |   5454 |  2888 |  3910 | 1962
   finish_batch   |  3420 |   3215 |  1475 |  3215 | 1475
   batch_event_tables |  3636 |   1448 |   865 |  1434 |  858
 (5 rows)
 
 This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production
 usage here at Skype (about a month on reasonably busy boxes). So far so good.
 
 Couple of issues:
 - sometimes self_cpu  self_time - currently blaming it on Linux version of
 gettimeofday().
 - dropped functions are not purged from stats, might bloat the stats file for
 some usage patterns.
 
 PS. Would something like this be a canditate for 8.4 inclusion (if polished 
 up)?
 
 Regards,
 Martin
 

[ application/x-gzip is not supported, skipping... ]

 
 ---(end of broadcast)---
 TIP 7: You can help support the PostgreSQL project by donating at
 
 http://www.postgresql.org/about/donate

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedure stats in collector

2008-03-20 Thread Martin Pihlak
Howdy,

 Did you post an updated patch to HEAD?
 

No, but I guess its just about time. Hopefully I'll have
something to show next week.

regards,
Martin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] stored procedure stats in collector

2008-03-19 Thread Alvaro Herrera
Hi,

 Good. I'll bring the patch up to date with HEAD.

Did you post an updated patch to HEAD?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stored procedure stats in collector

2007-09-26 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Martin Pihlak wrote:
 Howdy,
 
 Here's an updated version of the function stats patch. The biggest change is
 that by default only procedural language functions are counted. The GUC 
 variable
 stats_function_level now takes 3 values: on, off and all. The latter also
 counts SQL and C language functions, on means only to count procedural
 language functions. The decision is now made in fmgr_info_cxt_security()
 Trigger functions are now also counted.
 
 Sample output:
 
 select procname, calls, total_time, total_cpu, self_time, self_cpu
 from pg_stat_user_functions order by self_cpu desc limit 5;
 
procname  | calls | total_time | total_cpu | self_time | self_cpu
 +---++---+---+--
   next_batch | 32765 |  27139 |  8574 | 27139 | 8574
   fetch_batch_events |  3636 |   9252 |  5622 |  3771 | 2717
   batch_event_sql|  3636 |   5454 |  2888 |  3910 | 1962
   finish_batch   |  3420 |   3215 |  1475 |  3215 | 1475
   batch_event_tables |  3636 |   1448 |   865 |  1434 |  858
 (5 rows)
 
 This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production
 usage here at Skype (about a month on reasonably busy boxes). So far so good.
 
 Couple of issues:
 - sometimes self_cpu  self_time - currently blaming it on Linux version of
 gettimeofday().
 - dropped functions are not purged from stats, might bloat the stats file for
 some usage patterns.
 
 PS. Would something like this be a canditate for 8.4 inclusion (if polished 
 up)?
 
 Regards,
 Martin
 

[ application/x-gzip is not supported, skipping... ]

 
 ---(end of broadcast)---
 TIP 7: You can help support the PostgreSQL project by donating at
 
 http://www.postgresql.org/about/donate

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] stored procedure stats in collector

2007-09-26 Thread hubert depesz lubaczewski
On Thu, Sep 20, 2007 at 05:34:32PM -0700, Neil Conway wrote:
 That seems a confusing set of values. Perhaps off, pl, and all
 would be clearer?
 I'm curious if you've measured the performance overhead of enabling this
 functionality.

i'm quite worried about all setting.
all operators are functions as well, so tracking literally all function
calls might be too heavy.

on the other hand i would like to be able to track some non-pl functions
like RI%.

depesz

-- 
quicksil1er: postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV! :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] stored procedure stats in collector

2007-09-26 Thread Martin Pihlak

hubert depesz lubaczewski wrote:

On Thu, Sep 20, 2007 at 05:34:32PM -0700, Neil Conway wrote:

That seems a confusing set of values. Perhaps off, pl, and all
would be clearer?
I'm curious if you've measured the performance overhead of enabling this
functionality.


i'm quite worried about all setting.
all operators are functions as well, so tracking literally all function
calls might be too heavy.


all would additionally count only C and SQL functions, builtins (language
internal) are skipped. Maybe there should be option to count these as well,
perhaps with a really scare name (track_functions = all_plus_internal).


on the other hand i would like to be able to track some non-pl functions
like RI%.


These are builtin.

Regards,
Martin



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] stored procedure stats in collector

2007-09-24 Thread Martin Pihlak

Neil Conway wrote:

On Thu, 2007-09-20 at 16:08 +0300, Martin Pihlak wrote:

The GUC variable stats_function_level now takes 3 values: on, off and all.


That seems a confusing set of values. Perhaps off, pl, and all
would be clearer?


Makes sense. It appears that the stats_ prefixed GUC names are deprecated now.
Will rename to track_functions and change values to off, pl and
all. Or should I use none instead of off?


I'm curious if you've measured the performance overhead of enabling this
functionality.



Just finished a simple benchmark comparing unpatched, off and all:

unpatched usrsystotal
dummy 21.36  0.68   22.04
rfact 51.52  0.07   51.59
ifact 20.40  0.06   20.45

off usrsystotal  diff x
dummy 21.59  0.39   21.97  1.00
rfact 52.40  0.06   52.46  1.02
ifact 20.78  0.02   20.8   1.02

all usrsystotal  diff x
dummy 21.69  35.18  56.87  2.58
rfact 47.83  38.99  86.83  1.68
ifact 20.65  0.36   21.01  1.03

dummy is blank pl/pgsql function (ran 1000 times). rfact is recursive 
factorial,
ifact is iterative (both ran 100 times). Comparing median of 3 runs. The 
usr and
sys values are from log_executor_stats.

It seems that the overhead is unnoticeable if disabled, very visible for 
lightweight
functions and heavy callers. Almost unnoticeable for more compute intensive 
functions.



PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?


It sounds like a useful feature to me.


Good. I'll bring the patch up to date with HEAD.

Regards,
Martin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] stored procedure stats in collector

2007-09-24 Thread Tom Lane
Martin Pihlak [EMAIL PROTECTED] writes:
 Neil Conway wrote:
 That seems a confusing set of values. Perhaps off, pl, and all
 would be clearer?

 Makes sense. It appears that the stats_ prefixed GUC names are deprecated now.
 Will rename to track_functions and change values to off, pl and
 all. Or should I use none instead of off?

None seems good, by analogy with log_statement's values.

 It seems that the overhead is unnoticeable if disabled, very visible
 for lightweight functions and heavy callers. Almost unnoticeable for
 more compute intensive functions.

The very high system times seem odd.  Maybe you have a machine with
slow gettimeofday()?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] stored procedure stats in collector

2007-09-24 Thread Martin Pihlak

It seems that the overhead is unnoticeable if disabled, very visible
for lightweight functions and heavy callers. Almost unnoticeable for
more compute intensive functions.


The very high system times seem odd.  Maybe you have a machine with
slow gettimeofday()?



So it seems (on Ubuntu 7.04). Reran the benchmark on another system and
got more reasonable results:

none  usr  systotal
dummy   15.9 0.68   16.58
rfact   39.370.07   39.44
ifact   12.890.01   12.9

all   usr  systotaldiff x
dummy   20.824.44   25.261.52
rfact   45.174.17   49.341.25
ifact   12.920.03   12.961.00

Regards,
Martin

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] stored procedure stats in collector

2007-09-20 Thread Martin Pihlak

Howdy,

Here's an updated version of the function stats patch. The biggest change is
that by default only procedural language functions are counted. The GUC variable
stats_function_level now takes 3 values: on, off and all. The latter also
counts SQL and C language functions, on means only to count procedural
language functions. The decision is now made in fmgr_info_cxt_security()
Trigger functions are now also counted.

Sample output:

select procname, calls, total_time, total_cpu, self_time, self_cpu
from pg_stat_user_functions order by self_cpu desc limit 5;

  procname  | calls | total_time | total_cpu | self_time | self_cpu
+---++---+---+--
 next_batch | 32765 |  27139 |  8574 | 27139 | 8574
 fetch_batch_events |  3636 |   9252 |  5622 |  3771 | 2717
 batch_event_sql|  3636 |   5454 |  2888 |  3910 | 1962
 finish_batch   |  3420 |   3215 |  1475 |  3215 | 1475
 batch_event_tables |  3636 |   1448 |   865 |  1434 |  858
(5 rows)

This is still 8.2 only (tested on 8.2.4 and 8.2.5), has seen some production
usage here at Skype (about a month on reasonably busy boxes). So far so good.

Couple of issues:
- sometimes self_cpu  self_time - currently blaming it on Linux version of
gettimeofday().
- dropped functions are not purged from stats, might bloat the stats file for
some usage patterns.

PS. Would something like this be a canditate for 8.4 inclusion (if polished up)?

Regards,
Martin



8.2-procstat.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] stored procedure stats in collector

2007-09-20 Thread Neil Conway
On Thu, 2007-09-20 at 16:08 +0300, Martin Pihlak wrote:
 The GUC variable stats_function_level now takes 3 values: on, off and all.

That seems a confusing set of values. Perhaps off, pl, and all
would be clearer?

I'm curious if you've measured the performance overhead of enabling this
functionality.

 PS. Would something like this be a canditate for 8.4 inclusion (if polished 
 up)?

It sounds like a useful feature to me.

-Neil



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] stored procedure stats in collector

2007-07-16 Thread Martin Pihlak

Neil Conway wrote:


(schemaname, procname, nargs) is still ambiguous in the face of function
overloading. Although the presence of procid uniquely identifies each
function anyway, if you're going to include the name and argument
information, it might be worth including the argument types as well (as
an array of regtype, perhaps).



This is true. I was being a bit selfish here - in our environment we don't
use overloading that much - schema, name and nargs are usually sufficient.
It is also convinient to be able to do select * from pg... and have the
output not to wrap around.


From looking quickly at the patch, I don't think the current coding

handles set-returning functions (ExecMakeTableFunctionResult).



Hmm, should be handled. The number of tuples returned is not counted though,
perhaps this is another thing to add ...

Regards,
Martin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] stored procedure stats in collector

2007-07-16 Thread Martin Pihlak

Tom Lane wrote:

I really dislike that approach to deciding which functions to count.
The main problem with it is that it will try to count C-language
functions that are added after initdb, such as contrib stuff and
third-party add-ons like postgis.  The percentage overhead for a
typical short C function will be large, and I'm not sure anything
much is to be gained by counting these.


Agreed, it is a bit ugly. In the initial version there was no such
exclusion. This came only after I accidentally activated the
collection on a busy server. Checked cpu usage some half an hour later
and immediately disabled it. System mode consumption was unacceptably
high, but the results indicated that something funny was going
on inside application as well - some internal functions such as texteq()
and now() had unreasonably high call counts. I'm still investigating
those issues, but maybe it might be useful to have stats for internal
and C language functions as well?

Perhaps it is possible to make the collection configurable per language.
So that for instance, default is to only collect pl languages, with the
option to add C or internal. Not sure how to approach this though.


I think a more reasonable approach would be to count PL-language
functions; which in turn suggests that the instrumentation hooks
should be in the PL call handlers, not in ExecMakeFunctionResult
and friends where they will drag down performance of all functions.


It would be great if all the procedural languages would go through a
single entry point. Right now the more exotic PL-s would need separate
patching. But indeed, this approach would be less intrusive.


BTW, I dunno if you've thought about the implications of inlining
of SQL functions ... it's hard to see how to count those reasonably.


Yes, this creates some inconsistencies in what is collected and what not.
Unless, of course, only PL functions are counted :)

Regards,
Martin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] stored procedure stats in collector

2007-07-15 Thread Tom Lane
Martin Pihlak [EMAIL PROTECTED] writes:
 I'm working on a patch to extend the stats collector to handle stored 
 procedure
 statistics (call counts, duration etc).
 ...
 Only functions with oid = FirstNormalObjectId are accounted.

I really dislike that approach to deciding which functions to count.
The main problem with it is that it will try to count C-language
functions that are added after initdb, such as contrib stuff and
third-party add-ons like postgis.  The percentage overhead for a
typical short C function will be large, and I'm not sure anything
much is to be gained by counting these.

I think a more reasonable approach would be to count PL-language
functions; which in turn suggests that the instrumentation hooks
should be in the PL call handlers, not in ExecMakeFunctionResult
and friends where they will drag down performance of all functions.

BTW, I dunno if you've thought about the implications of inlining
of SQL functions ... it's hard to see how to count those reasonably.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[HACKERS] stored procedure stats in collector

2007-07-13 Thread Martin Pihlak

Howdy,

I'm working on a patch to extend the stats collector to handle stored procedure
statistics (call counts, duration etc). The goal is to make this information
visible via pg_stat functions/views. The collection would be controllable via
stats_function_level GUC and will have minimal overhead when turned off. At
our company we depend heavily on stored procedures and such a tool would be of
great help. Perhaps others could also find it somewhat useful.

So far I have a working patch against 8.2.4 (attached) that places some
instrumentation in the executor (ExecMakeFunctionResult,
ExecMakeFunctionResultNoSets and ExecMakeTableFunctionResult) and send the
stats to the collector (handled similarly to table stats).  The results are
visible through pg_stat_user_functions:


martinp=# set stats_function_level = on;
SET
Time: 0.172 ms

martinp=# select count(f3()) from generate_series(1, 1000);
 count
---
  1000
(1 row)

Time: 40059.713 ms

martinp=# select * from pg_stat_user_functions ;
 procid | schemaname | procname | nargs | calls | total_time | total_cpu | 
self_time | self_cpu
++--+---+---++---+---+--
  16388 | public | f1   | 0 |  4000 |  14978 |  8352 | 
14978 | 8352
  16389 | public | f2   | 0 |  2000 |  40044 |  8364 | 
25066 |   12
  16390 | public | f3   | 0 |  1000 |  40054 |  8364 |  
   9 |0
(3 rows)

Time units are in milliseconds. Only functions with oid = FirstNormalObjectId
are accounted.

This is of course still very experimental - some work is required to get this
into production shape. I was hoping to get some feedback before I continue
though. Maybe there are some obvious flaws that I'm not seeing, maybe something
needs to be changed to make this more generally useful. Feedback is greatly
appreciated.

Regards,
Martin


8.2.4-procstat.patch.gz
Description: GNU Zip compressed data

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] stored procedure stats in collector

2007-07-13 Thread Neil Conway
On Fri, 2007-07-13 at 14:11 +0300, Martin Pihlak wrote:
 I'm working on a patch to extend the stats collector to handle stored 
 procedure
 statistics (call counts, duration etc). The goal is to make this information
 visible via pg_stat functions/views. The collection would be controllable via
 stats_function_level GUC and will have minimal overhead when turned off. At
 our company we depend heavily on stored procedures and such a tool would be of
 great help. Perhaps others could also find it somewhat useful.

Very cool, certainly sounds like a useful feature to me.

 martinp=# select * from pg_stat_user_functions ;
   procid | schemaname | procname | nargs | calls | total_time | total_cpu | 
 self_time | self_cpu
 ++--+---+---++---+---+--
16388 | public | f1   | 0 |  4000 |  14978 |  8352 |   
   14978 | 8352
16389 | public | f2   | 0 |  2000 |  40044 |  8364 |   
   25066 |   12
16390 | public | f3   | 0 |  1000 |  40054 |  8364 |   
   9 |0
 (3 rows)

(schemaname, procname, nargs) is still ambiguous in the face of function
overloading. Although the presence of procid uniquely identifies each
function anyway, if you're going to include the name and argument
information, it might be worth including the argument types as well (as
an array of regtype, perhaps).

 Only functions with oid = FirstNormalObjectId are accounted.

Sounds reasonable to me; adding profiling to every DirectFunctionCall
invocation is likely to be too expensive anyway.

From looking quickly at the patch, I don't think the current coding
handles set-returning functions (ExecMakeTableFunctionResult).

-Neil



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org