Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Greg Smith

On Sun, 7 Dec 2008, Alex Hunsaker wrote:


(dual core machine, --enable-debug, --enable-cassert build)
pgbench -c 2 -T60 -n -f test.sql

HEAD: tps = 9.674423
PATCH: tps = 9.695784


Two general suggestions here, not specific to this patch:

While it's good to do most testing with debug and cassert turned on, you 
shouldn't report performance results with those two flags enabled.  What 
if the patch has some large amount of overhead that only shows up when 
compiled with debug or asserts on?  You'd end up reporting a performance 
loss that doesn't actually exist in a real build.  Unfortunately, the only 
way to get good performance results is to have a parallel build done with 
those off, in addition to the debug/assert one used to catch bugs.


The above pgbench is executing less than 600 actual tests (60 seconds @ 
9.7TPS).  That seems a bit short to me.  If you sorted out the above and 
run this again, it would be good to let pgbench run for a lot longer than 
1 minute, to see if the results show some more significant difference. 
With this few TPS, it would be nice to let that run for 30 minutes or more 
if you can find some time to schedule that.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
 I thought that output of new counters are too wide and it brakes
 compatibility of EXPLAIN ANALYZE. On the other hand, we don't have to
 think of compatibility in EXPLAIN ANALYZE VERBOSE because it is newly
 added in 8.4. However, overheads should be avoided. We could have
 two kinds of instrumentations, time-only or all-stats.

I've got a serious problem with the way that this patch is being
presented.  It's being named and described as though it's just another
contrib module, but in fact it makes invasive, undocumented changes to
the behavior of the core EXPLAIN functionality --- changes that
certainly cannot be claimed to having been agreed to by the community,
since most of us probably weren't aware that there was any such thing
going on inside this patch.

Please split this into two separate patches that can be separately
evaluated.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Robert Haas
 On the other hand, we don't have to
 think of compatibility in EXPLAIN ANALYZE VERBOSE because it is newly
 added in 8.4.

Uh, it exists for me in 8.2.9.

Welcome to psql 8.2.9, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

portal=# explain analyze verbose select 1;
 QUERY PLAN



{RESULT
:startup_cost 0.00
:total_cost 0.01
:plan_rows 1
:plan_width 0
:targetlist (
   {TARGETENTRY
   :expr
  {CONST
  :consttype 23
  :constlen 4
  :constbyval true
  :constisnull false
  :constvalue 4 [ 1 0 0 0 ]
  }
   :resno 1
   :resname ?column?
   :ressortgroupref 0
   :resorigtbl 0
   :resorigcol 0
   :resjunk false
   }
)
:qual 
:lefttree 
:righttree 
:initPlan 
:extParam (b)
:allParam (b)
:nParamExec 0
:resconstantqual 
}

 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.010..0.012 rows=1 loops
=1)
 Total runtime: 0.244 ms
(35 rows)

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Gregory Stark
Robert Haas [EMAIL PROTECTED] writes:

 On the other hand, we don't have to
 think of compatibility in EXPLAIN ANALYZE VERBOSE because it is newly
 added in 8.4.

 Uh, it exists for me in 8.2.9.

The current behaviour is newly added in 8.4. In 8.2 it meant something
completely different and quite useless for end-users in any case, so backwards
compatibility isn't important.

What strikes me as a convenient approach is basically using EXPLAIN VERBOSE as
a playground where we feel free to add everything we think of. If people run a
command marked VERBOSE and complain it prints too much...

As stuff matures and becomes indispensable we could consider moving it to the
regular EXPLAIN or implement some way to specify precisely which data the user
wants. Or just say XML/table data/whatever will solve the problem for us.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Tom Lane
Robert Haas [EMAIL PROTECTED] writes:
 On the other hand, we don't have to
 think of compatibility in EXPLAIN ANALYZE VERBOSE because it is newly
 added in 8.4.

 Uh, it exists for me in 8.2.9.

EXPLAIN VERBOSE has existed at least back to 7.0, probably further.
However, we've felt free to whack around what it outputs, so maybe
the backwards-compatibility issue isn't very strong.

A possibly stronger complaint is that ANALYZE and VERBOSE have always
been orthogonal options to EXPLAIN, and now there'd be some interaction
between them.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Robert Haas
 As stuff matures and becomes indispensable we could consider moving it to the
 regular EXPLAIN or implement some way to specify precisely which data the user
 wants. Or just say XML/table data/whatever will solve the problem for us.

I think some way to specify precisely which data the user wants is the
way to go.  The amount of data that there is to be printed is only
going to continue to increase.  If the only toggle is a boolean flag
to display ALL or NONE of it, then every time someone proposes a new
type of output, we're going to argue about whether it's useful enough
to be worth the display real estate.

I'm not sure what the best way is though.  I don't think continuing to
add keywords between EXPLAIN and the start of the query is very
scalable.  Putting parentheses around the option list seems like it
might eliminate a lot of grammar headaches:

EXPLAIN (option, option, option...) SELECT ...

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Vladimir Sitnikov

 I'm not sure what the best way is though.  I don't think continuing to
 add keywords between EXPLAIN and the start of the query is very
 scalable.  Putting parentheses around the option list seems like it
 might eliminate a lot of grammar headaches:

Do you think it is required to invent special grammar just for presentation
purposes?

I guess database should not deal with presentation. Provided explain
retuns table, it is up to the client to do the formatting. I do not believe
it makes sense creating several different explain outputs, and redo all the
work in 8.5.

It still could make sense having several options for explain if that would
result in *different instrumentation *(e.g. explain vs explain analyze).



Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Alex Hunsaker
On Tue, Dec 9, 2008 at 01:20, Greg Smith [EMAIL PROTECTED] wrote:
 On Sun, 7 Dec 2008, Alex Hunsaker wrote:

 (dual core machine, --enable-debug, --enable-cassert build)
 pgbench -c 2 -T60 -n -f test.sql

 HEAD: tps = 9.674423
 PATCH: tps = 9.695784

 Two general suggestions here, not specific to this patch:

 While it's good to do most testing with debug and cassert turned on, you
 shouldn't report performance results with those two flags enabled.  What if
 the patch has some large amount of overhead that only shows up when compiled
 with debug or asserts on?  You'd end up reporting a performance loss that
 doesn't actually exist in a real build.  Unfortunately, the only way to get
 good performance results is to have a parallel build done with those off, in
 addition to the debug/assert one used to catch bugs.

Right, which is part of the reason I noted it was a cassert build.

 The above pgbench is executing less than 600 actual tests (60 seconds @
 9.7TPS).  That seems a bit short to me.  If you sorted out the above and run
 this again, it would be good to let pgbench run for a lot longer than 1
 minute, to see if the results show some more significant difference. With
 this few TPS, it would be nice to let that run for 30 minutes or more if you
 can find some time to schedule that.

Ok thats useful to know as well, thanks! (ill go re-run them)

 --
 * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Alex Hunsaker
On Mon, Dec 8, 2008 at 23:28, ITAGAKI Takahiro
[EMAIL PROTECTED] wrote:

 Alex Hunsaker [EMAIL PROTECTED] wrote:

 I was assigned to review this.

 Thanks for your reviewing.

 I assume that the basic concepts are ok and focus of discussion is in:
  - New counters in struct Instrumentation.
(buffer usage and CPU usage)
  - Should EXPLAIN ANALYZE show those counters.

Right, I would split out your next patch in 3 parts: the hooks you
need, contrib module and the new counters.  I think I saw older
versions of the patch that did this... just got lost for this version?


 Performance review
 HEAD: tps = 9.674423
 PATCH: tps = 9.695784

 If it claims to improve performance, does it?
 Does it slow down other things?

 The patch should not slow down normal use if you don't use
 pg_stat_statements module, but it might slow down EXPLAIN ANALYZE
 because some fields are added in struct Instrumentation and
 they are counted up per tuple in EXPLAIN ANALYZE.

Err yes sorry I was just following
http://wiki.postgresql.org/wiki/Reviewing_a_Patch, those two did not
seem pertainant so I did not answer them.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Greg Stark
Yes this is one reasonable option, as is the idea of using XML or a  
table and making it the client's problem. Neither are going to happen  
for this release I think.


And in any case it will always be useful to have an option to print  
all the available information anyways so we make as well do that with  
verbose.



--
Greg


On 9 Dec 2008, at 16:35, Robert Haas [EMAIL PROTECTED] wrote:

As stuff matures and becomes indispensable we could consider moving  
it to the
regular EXPLAIN or implement some way to specify precisely which  
data the user
wants. Or just say XML/table data/whatever will solve the problem  
for us.


I think some way to specify precisely which data the user wants is the
way to go.  The amount of data that there is to be printed is only
going to continue to increase.  If the only toggle is a boolean flag
to display ALL or NONE of it, then every time someone proposes a new
type of output, we're going to argue about whether it's useful enough
to be worth the display real estate.

I'm not sure what the best way is though.  I don't think continuing to
add keywords between EXPLAIN and the start of the query is very
scalable.  Putting parentheses around the option list seems like it
might eliminate a lot of grammar headaches:

EXPLAIN (option, option, option...) SELECT ...

...Robert


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Vladimir Sitnikov
On Tue, Dec 9, 2008 at 8:53 PM, Robert Haas [EMAIL PROTECTED] wrote:

 On Tue, Dec 9, 2008 at 12:44 PM, Greg Stark [EMAIL PROTECTED]
 wrote:
  Yes this is one reasonable option, as is the idea of using XML or a table
  and making it the client's problem. Neither are going to happen for this
  release I think.

 Agreed.

I 100% agree with that point. Thus I suggest output additional information
into explain analyze since:
 1) it will require minimal code change
 2) it will be consistent with previous behaviour
 3) looks like a natural EXPLAIN's feature improvement
 4) will be anyway changed when table for explain will come


  And in any case it will always be useful to have an option to print all
 the
  available information anyways so we make as well do that with verbose.

 Sounds very nice.

Can I ask my question once again?
Why don't you want to make print all the info the default output format?
As long as it comes to pgsql-performance, they used to recommend: please,
provide EXPLAIN ANALYZE, and not just EXPLAIN.
If the default output format is not changed in 8.4, this will transform into
please, provide EXPLAIN ANALYZE VERBOSE, not just EXPLAIN ANALYZE or
EXPLAIN. Do you really want that?


Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Robert Haas
On Tue, Dec 9, 2008 at 12:44 PM, Greg Stark [EMAIL PROTECTED] wrote:
 Yes this is one reasonable option, as is the idea of using XML or a table
 and making it the client's problem. Neither are going to happen for this
 release I think.

Agreed.

 And in any case it will always be useful to have an option to print all the
 available information anyways so we make as well do that with verbose.

Sounds very nice.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Alex Hunsaker
On Sun, Dec 7, 2008 at 19:13, Alex Hunsaker [EMAIL PROTECTED] wrote:
 On Tue, Dec 2, 2008 at 02:35, ITAGAKI Takahiro
 [EMAIL PROTECTED] wrote:
 Here is an update version of contrib/pg_stat_statements.

 Hello again!

 I was assigned to review this.

... Some other things I accidentally left out.

#define GUCNAME(name)   (statistics. name)

Why statistics?
Would not something like stat_statements make more sense?  Statistics
seems fairly arbitrary...

Also per the
/* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */

Maybe it should be configurable, personally I would want something
like # of calls / time.  Mainly because I don't for instance really
care that my backups get tracked but would be more interested in the
things that get called most often that also take the longest.  (aka
the most bang for the buck, as far as optimizing those goes...)

?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread ITAGAKI Takahiro

Tom Lane [EMAIL PROTECTED] wrote:

 Please split this into two separate patches that can be separately
 evaluated.

Sure. I want to disucuss only where to add counters of buffer usage
and cpu usage, or they should not be added. However, it seems to
affect future of EXPLAIN ANALYZE, so we might also need to discuss
about EXPLAIN.

I assume we have 3 choices here:

1. Add those counters to struct Instrument.
We can get statistics for each line in EXPLAIN ANALYZE,
but it might have overhead to update counters.

2. Add those counters only to top instruments (one per query).
We can get accumulated statistics for each query.
It might be unsufficient for complex queries.

3. Should not add any counters.
No changes to core, but usability of pg_stat_statement module
would be very poor...

Which should we take? or are there another idea?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread ITAGAKI Takahiro

Alex Hunsaker [EMAIL PROTECTED] wrote:

 #define GUCNAME(name) (statistics. name)
 
 Why statistics?
 Would not something like stat_statements make more sense?  Statistics
 seems fairly arbitrary...

Not to use duplicated statements words;
variable names contains statements already.
- stat_statements.max_statements
- stat_statements.track_statements
seem to be ugly for me, but avoiding arbitrariness might be more
important. If there are agreements, I will to change the prefix.


 Also per the
 /* XXX: Should USAGE_EXEC reflect execution time and/or buffer usage? */
 
 Maybe it should be configurable, personally I would want something
 like # of calls / time.  Mainly because I don't for instance really
 care that my backups get tracked but would be more interested in the
 things that get called most often that also take the longest.  (aka
 the most bang for the buck, as far as optimizing those goes...)

Configurability is better, but we need documentations of how to
configure them and I have no clear idea for it. Also, we already have
means for logging slow queries. We could use the logging for slow
queries executed rarely and use the module queries executed many times.

Excluding backup scripts is easy; You can create a database role for
backup and disable statement-tracking for the user using ALTER ROLE.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-08 Thread ITAGAKI Takahiro

Alex Hunsaker [EMAIL PROTECTED] wrote:

 I was assigned to review this.

Thanks for your reviewing.
I assume that the basic concepts are ok and focus of discussion is in:
  - New counters in struct Instrumentation.
(buffer usage and CPU usage)
  - Should EXPLAIN ANALYZE show those counters.

 Performance review
 HEAD: tps = 9.674423
 PATCH: tps = 9.695784
 
 If it claims to improve performance, does it?
 Does it slow down other things?

The patch should not slow down normal use if you don't use
pg_stat_statements module, but it might slow down EXPLAIN ANALYZE
because some fields are added in struct Instrumentation and
they are counted up per tuple in EXPLAIN ANALYZE.

 Also find attached some very minor verbiage changes.

Thanks. I'll apply your fixes.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-08 Thread ITAGAKI Takahiro

Vladimir Sitnikov [EMAIL PROTECTED] wrote:

  2. EXPLAIN ANALYZE VERBOSE shows buffer statistics in 'actual' section.
 
 I do not get the point of VERBOSE.
 As far as I understand, explain analyze (without verbose) will anyway add
 overhead for calculation of gets/hits/cpu. Why discard that information in
 non verbose mode? Just to make the investigation more complex?

I thought that output of new counters are too wide and it brakes
compatibility of EXPLAIN ANALYZE. On the other hand, we don't have to
think of compatibility in EXPLAIN ANALYZE VERBOSE because it is newly
added in 8.4. However, overheads should be avoided. We could have
two kinds of instrumentations, time-only or all-stats.

Another idea is to have buffer and cpu counters not in Instrumentation
but in QueryDesc (i.e, only one per query). A new field for the counter
will be initialized in pg_stat_statements module. EXPLAIN ANALYZE could
also use the field but the result is printed only the end of query plan,
as the follows. We can avoid too-wide-line problem with the approach.

=# EXPLAIN ANALYZE SELECT ...;
QUERY PLAN
---
 ...
 Total runtime: 24.273 ms
 CPU usage: user=20.2 sys=2.4   # 
 Buffer usage: gets=100 reads=10 writes=20  # added lines
 Tempfile i/o: reads=50 writes=50   # 
(10 rows)


 I wish there was a way to get the results of explain into some table. I wish
 it was the default output format. That would make life of pgAdmin easier,
 and improve readability even in psql. Do not you think there is something
 wrong with having cost=... rows=... loops=... in each and every row?

I absolutely agree that XML-explain and TABLE-expalin are useful,
but they are independent features from pg_stat_statements. So I'd
like to discuss them in separated threads (and it will come in 8.5).

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-08 Thread Vladimir Sitnikov


   2. EXPLAIN ANALYZE VERBOSE shows buffer statistics in 'actual' section.
  
  I do not get the point of VERBOSE.
  As far as I understand, explain analyze (without verbose) will anyway add
  overhead for calculation of gets/hits/cpu. Why discard that information
 in
  non verbose mode? Just to make the investigation more complex?

 I thought that output of new counters are too wide and it brakes
 compatibility of EXPLAIN ANALYZE. On the other hand, we don't have to
 think of compatibility in EXPLAIN ANALYZE VERBOSE because it is newly
 added in 8.4. However, overheads should be avoided. We could have
 two kinds of instrumentations, time-only or all-stats.

I am not sure if this impact (in terms of compatibility) is really that big.
As far as I could understand, pgAdmin parses modified explain analyze output
well. However, pgAdmin does not support verbose.

The impact in terms of computational overhead could be measured. I guess
it depends only on the number of calls to the instrumentation (that is
either rows or loops). Thus, after explaining some basic select from
generate_series, one could adjust the timings. We could even incorporate
this to EXPLAIN ANALYZE, so it would display cpu is 0.5s, while
instrumentation cpu is 0.3s.


Another idea is to have buffer and cpu counters not in Instrumentation
 but in QueryDesc (i.e, only one per query). A new field for the counter
 will be initialized in pg_stat_statements module. EXPLAIN ANALYZE could
 also use the field but the result is printed only the end of query plan,
 as the follows. We can avoid too-wide-line problem with the approach.

Single number per query is sufficient only for pg_stat_statements purposes.
That will give an insight of what the top consuming queries are (by cpu
time, by gets, etc).
However, single gets=... reads=... is not sufficient to pinpoint the
problem especially in case of complex query (that is comparable to query
returned N rows vs this plan node returned N rows) .



 =# EXPLAIN ANALYZE SELECT ...;
QUERY PLAN
 ---
  ...
  Total runtime: 24.273 ms
  CPU usage: user=20.2 sys=2.4   #
  Buffer usage: gets=100 reads=10 writes=20  # added lines
  Tempfile i/o: reads=50 writes=50   #
 (10 rows)

I wish pgAdmin (or whatever client) had an option to fetch that counters for
each and every SQL query and display the consumed resources at a separate
tab. I mean, even before/after plain select (without any explain). That
will show you how the query would behave without any instrumentation.

Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-07 Thread Alex Hunsaker
On Tue, Dec 2, 2008 at 02:35, ITAGAKI Takahiro
[EMAIL PROTECTED] wrote:
 Here is an update version of contrib/pg_stat_statements.

Hello again!

I was assigned to review this.

Submission review:
Is the patch in standard form? Yes
Does it apply cleanly to current HEAD? Yes (with fuzz)
Does it include reasonable tests, docs, etc? Yes

Usability review:
Does the patch actually implement that? Yes
Do we want that? I think so
Do we already have it? No
Does it follow SQL spec, or the community-agreed behavior? Sure
Are there dangers? No
Have all the bases been covered? Yes

Feature test:
Does the feature work as advertised? Yes
Are there corner cases the author has failed to consider? No

Performance review
Does the patch slow down simple tests?

Does not seem to...

(test.sql)
select * from tenk1 a join tenk1 b using (unique1);

(dual core machine, --enable-debug, --enable-cassert build)
pgbench -c 2 -T60 -n -f test.sql

HEAD: tps = 9.674423
PATCH: tps = 9.695784

If it claims to improve performance, does it?
Does it slow down other things?

Coding review:
Does it follow the project coding guidelines? Yes
Are there portability issues? No
Will it work on Windows/BSD etc? Think so
Are the comments sufficient and accurate? I think so
Does it do what it says, correctly? Yes
Does it produce compiler warnings? No
Can you make it crash? No

I'm not sure about the new counters in struct Instrumentation or the
hooks (but did not see anything obviously wrong with them)... A
commiter can better comment on those.  Also find attached some very
minor verbiage changes.  If there is nothing else on your todo list
for this Ill mark it as Ready for commiter on the wiki.
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
***
*** 216,222  error:
  }
  
  /*
!  * pgss_shutdown - Load statistics from file.
   */
  static void
  pgss_startup(void)
--- 216,222 
  }
  
  /*
!  * pgss_startup - Load statistics from file.
   */
  static void
  pgss_startup(void)
*** a/doc/src/sgml/pgstatstatements.sgml
--- b/doc/src/sgml/pgstatstatements.sgml
***
*** 68,74 
entrystructfieldcalls/structfield/entry
entrytypebigint/type/entry
entry/entry
!   entryNumber of being executed/entry
   /row
  
   row
--- 68,74 
entrystructfieldcalls/structfield/entry
entrytypebigint/type/entry
entry/entry
!   entryNumber of times executed/entry
   /row
  
   row

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-05 Thread Greg Smith

On Fri, 5 Dec 2008, Vladimir Sitnikov wrote:


Oracle's approach is to have the explain command stuff the results into a
table. That has advantages for tools, especially if you want to be able to
look at plans generated by other sessions.


I do not believe that workflow makes sense. I have never ever thought of it.


The main benefit is that you can track how EXPLAIN plans change over time. 
Archive a snapshot of what the plans for all your common queries look like 
every day, and the day one of them blows up and starts doing something 
wrong you've got a *lot* more information to work with for figuring out 
what happened--whether it was a minor query change, some stats that got 
slowly out of whack, etc.  I wouldn't just immediately dismiss that 
workflow as unsensible without thinking about it a bit first, there are 
some good advantages to it.


Greg Sabino Mullane had a neat blog entry on saving plans to tables in 
PostgreSQL, unfortunately the Planet PostgreSQL outage seems to have eaten 
it.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-05 Thread Vladimir Sitnikov

 The main benefit is that you can track how EXPLAIN plans change over time.

It is not required to output plan *into* some table to be able track it over
time. If EXPLAIN returns a table, it is up to you to perform insert into
history select * from explain(...).

Workflow that does not make sense for me is look at plans generated _into
some plan_table_ by other sessions in Oracle.
I am 100% sure it really makes sense have some view like pg_execute_plan
that will reveal execution plans for currently running queries (see
v$sql_plan in Oracle). However, I would stress once again I have no idea
what the sense could be in one session explained into plan_table, while the
other reads that plan.

Does that make sense?

Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-04 Thread Vladimir Sitnikov
 2. EXPLAIN ANALYZE VERBOSE shows buffer statistics in 'actual' section.

I do not get the point of VERBOSE.
As far as I understand, explain analyze (without verbose) will anyway add
overhead for calculation of gets/hits/cpu. Why discard that information in
non verbose mode? Just to make the investigation more complex?

Write-counters are not included because I think they are not so useful.

Never say never. I guess they (or just one counter for accumulated writes)
could be useful for monitoring operations that spill to the disk. For
instance, PostgreSQL does not show the amount of temp used for the join.



buffer_gets;/* # of buffer hits */
buffer_hits;/* # of buffer gets */
buffile_reads;  /* # of buffile reads */

I guess it makes sense expanding buffile reads into buffer file reads or
just file reads

Here is an sample output. We'd better to add a linebreak before
 the 'actual' section because the line is too wide to display.

I wish there was a way to get the results of explain into some table. I wish
it was the default output format. That would make life of pgAdmin easier,
and improve readability even in psql. Do not you think there is something
wrong with having cost=... rows=... loops=... in each and every row?



 ResetBufferUsage() is save the current counters in global variables as
 baseline and buffer statistics are measured in difference from them
 because the counters are used by struct Instrumentation.

That would definitely work well for Instrumentation (it will not notice
resetBufferUsage any more), however that will not isolate the guys who do
the reset. I am afraid the idea of having api for reset is broken and it
makes sense removing that function. However it looks like it is out of scope
of this patch.

Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-04 Thread Gregory Stark
Vladimir Sitnikov [EMAIL PROTECTED] writes:

 I wish there was a way to get the results of explain into some table. I wish
 it was the default output format. That would make life of pgAdmin easier,
 and improve readability even in psql. Do not you think there is something
 wrong with having cost=... rows=... loops=... in each and every row?

A number of people have suggesting we switch to XML.

An alternative would be to build up a tuplestore of data and then send that to
the client in a separate result set. That's kind of nice because it would give
us a way to send both the real results and the explain results. And at least
we already have an api for accessing result sets.

Oracle's approach is to have the explain command stuff the results into a
table. That has advantages for tools, especially if you want to be able to
look at plans generated by other sessions. But it's pretty awkward for the
simple case.

I'm sure there are dozens of ways to skin this cat. Anyone have any more? 
We probably just have to pick one and run with it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-04 Thread Vladimir Sitnikov

 Vladimir Sitnikov [EMAIL PROTECTED] writes:

  I wish there was a way to get the results of explain into some table. I
 wish
  it was the default output format. That would make life of pgAdmin
 easier,
  and improve readability even in psql. Do not you think there is something
  wrong with having cost=... rows=... loops=... in each and every row?

 A number of people have suggesting we switch to XML.

I do not see much benefit of XML:
 * XML is not human-readable
 * Plain old result set is even easier to process since it is the main
PostgreSQL interface at this point

The only benefit of XML I could imagine is it could provide a nicer markup
for sort/hash/etc nodes. It is not that nice to have a column sort method
that would be empty nearly for all the rows. At the same time it looks fine
to have a column with xml inside for any additional information execution
node wants provide (like topN-allrows sort / number of batches in hash join
or whatever)



 An alternative would be to build up a tuplestore of data and then send that
 to
 the client in a separate result set. That's kind of nice because it would
 give
 us a way to send both the real results and the explain results. And at
 least
 we already have an api for accessing result sets.

Sounds good. As for me, current output of explain is not very easy to read:
it suits well only for find timings for particular node workflow only (I
mean, the source is a particular node, the result is
timings/rows/buffers/etc). However from my point of view, when it comes to
query tuning the main workflow is find node by suspicious timings. If all
the relevant data were displayed in the same column it would be easier to
read. Consider all the row counts in the very first column.



 Oracle's approach is to have the explain command stuff the results into a
 table. That has advantages for tools, especially if you want to be able to
 look at plans generated by other sessions.

I do not believe that workflow makes sense. I have never ever thought of it.

External table makes sense if you have several output formats (say, create a
formatting function for psql and let pgAdmin format the plan on its own)

Regards,
Vladimir Sitnikov


[HACKERS] contrib/pg_stat_statements 1202

2008-12-02 Thread ITAGAKI Takahiro
Here is an update version of contrib/pg_stat_statements.

New modifications from the last version are:
1. New counters in struct Instrumentation.
2. EXPLAIN ANALYZE VERBOSE shows buffer statistics in 'actual' section.
3. Buffer statistics counsters are not reset to zero anymore.


1. New counters in struct Instrumentation.
[in include/executor/instrument.h, backend/executor/instrument.c]

The following fields are added. They are used by pg_stat_statements
and EXPLAIN ANALYZE VERBOSE. getrusage() is called for each nodes.
Write-counters are not included because I think they are not so useful.

buffer_gets;/* # of buffer hits */
buffer_hits;/* # of buffer gets */
buffile_reads;  /* # of buffile reads */
utime;  /* user time in sec */
stime;  /* sys time in sec */


2. EXPLAIN ANALYZE VERBOSE shows buffer statistics in 'actual' section.
[in backend/commands/explain.c]

I borrowed the idea from Vladimir,
Buffer pool statistics in Explain Analyze.
http://archives.postgresql.org/message-id/[EMAIL PROTECTED]

Here is an sample output. We'd better to add a linebreak before
the 'actual' section because the line is too wide to display.

=# EXPLAIN ANALYZE VERBOSE SELECT * FROM accounts;
  
QUERY PLAN
---
 Seq Scan on accounts  (cost=0.00..2688.29 rows=101829 width=97) (actual 
time=0.072..119.633 rows=10 loops=1 gets=1670 reads=1638 local_reads=0 
CPU=0.06/0.03 sec)
   Output: aid, bid, abalance, filler
 Total runtime: 209.556 ms
(3 rows)


3. Buffer statistics counsters are not reset to zero anymore.
[in storage/buffer/bufmgr.c]

ResetBufferUsage() is save the current counters in global variables as
baseline and buffer statistics are measured in difference from them
because the counters are used by struct Instrumentation.



Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


pg_stat_statements.1202.tar.gz
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers