Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-14 Thread Itagaki Takahiro

Tom Lane t...@sss.pgh.pa.us wrote:

 2. I do not understand the stuff with propagating counts into the top
 instrumentation node.  That seems like it's going to double-count those
 counts.  In any case it is 100% inconsistent to propagate only buffer
 counts that way and not any other resource usage.  I think you should
 drop the TopInstrument variable and the logic that propagates counts up.

It is required by contrib/pg_stat_statements. EXPLAIN wants per-node
accumulation, but pg_stat_statements wants the total number.

Is it enough to add a PG_TRY block to standard_ExecutorRun() to
cleanup TopInstrument on error? I'm working on your other comments,
but I cannot remove TopInstrument for pg_state_statements.

I considerd other approaches, but all of them require node-dependent
routines; for example, adding a function to walk through a plan tree
and accumulate instrumentations in it at pg_stat_statements. But it is
hard to be maintained on executor nodes changes. Are there any better idea?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-14 Thread Tom Lane
Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 2. I do not understand the stuff with propagating counts into the top
 instrumentation node.

 It is required by contrib/pg_stat_statements. EXPLAIN wants per-node
 accumulation, but pg_stat_statements wants the total number.

Well, you need to find another way or risk getting the patch rejected
altogether.  Those global variables are the weakest part of the whole
design, and I'm not going to commit a patch that destabilizes the entire
system for the sake of a debatable requirement of a contrib module.

If you went with the alternative definition I suggested (don't reset the
static counters, so that every node includes its children's counts) then
the behavior you want would fall out automatically.  Or, at the price of
running both resettable and non-resettable static counters, you could
have pg_stat_statements obtain totals that are independent of any
particular instrumentation node.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-14 Thread Robert Haas
On Wed, Oct 14, 2009 at 9:56 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 2. I do not understand the stuff with propagating counts into the top
 instrumentation node.

 It is required by contrib/pg_stat_statements. EXPLAIN wants per-node
 accumulation, but pg_stat_statements wants the total number.

 Well, you need to find another way or risk getting the patch rejected
 altogether.  Those global variables are the weakest part of the whole
 design, and I'm not going to commit a patch that destabilizes the entire
 system for the sake of a debatable requirement of a contrib module.

 If you went with the alternative definition I suggested (don't reset the
 static counters, so that every node includes its children's counts) then
 the behavior you want would fall out automatically.  Or, at the price of
 running both resettable and non-resettable static counters, you could
 have pg_stat_statements obtain totals that are independent of any
 particular instrumentation node.

I am marking this patch as Returned with Feedback.  I hope that it
will be resubmitted for a future CommitFest, because I think this
could be pretty interesting feature.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-14 Thread Itagaki Takahiro

Robert Haas robertmh...@gmail.com wrote:

  Well, you need to find another way or risk getting the patch rejected
  altogether. ?Those global variables are the weakest part of the whole
  design, and I'm not going to commit a patch that destabilizes the entire
  system for the sake of a debatable requirement of a contrib module.
 
 I am marking this patch as Returned with Feedback.  I hope that it
 will be resubmitted for a future CommitFest, because I think this
 could be pretty interesting feature.

Ok, I'll reconsider them and re-submit patches for the next commitfest.
Maybe I need to split the patch into EXPLAIN-part and contrib-part.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-14 Thread Robert Haas
2009/10/14 Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp:

 Robert Haas robertmh...@gmail.com wrote:

  Well, you need to find another way or risk getting the patch rejected
  altogether. ?Those global variables are the weakest part of the whole
  design, and I'm not going to commit a patch that destabilizes the entire
  system for the sake of a debatable requirement of a contrib module.

 I am marking this patch as Returned with Feedback.  I hope that it
 will be resubmitted for a future CommitFest, because I think this
 could be pretty interesting feature.

 Ok, I'll reconsider them and re-submit patches for the next commitfest.
 Maybe I need to split the patch into EXPLAIN-part and contrib-part.

My (limited) experience is that it's usually better to get something
incremental committed, even if it's not what you really want.  You can
always take another crack at the remaining issues later, but if the
whole patch gets shot down then you are out of luck.

In this case, I think that the auto_explain changes out to be part of
the same patch as the core EXPLAIN changes, but if the
pg_stat_statement stuff is severable it might make sense to push that
off until later.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-14 Thread Itagaki Takahiro

Robert Haas robertmh...@gmail.com wrote:

 My (limited) experience is that it's usually better to get something
 incremental committed, even if it's not what you really want.  You can
 always take another crack at the remaining issues later, but if the
 whole patch gets shot down then you are out of luck.

Yeah, that makes sense. But the partial change should also be
a long-term solution ;-). It is hard to determine whether
the partial change is a good solution until the whole features
works as expected (at least partially).

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-14 Thread Robert Haas
On Wed, Oct 14, 2009 at 9:38 PM, Itagaki Takahiro
itagaki.takah...@oss.ntt.co.jp wrote:

 Robert Haas robertmh...@gmail.com wrote:

 My (limited) experience is that it's usually better to get something
 incremental committed, even if it's not what you really want.  You can
 always take another crack at the remaining issues later, but if the
 whole patch gets shot down then you are out of luck.

 Yeah, that makes sense. But the partial change should also be
 a long-term solution ;-). It is hard to determine whether
 the partial change is a good solution until the whole features
 works as expected (at least partially).

Well, that's an indication that you've chosen too small a piece.  But
I don't really believe that a change that affects only core EXPLAIN
and auto_explain is too small a piece to be independently useful.  If
it is, the whole feature is probably badly conceived in the first
place...

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-13 Thread Tom Lane
Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp writes:
 Here is an update version of buffer usage patch.

I started to look at this patch, and I have a few comments:

1. I was expecting this patch to get rid of ShowBufferUsage() and friends
altogether, instead of adding yet more static counters to support them.
Isn't that stuff pretty well superseded by having EXPLAIN support?

2. I do not understand the stuff with propagating counts into the top
instrumentation node.  That seems like it's going to double-count those
counts.  In any case it is 100% inconsistent to propagate only buffer
counts that way and not any other resource usage.  I think you should
drop the TopInstrument variable and the logic that propagates counts up.

3. I don't believe that you've sufficiently considered the problem of
restoring the previous value of CurrentInstrument after an error.  It is
not at all adequate to do it in postgres.c; consider subtransactions
for example.  However, so far as I can see that variable is useless
anyway.  Couldn't you just drop both that and the prev link?
(If you keep TopInstrument then the same objection applies to it.)

4. I don't believe this counting scheme works, except in the special
case where all buffer access happens in leaf plan nodes (which might be
enough if it weren't for Sort, Materialize, Hash, etc).  It looks to
me like counts will be transferred into the instrumentation node for
the next plan node to stop execution, which could be a descendant of
the node that really ought to get charged.

You could deal with #4 by having the low-level I/O routines accumulate
counts directly into *CurrentInstrument and not have static I/O counters
at all, but then you'd have to contend with fixing #3 properly instead
of just eliminating that global variable.  It might be better to add a
start field to struct Instrumentation for each counter, and do
something like this:
* StartNode copies static counter into start field
* StopNode computes delta = static counter - start field,
  then adds delta to node's count and resets counter to start
The reason for the reset is so that the I/O isn't double counted by
parent nodes.  If you wanted buffer I/O to be charged to the node
causing it *and* to all parent nodes, which would be more consistent
with the way we charge CPU time, then don't do the reset.  Offhand
though that seems to me like it'd be more surprising than useful.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-04 Thread Robert Haas
On Wed, Sep 30, 2009 at 10:40 PM, Itagaki Takahiro
itagaki.takah...@oss.ntt.co.jp wrote:

 Euler Taveira de Oliveira eu...@timbira.com wrote:

  But there are some confusions in postgres; ReadBufferCount and
  BufferHitCount are used for get and hit, but heap_blks_read
  and heap_blks_hit are used for read and hit in pg_statio_all_tables.
 I see. :(

 I fixed the confusions of get, hit and read in your patch.
    long        num_hit = ReadBufferCount + ReadLocalBufferCount;
    long        num_read = num_hit - BufferHitCount - LocalBufferHitCount;
 should be
    long        num_get = ReadBufferCount + ReadLocalBufferCount;
    long        num_hit = BufferHitCount + LocalBufferHitCount;
    long        num_read = num_get - num_hit;

 ReadBufferCount means number of buffer access :(

 Patch attached.

I took a look at this today and I have a couple of comments.  The
basic functionality looks useful, but I think the terminology is too
terse.  Specific commens:

1. In the EXPLAIN output, I think that the buffers information should
be output on its own line, rather than appended to the line that
already contains costs and execution times.  The current output
doesn't include the word buffers or blocks anywhere, which seems
to me to be a critical flaw.  I would suggest something like Blocks
Read: %ld  Hit:  %ld  Temp Read: %ld\n.  See the way we handle output
of sort type and space usage, for example.

2. Similarly, in pg_stat_statements, the Counters structure could
easily use the same names for the structure members that we already
use in e.g. pg_stat_database - blks_hit, blks_read, and, say,
blks_temp_read.  In fact I tend to think we should stick with blocks
rather than buffers overall, for consistency with what the system
does elsewhere.

3. With respect to the doc changes in explain.sgml, we consistently
use disk rather than disc in the documentation; but it may not be
necessary to use that word at all, and I think the paragraph can  be
tightened up a bit: Include information on the number of blocks read,
the number of those that are hits (already in shared buffers and do
not need to be read in), and the number of those that are reads on
temporary, backend-local buffers.  This parameter requires that the
literalANALYZE/literal parameter also be used.  This parameter
defaults to literalFALSE/literal.

4. Instrumentation stack is broken doesn't seem terribly helpful in
understanding what has gone wrong.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-04 Thread Itagaki Takahiro

Robert Haas robertmh...@gmail.com wrote:

 1. I would suggest something like Blocks
 Read: %ld  Hit:  %ld  Temp Read: %ld\n.  See the way we handle output
 of sort type and space usage, for example.

I have some questions:
  * Did you use single space and double spaces in your example intentionally?
  * Should we use lower cases here?
  * Can I use temp instead of Temp Read to shorten the name?

 2. Similarly, in pg_stat_statements, the Counters structure could
 easily use the same names for the structure members that we already
 use in e.g. pg_stat_database - blks_hit, blks_read, and, say,
 blks_temp_read.  In fact I tend to think we should stick with blocks
 rather than buffers overall, for consistency with what the system
 does elsewhere.

I agree to rename them into blks_*, but EXPLAIN (blocks) might be
misleading; EXPLAIN (buffer) can be interpreted as buffer usage,
but normally we don't call it block usage.

My suggestion is:
* EXPLAIN (buffers) prints (blocks read: %ld hit: %ld temp: %ld)
* auto_explain.log_buffers are not changed
* pg_stat_statements uses blks_hit and blks_read

 4. Instrumentation stack is broken doesn't seem terribly helpful in
 understanding what has gone wrong.

This message is only for hackers and should not occur.
Assert() might be ok instead.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-04 Thread Robert Haas
On Sun, Oct 4, 2009 at 11:22 PM, Itagaki Takahiro
itagaki.takah...@oss.ntt.co.jp wrote:
 Robert Haas robertmh...@gmail.com wrote:

 1. I would suggest something like Blocks
 Read: %ld  Hit:  %ld  Temp Read: %ld\n.  See the way we handle output
 of sort type and space usage, for example.

 I have some questions:
  * Did you use single space and double spaces in your example intentionally?

No, that was unintentional.

  * Should we use lower cases here?

No.  We don't anywhere else in explain.c.

  * Can I use temp instead of Temp Read to shorten the name?

I can't tell what that means without reading the source code.  I think
clarity should take precedence over brevity.

 2. Similarly, in pg_stat_statements, the Counters structure could
 easily use the same names for the structure members that we already
 use in e.g. pg_stat_database - blks_hit, blks_read, and, say,
 blks_temp_read.  In fact I tend to think we should stick with blocks
 rather than buffers overall, for consistency with what the system
 does elsewhere.

 I agree to rename them into blks_*, but EXPLAIN (blocks) might be
 misleading; EXPLAIN (buffer) can be interpreted as buffer usage,
 but normally we don't call it block usage.

 My suggestion is:
    * EXPLAIN (buffers) prints (blocks read: %ld hit: %ld temp: %ld)
    * auto_explain.log_buffers are not changed
    * pg_stat_statements uses blks_hit and blks_read

I agree.

 4. Instrumentation stack is broken doesn't seem terribly helpful in
 understanding what has gone wrong.

 This message is only for hackers and should not occur.
 Assert() might be ok instead.

Hmm, I think I like the idea of an Assert().  Logging a cryptic
message at DEBUG2 doesn't seem sufficient for a can't-happen condition
that probably indicates a serious bug in the code.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-04 Thread Itagaki Takahiro
Here is an update version of buffer usage patch.
  * All buffers_* and bufs_* are renamed to blks_*.
  * 'disc' = 'disk' in documentation
  * Replace debug-log to Assert().
  * Fix a bug in ResetLocalBufferUsage(). log_xxx_stats had not worked.

Robert Haas robertmh...@gmail.com wrote:

  ?* Can I use temp instead of Temp Read to shorten the name?
 I can't tell what that means without reading the source code.  I think
 clarity should take precedence over brevity.

I used temp_blks_read because we have idx_blks_read in pg_statio_xxx.


=# \d pg_stat_statements
   View public.pg_stat_statements
 Column |   Type   | Modifiers
+--+---
 userid | oid  |
 dbid   | oid  |
 query  | text |
 calls  | bigint   |
 total_time | double precision |
 rows   | bigint   |
 blks_hit   | bigint   |
 blks_read  | bigint   |
 temp_blks_read | bigint   |

=# SET work_mem = '1MB';
=# EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM pgbench_accounts ORDER BY bid;
  QUERY PLAN
---
 Sort  (cost=21913.32..22163.33 rows=15 width=97) (actual 
time=81.345..99.054 rows=10 loops=1)
   Sort Key: bid
   Sort Method:  external sort  Disk: 10472kB
   Blocks Hit: 0  Read: 0  Temp Read: 1309
   -  Seq Scan on pgbench_accounts  (cost=0.00..2667.05 rows=15 width=97) 
(actual time=0.018..23.129 rows=10 loops=1)
 Blocks Hit: 74  Read: 1694  Temp Read: 0
 Total runtime: 105.238 ms
(7 rows)

=# SET work_mem = '18MB';
=# EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM pgbench_accounts ORDER BY bid;
  QUERY PLAN
---
 Sort  (cost=10972.32..11222.33 rows=15 width=97) (actual 
time=35.437..43.069 rows=10 loops=1)
   Sort Key: bid
   Sort Method:  quicksort  Memory: 17916kB
   Blocks Hit: 0  Read: 0  Temp Read: 0
   -  Seq Scan on pgbench_accounts  (cost=0.00..2667.05 rows=15 width=97) 
(actual time=0.028..15.030 rows=10 loops=1)
 Blocks Hit: 32  Read: 1635  Temp Read: 0
 Total runtime: 52.026 ms
(7 rows)


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



buffer_usage_20091005.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-01 Thread Euler Taveira de Oliveira
Itagaki Takahiro escreveu:
 I fixed the confusions of get, hit and read in your patch.
 
Works for me. Will mark it ready for a committer.

PS BTW, your patch (20091001112006.9c36.52131...@oss.ntt.co.jp) doesn't seem
to be in archives.p.o. though I've received a copy from the server.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-01 Thread Alvaro Herrera
Euler Taveira de Oliveira wrote:
 Itagaki Takahiro escreveu:
  I fixed the confusions of get, hit and read in your patch.
  
 Works for me. Will mark it ready for a committer.
 
 PS BTW, your patch (20091001112006.9c36.52131...@oss.ntt.co.jp) doesn't seem
 to be in archives.p.o. though I've received a copy from the server.

That's indeed very strange -- I have it locally and I wasn't CCed, so
Majordomo must have delivered it.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-10-01 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Euler Taveira de Oliveira wrote:
  Itagaki Takahiro escreveu:
   I fixed the confusions of get, hit and read in your patch.
   
  Works for me. Will mark it ready for a committer.
  
  PS BTW, your patch (20091001112006.9c36.52131...@oss.ntt.co.jp) doesn't 
  seem
  to be in archives.p.o. though I've received a copy from the server.
 
 That's indeed very strange -- I have it locally and I wasn't CCed, so
 Majordomo must have delivered it.

Something was wrong with last month's archive.  For some reason it had
96000 files in that directory.  I have rerun mhonarc on it and it has
normalized now (~2100 files).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-09-30 Thread Itagaki Takahiro

Euler Taveira de Oliveira eu...@timbira.com wrote:

  But there are some confusions in postgres; ReadBufferCount and
  BufferHitCount are used for get and hit, but heap_blks_read
  and heap_blks_hit are used for read and hit in pg_statio_all_tables.
 I see. :(

I fixed the confusions of get, hit and read in your patch.
longnum_hit = ReadBufferCount + ReadLocalBufferCount;
longnum_read = num_hit - BufferHitCount - LocalBufferHitCount;
should be
longnum_get = ReadBufferCount + ReadLocalBufferCount;
longnum_hit = BufferHitCount + LocalBufferHitCount;
longnum_read = num_get - num_hit;

ReadBufferCount means number of buffer access :(

Patch attached.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



buffer_usage_20091001.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-09-28 Thread Euler Taveira de Oliveira
Hi Itagaki-san,

I'm reviewing your patch. Your patch is in good shape. It applies cleanly. All
of the things are built as intended (including the two contrib modules). It
doesn't include docs but I wrote it. Basically, I produced another patch (that
are attached) correcting some minor gripes; docs are included too. The
comments are in-line.

   static bool auto_explain_log_analyze = false;
   static bool auto_explain_log_verbose = false;
 + static bool auto_explain_log_buffer = false;
Rename it to auto_explain_log_buffers. That's because I renamed the option for
plural form. See above.

   es.verbose = auto_explain_log_verbose;
 + es.buffer = auto_explain_log_buffer;
Change this check to look at es.analyze too. So the es.buffers will only be
enabled iif the es.analyze is enabled too.


 + /* Build a tuple descriptor for our result type */
 + if (get_call_result_type(fcinfo, NULL, tupdesc) != TYPEFUNC_COMPOSITE)
 + elog(ERROR, return type must be a row type);
 + 
   per_query_ctx = rsinfo-econtext-ecxt_per_query_memory;
   oldcontext = MemoryContextSwitchTo(per_query_ctx);
   
 ! tupdesc = CreateTupleDescCopy(tupdesc);
   
Out of curiosity, any reason for this change?

   else if (strcmp(opt-defname, costs) == 0)
   es.costs = defGetBoolean(opt);
 + else if (strcmp(opt-defname, buffer) == 0)
 + es.buffer = defGetBoolean(opt);
I decided to change buffer to buffers. That's because we already have
costs and the statistics will not be about one kind of buffer so plural form
sounds more natural.

 + if (es-format == EXPLAIN_FORMAT_TEXT)
 + {
 + appendStringInfo(es-str,  (gets=%ld reads=%ld 
 temp=%ld),
 + num_gets, num_reads, num_temp);
Rename gets and reads to hit and read. Maybe we could prefix it with
buf_ or something else.

Rename num_gets and num_reads to num_hit and num_read. The later
terminology is used all over the code.

 + }
 + else
 + {
 + ExplainPropertyLong(Buffer Gets, num_gets, 
 es);
 + ExplainPropertyLong(Buffer Reads, num_reads, 
 es);
 + ExplainPropertyLong(Buffer Temp, num_temp, 
 es);
I didn't like these terminologies. I came up with Hit Buffers, Read
Buffers, and Temp Buffers. I confess that I don't like the last ones.
Read Buffers? We're reading from disk blocks. Read Blocks? Read Disk
Blocks? Read Data Blocks?
Temp Buffers? It could be temporary sort files, hash files (?), or temporary
relations. Hit Local Buffers? Local Buffers? Hit Temp Buffers?

   #include parser/parsetree.h
 + #include storage/buf_internals.h
It's not used. Removed.

 + CurrentInstrument = instr-prev;
 + }
 + else
 + elog(WARNING, Instrumentation stack is broken);
WARNING? I changed it to DEBUG2 and return immediately (as it does some lines
of code above).

 + /* for log_[parser|planner|executor|statement]_stats */
 + static long GlobalReadBufferCount;
 + static long GlobalReadLocalBufferCount;
 + static long GlobalBufferHitCount;
 + static long GlobalLocalBufferHitCount;
 + static long GlobalBufferFlushCount;
 + static long GlobalLocalBufferFlushCount;
 + static long GlobalBufFileReadCount;
 + static long GlobalBufFileWriteCount;
 + 
I'm not sure if this is the right place for these counters. Maybe we should
put it in buf_init.c. Ideas?

   boolcosts;  /* print costs */
 + boolbuffer; /* print buffer stats */
Rename it to buffers.

 + /* Buffer usage */
 + longbuffer_gets;/* # of buffer reads */
 + longbuffer_reads;   /* # of disk reads */
 + longbuffer_temp;/* # of temp file reads */
Rename them to buffers_hit, buffers_read, and buffers_temp.

I didn't test this changes with big queries because I don't have some at
hand right now. Also, I didn't notice any slowdowns caused by the patch.
Comments? If none, it is ready for a committer.


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/



buffer_usage-20090928.diff.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-09-28 Thread Itagaki Takahiro

Euler Taveira de Oliveira eu...@timbira.com wrote:

 I'm reviewing your patch. Your patch is in good shape. It applies cleanly. All
 of the things are built as intended (including the two contrib modules). It
 doesn't include docs but I wrote it. Basically, I produced another patch (that
 are attached) correcting some minor gripes; docs are included too. The
 comments are in-line.

Thanks. Except choice of words, can I think the basic architecture of
the patch is ok? The codes to handle global variables like ReadBufferCount
and GlobalReadBufferCount could be rewrite in cleaner way if we could
drop supports of log_{parser|planner|executor|statement}_stats.

  +   /* Build a tuple descriptor for our result type */
  +   if (get_call_result_type(fcinfo, NULL, tupdesc) != TYPEFUNC_COMPOSITE)
  +   elog(ERROR, return type must be a row type);
  + 
  per_query_ctx = rsinfo-econtext-ecxt_per_query_memory;
  oldcontext = MemoryContextSwitchTo(per_query_ctx);

  !   tupdesc = CreateTupleDescCopy(tupdesc);

 Out of curiosity, any reason for this change?

That's because the new code is cleaner, I think. Since the result tuple
type is defined in OUT parameters, we don't have to re-define the result
with CreateTemplateTupleDesc().

  +   appendStringInfo(es-str,  (gets=%ld reads=%ld 
  temp=%ld),
  +   num_gets, num_reads, num_temp);
 Rename gets and reads to hit and read. Maybe we could prefix it with
 buf_ or something else.
 
 Rename num_gets and num_reads to num_hit and num_read. The later
 terminology is used all over the code.

We should define the meanings of get and hit before rename them.
I'd like to propose the meanings as following:
* get  : number of page access (= hit + read)
* hit  : number of cache read (no disk read)
* read : number of disk read (= number of read() calls)

But there are some confusions in postgres; ReadBufferCount and
BufferHitCount are used for get and hit, but heap_blks_read
and heap_blks_hit are used for read and hit in pg_statio_all_tables.
Can I rename ReadBufferCount to BufferGetCount? And which values should
we show in EXPLAIN and pg_stat_statements? (two of the three are enough)

 I didn't like these terminologies. I came up with Hit Buffers, Read
 Buffers, and Temp Buffers. I confess that I don't like the last ones.
 Read Buffers? We're reading from disk blocks. Read Blocks? Read Disk
 Blocks? Read Data Blocks?
 Temp Buffers? It could be temporary sort files, hash files (?), or temporary
 relations. Hit Local Buffers? Local Buffers? Hit Temp Buffers?

I borrowed the terms Buffer Gets and Buffer Reads from STATSPACK report
in Oracle Database. But I'm willing to rename them if appropriate.
http://www.oracle.com/apps_benchmark/doc/awrrpt_20090325b_900.html#600

Presently Temp Buffers contains temporary sort files, hash files, and
materialized executor plan. Local buffer statistics for temp tables are
not included here but merged with shared buffer statistics. Are there
any better way to group them?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-09-28 Thread Euler Taveira de Oliveira
Itagaki Takahiro escreveu:
 Thanks. Except choice of words, can I think the basic architecture of
 the patch is ok? The codes to handle global variables like ReadBufferCount
 and GlobalReadBufferCount could be rewrite in cleaner way if we could
 drop supports of log_{parser|planner|executor|statement}_stats.
 
Yes, it is. I'm afraid someone is relying on that piece of code. We can ask
people if it is ok to deprecated it; but it should be removed after 2 releases
or so. BTW, Isn't it make sense to move the Global* variables to buf_init.c,
is it?

 We should define the meanings of get and hit before rename them.
 I'd like to propose the meanings as following:
 * get  : number of page access (= hit + read)
 * hit  : number of cache read (no disk read)
 * read : number of disk read (= number of read() calls)
 
+1.

 But there are some confusions in postgres; ReadBufferCount and
 BufferHitCount are used for get and hit, but heap_blks_read
 and heap_blks_hit are used for read and hit in pg_statio_all_tables.
I see. :(

 Can I rename ReadBufferCount to BufferGetCount? And which values should
 we show in EXPLAIN and pg_stat_statements? (two of the three are enough)
 
Do you want to include number of page access in the stats? If not, we don't
need to rename the variables for now (maybe a separate patch?). And IMHO we
should include hit and read because get is just a simple math.

 I borrowed the terms Buffer Gets and Buffer Reads from STATSPACK report
 in Oracle Database. But I'm willing to rename them if appropriate.
 http://www.oracle.com/apps_benchmark/doc/awrrpt_20090325b_900.html#600
 
Hmm... I don't have a strong opinion about those names as I said. So if you
want to revert the old names...

 Presently Temp Buffers contains temporary sort files, hash files, and
 materialized executor plan. Local buffer statistics for temp tables are
 not included here but merged with shared buffer statistics. Are there
 any better way to group them?
 
Are you sure? Looking at ReadBuffer_common() function I see an 'if
(isLocalBuf)' test.

As I said your patch is in good shape and if we solve those terminologies, it
is done for a committer.

Would you care to submit another patch if you want to do some modifications?


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements (review)

2009-09-28 Thread Tom Lane
Euler Taveira de Oliveira eu...@timbira.com writes:
 Itagaki Takahiro escreveu:
 Thanks. Except choice of words, can I think the basic architecture of
 the patch is ok? The codes to handle global variables like ReadBufferCount
 and GlobalReadBufferCount could be rewrite in cleaner way if we could
 drop supports of log_{parser|planner|executor|statement}_stats.
 
 Yes, it is. I'm afraid someone is relying on that piece of code.

If we have a better substitute, I think there'd be nothing wrong with
removing those features.  They were never anything but pretty crufty
anyway, and they are *not* a compatibility issue because applications
have no direct way to access those stats.  However, you'd have to be
sure that the substitute covers all the use-cases for the old stats
... which strikes me as a lot more territory than this patch has
proposed to cover.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Buffer usage in EXPLAIN and pg_stat_statements

2009-08-17 Thread Itagaki Takahiro
Here is a proposal to add buffer usage information to EXPLAIN and
contrib/pg_stat_statements. We can retrieve new values 'gets',
'reads' and 'temp':

- gets  : total number of buffer pool access
- reads : total number of data file access
- temp  : total number of temp file access (sort)

In EXPLAIN, we can use EXPLAIN (ANALYZE, BUFFER) ... syntax.
Each executor node shows buffer usage only in it; parent nodes
don't contain buffer usages in their sub nodes.

In pg_stat_statements, new 3 columns are added to the view.
We can determine queries that consume I/O bandwidth using
SELECT * FROM pg_stat_statements ORDER BY reads DESC.
We will find out bad queries easily using those buffer and
disk access information in addition to duration statistics.


I implementd this feature using an instrumentation stack. A global
variable CurrentInstrument points top of the stack, and each
Instrumentation are linked with newly added 'prev' field.
The stack must be reset even on error because each innstrumentation
might have been deallocated already. I added codes to reset stack
in main loop of backend for the purpose.

TopInstrument is a special node that sums up all of the child nodes.
It tracks QueryDesc.totaltime and used in pg_stat_statements. There
might be another idea that walking around on planstate tree and
gathering all counters in the module, but very complex codes are
needed. I chose a simple way.


I'll write documentations if this design is accepted.
Comments welcome.


Output samples are below:

[EXPLAIN]
=# EXPLAIN (ANALYZE, BUFFER) SELECT * FROM pgbench_accounts ORDER BY bid;
 QUERY PLAN
--
 Sort  (cost=...) (actual ...) (gets=0 reads=0 temp=1309)
   Sort Key: bid
   Sort Method:  external sort  Disk: 10472kB
   -  Seq Scan on pgbench_accounts  (cost=...) (actual ...) (gets=1798 
reads=1644 temp=0)
 Total runtime: 75.867 ms

[contrib/pg_stat_statements]
=# SELECT query, gets, reads FROM pg_stat_statements ORDER BY gets DESC LIMIT 4;
query | gets  | 
reads
--+---+---
 UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2; | 58628 | 
1
 UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2; | 26999 | 
 1929
 UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2;  | 25474 | 
1
 SELECT abalance FROM pgbench_accounts WHERE aid = $1;| 19950 | 
0
(4 rows)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


buffer_usage-20090817.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Buffer usage in EXPLAIN and pg_stat_statements

2009-08-17 Thread Pavel Stehule
2009/8/17 Itagaki Takahiro itagaki.takah...@oss.ntt.co.jp:
 Here is a proposal to add buffer usage information to EXPLAIN and
 contrib/pg_stat_statements. We can retrieve new values 'gets',
 'reads' and 'temp':

    - gets  : total number of buffer pool access
    - reads : total number of data file access
    - temp  : total number of temp file access (sort)

 In EXPLAIN, we can use EXPLAIN (ANALYZE, BUFFER) ... syntax.
 Each executor node shows buffer usage only in it; parent nodes
 don't contain buffer usages in their sub nodes.

 In pg_stat_statements, new 3 columns are added to the view.
 We can determine queries that consume I/O bandwidth using
 SELECT * FROM pg_stat_statements ORDER BY reads DESC.
 We will find out bad queries easily using those buffer and
 disk access information in addition to duration statistics.


 I implementd this feature using an instrumentation stack. A global
 variable CurrentInstrument points top of the stack, and each
 Instrumentation are linked with newly added 'prev' field.
 The stack must be reset even on error because each innstrumentation
 might have been deallocated already. I added codes to reset stack
 in main loop of backend for the purpose.

 TopInstrument is a special node that sums up all of the child nodes.
 It tracks QueryDesc.totaltime and used in pg_stat_statements. There
 might be another idea that walking around on planstate tree and
 gathering all counters in the module, but very complex codes are
 needed. I chose a simple way.


 I'll write documentations if this design is accepted.
 Comments welcome.


 Output samples are below:

 [EXPLAIN]
 =# EXPLAIN (ANALYZE, BUFFER) SELECT * FROM pgbench_accounts ORDER BY bid;
                                 QUERY PLAN
 --
  Sort  (cost=...) (actual ...) (gets=0 reads=0 temp=1309)
   Sort Key: bid
   Sort Method:  external sort  Disk: 10472kB
   -  Seq Scan on pgbench_accounts  (cost=...) (actual ...) (gets=1798 
 reads=1644 temp=0)
  Total runtime: 75.867 ms

 [contrib/pg_stat_statements]
 =# SELECT query, gets, reads FROM pg_stat_statements ORDER BY gets DESC LIMIT 
 4;
                                query                                 | gets  
 | reads
 --+---+---
  UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2; | 58628 
 |     1
  UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2; | 26999 
 |  1929
  UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2;  | 25474 
 |     1
  SELECT abalance FROM pgbench_accounts WHERE aid = $1;                | 19950 
 |     0
 (4 rows)


It's should be nice

Pavel

 Regards,
 ---
 ITAGAKI Takahiro
 NTT Open Source Software Center


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers