Re: [HACKERS] Plan targetlists in EXPLAIN output

2008-06-19 Thread Tom Raney
I have been working on a project (for GSOC) to retrieve 
planner/optimizer details.  As part of the project, I need machine 
parsable output.  So, I thought I would dust off a patch I found from 
last year that Germán Caamaño submitted.  I didn't see any further 
activity there so I integrated it into 8.4 and added a DTD.


The output below is generated by using the added flag 'XML' to the 
EXPLAIN command.  The DTD probably wouldn't be needed for every output 
instance and may need its own flag.


I am coming up to speed on the planner internals, but it seems like this 
first EXPLAIN XML concept may have some use.  Are there any strong 
opinions about the XML hierarchy?  Is it enough to simply wrap the text 
output from EXPLAIN with XML tags?


-Tom Raney



QUERY PLAN
---
?xml version=1.0?

!DOCTYPE explain
[
!ELEMENT explain (plan+) 
!ELEMENT plan (table?, cost, qualifier?) 
!ELEMENT table EMPTY 
!ELEMENT cost EMPTY 
!ELEMENT qualifier EMPTY 
!ATTLIST explain
   version CDATA  #REQUIRED 
!ATTLIST plan
   name CDATA #REQUIRED
   level CDATA#REQUIRED 
!ATTLIST cost
   startup CDATA  #REQUIRED
   total CDATA#REQUIRED
   rows CDATA #REQUIRED
   width CDATA#REQUIRED 
!ATTLIST table
   name CDATA #REQUIRED 
!ATTLIST qualifier
   type CDATA #REQUIRED
   value CDATA #REQUIRED 
]

explain version=8.4devel
plan name=Seq Scan level=0
  table name=tenk1/
  cost startup=0.00 total=445.00 rows=1 width=244 /
/plan
/explain
(32 rows)



Greg Smith wrote:

On Thu, 17 Apr 2008, Tom Lane wrote:


For debugging the planner work I'm about to do, I'm expecting it will be
useful to be able to get EXPLAIN to print the targetlist of each plan
node, not just the quals (conditions) as it's historically done.


I've heard that some of the academic users of PostgreSQL were hoping 
to add features in this area in order to allow better using planner 
internals for educational purposes.  It would be nice if that were 
available for such purposes without having to recompile.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Plan targetlists in EXPLAIN output

2008-04-17 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 For debugging the planner work I'm about to do, I'm expecting it will be
 useful to be able to get EXPLAIN to print the targetlist of each plan
 node, not just the quals (conditions) as it's historically done.  My
 first instinct is just to stick in the code under a debugging #ifdef,
 but I wonder if anyone wants to argue for making it more easily
 available?

Yes please.

 I think it'd be a mistake to turn it on by default, because it'd add a
 line for every plan node, which'd be an awful lot of bloat in output
 that's hard enough to read already.  And experience has shown that
 99.99% of the time people don't need the info.  Still, there's that
 other 0.01%.

 I'm tempted to propose redefining the currently-nearly-useless
 EXPLAIN VERBOSE option as doing this.

EXPLAIN VERBOSE is indeed ridiculous. The only downside is that people
following modern instructions on old installs will be sad. But I'm fine with
that.

IMHO You could also move width to VERBOSE while you're at it. In fact you'll
probably want width in precisely the same cases where you want the target
list.

I think down the road we'll have a few different independent data sets you can
get out of explain or at least explain analyze. I want to get i/o stats in
there which I think you'll want to turn on and off as a group, for example.
But perhaps by the time we do that someone will have done XML explain and
it'll be irrelevant. I can't think of any nice syntax to do that offhand
anyways. So +1 for just redefining VERBOSE.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Plan targetlists in EXPLAIN output

2008-04-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 EXPLAIN VERBOSE is indeed ridiculous.

There are other ways to get that printout, too, if you really do need
it.

 IMHO You could also move width to VERBOSE while you're at it. In fact you'll
 probably want width in precisely the same cases where you want the target
 list.

I'm not convinced.  The width is often useful to understand why the
planner did something (eg, chose a hash plan or not).  The exact
contents of the targetlist are usually not nearly as interesting.

 So +1 for just redefining VERBOSE.

Barring other objections I'll go do that.


BTW, while testing the code I already found a bug:

regression=# set enable_hashagg to 0;
SET
regression=# explain select thousand from tenk1 group by 1;
 QUERY 
PLAN  
-
 Group  (cost=1122.39..1172.39 rows=998 width=4)
   Output: thousand
   -  Sort  (cost=1122.39..1147.39 rows=1 width=4)
 Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, 
twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4
 Sort Key: thousand
 -  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=4)
   Output: unique1, unique2, two, four, ten, twenty, hundred, 
thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, 
string4
(7 rows)

Only the thousand column is needed, so why is it emitting all columns?
It's evidently allowing the use physical tlist optimization to fire,
which saves cycles inside the SeqScan node --- but in this context
that's penny-wise and pound-foolish, because we're pumping useless data
through the Sort.  There is code in the planner that's supposed to
notice the needs of the next level up, but it's not getting this case
right for some reason...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Plan targetlists in EXPLAIN output

2008-04-17 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 I'm not convinced.  The width is often useful to understand why the
 planner did something (eg, chose a hash plan or not).  The exact
 contents of the targetlist are usually not nearly as interesting.

I've never seen a single post on any of the lists where anyone went through
that exercise though.

  -  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=4)
Output: unique1, unique2, two, four, ten, twenty, hundred, 
 thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, 
 string4

I wonder if I even understand what width means. Or does the planner think most
of these columns are mostly null?

Or is it estimating the width based on the belief that only the thousand
column is actually going to be emitted?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Plan targetlists in EXPLAIN output

2008-04-17 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Or is it estimating the width based on the belief that only the thousand
 column is actually going to be emitted?

Right.  The width is used to estimate how much space would be needed
for, eg, sorting or hashing the plan node's output.  In any case where
something like that is actually happening, we *should* be emitting only
the required columns, so I didn't see any particular need to make
use_physical_tlist change the reported width.  OTOH this bug shows that
maybe that was hiding useful information ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Plan targetlists in EXPLAIN output

2008-04-17 Thread Simon Riggs
On Thu, 2008-04-17 at 12:34 -0400, Tom Lane wrote:

 I'm tempted to propose redefining the currently-nearly-useless
 EXPLAIN VERBOSE option as doing this.

Yes please.

Sounds like a good home for other useful things also.

I'd like to have an EXPLAIN mode that displayed the plan without *any*
changeable info (i.e. no costs, row counts etc). This would then allow
more easy determination of whether plans had changed over time. (But
EXPLAIN TERSE sounds silly).

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Plan targetlists in EXPLAIN output

2008-04-17 Thread Greg Smith

On Thu, 17 Apr 2008, Tom Lane wrote:


For debugging the planner work I'm about to do, I'm expecting it will be
useful to be able to get EXPLAIN to print the targetlist of each plan
node, not just the quals (conditions) as it's historically done.


I've heard that some of the academic users of PostgreSQL were hoping to 
add features in this area in order to allow better using planner internals 
for educational purposes.  It would be nice if that were available for 
such purposes without having to recompile.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Plan targetlists in EXPLAIN output

2008-04-17 Thread PFC
On Thu, 17 Apr 2008 20:42:49 +0200, Simon Riggs [EMAIL PROTECTED]  
wrote:



On Thu, 2008-04-17 at 12:34 -0400, Tom Lane wrote:


I'm tempted to propose redefining the currently-nearly-useless
EXPLAIN VERBOSE option as doing this.


Yes please.

Sounds like a good home for other useful things also.

I'd like to have an EXPLAIN mode that displayed the plan without *any*
changeable info (i.e. no costs, row counts etc). This would then allow
more easy determination of whether plans had changed over time. (But
EXPLAIN TERSE sounds silly).



Plan = Tree
Tree = XML

EXPLAIN ANALYZE SELECT * FROM test NATURAL JOIN test2 WHERE id  
=ANY('{3,666,975,521'});

QUERY PLAN
---
 Nested Loop  (cost=17.04..65.13 rows=1 width=8) (actual  
time=51.835..51.835 rows=0 loops=1)

   Join Filter: (test.value = test2.value)
   -  Bitmap Heap Scan on test  (cost=17.04..31.96 rows=4 width=8)  
(actual time=16.622..16.631 rows=4 loops=1)

 Recheck Cond: (id = ANY ('{3,666,975,521}'::integer[]))
 -  Bitmap Index Scan on test_pkey  (cost=0.00..17.04 rows=4  
width=0) (actual time=16.613..16.613 rows=4 loops=1)

   Index Cond: (id = ANY ('{3,666,975,521}'::integer[]))
   -  Index Scan using test2_pkey on test2  (cost=0.00..8.28 rows=1  
width=8) (actual time=8.794..8.795 rows=1 loops=4)

 Index Cond: (test2.id = test.id)

EXPLAIN XML ...

NestedLoop
Join Filter=(test.value = test2.value)
BitmapHeapScan Target=test RecheckCond=(id) = ANY ($1) /
BitmapIndexScan Index=test_pkey Cond=id = ANY 
('$1'::integer[])) /
/Join
IndexScan Index=test2_pkey Target=test2 Cond=test2.id = test.id 
/
/NestedLoop

Nicely parsable and displayable in all its glory in pgadmin ;)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Plan targetlists in EXPLAIN output

2008-04-17 Thread Alvaro Herrera
PFC wrote:

   Plan = Tree
   Tree = XML

If you want to propose a DTD I'm sure there would be many people
interested.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers