Re: [HACKERS] Planner debug views

2015-08-03 Thread Alvaro Herrera
Qingqing Zhou wrote:
 On Thu, Jul 30, 2015 at 2:42 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 
  I think a better option would be shoving it into a backend tuplestore and
  just leaving it there (maybe with a command to clear it for the paranoid).
  That gives a relation you can query against, insert into another table, etc.
 
 This is something I don't know how to do it: in my understanding, a
 tuplestore is an internal store, which means it has no name exposed.
 Thus we can't reference it later.

Yeah, that doesn't sound the kind of project you should attempt here.
EXPLAIN already knows to return tuples, so I was assuming you would
return your stuff using that mechanism.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-31 Thread Qingqing Zhou
On Thu, Jul 30, 2015 at 2:42 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 I think a better option would be shoving it into a backend tuplestore and
 just leaving it there (maybe with a command to clear it for the paranoid).
 That gives a relation you can query against, insert into another table, etc.
 --

This is something I don't know how to do it: in my understanding, a
tuplestore is an internal store, which means it has no name exposed.
Thus we can't reference it later.

Regards,
Qingqing


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-30 Thread Jim Nasby

On 7/29/15 2:40 PM, Alvaro Herrera wrote:

Qingqing Zhou wrote:


Can we simplify above with foreign table methods? There are two major
concerns about this method per previous discussions: security and
usability. I think the main cause is the sharing foreign table design.


I think foreign data wrappers are great.  I do not think that we should
try to shape every problem to look like foreign data so that we can
solve it with a foreign data wrapper.  I am a bit nervous that this
keeps being brought up.


Agreed.

I think a better option would be shoving it into a backend tuplestore 
and just leaving it there (maybe with a command to clear it for the 
paranoid). That gives a relation you can query against, insert into 
another table, etc.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-29 Thread Qingqing Zhou
On Tue, Jul 28, 2015 at 6:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Another point is that we decided a long time ago that EXPLAIN's plain-text
 output format is not intended to be machine-parsable, and so objecting to
 a design on the grounds that it makes machine parsing harder is pretty
 wrongheaded.  I'd think there is plenty of room for dropping in additional
 output data in the non-text output formats.

I think this will work, for example, I can put several sections of the
JSON output:

{
plan: {
// original EXPLAIN plan tree sits here
},
paths:{
// paths considered sits here
}
// ...
}

But still, it requires an extra step for user: he will needs to
programming to read through output (easier) and persists into a table for
later query.

Can we simplify above with foreign table methods? There are two major
concerns about this method per previous discussions: security and
usability. I think the main cause is the sharing foreign table design. How
about we put foreign table in separate pg_stat_tmp/pid folders, similar
to what Alvaro proposes, and similar to /proc file system. Meanwhile, we
introduce a function to help user create foreign table mapping to these
files. This looks solves the security and usability issues to me:

postgres=# select pg_debug_planner_init();
Foreign table 'pg_planner_rels', 'pg_planner_paths' created.
postgres=# EXPLAIN (debug_planner=on, ...) ...
...
postgres=# select * from pg_planner_paths;
...

Thoughts?

Qngqing


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-29 Thread Alvaro Herrera
Qingqing Zhou wrote:

 Can we simplify above with foreign table methods? There are two major
 concerns about this method per previous discussions: security and
 usability. I think the main cause is the sharing foreign table design.

I think foreign data wrappers are great.  I do not think that we should
try to shape every problem to look like foreign data so that we can
solve it with a foreign data wrapper.  I am a bit nervous that this
keeps being brought up.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-28 Thread Tom Lane
Qingqing Zhou zhouqq.postg...@gmail.com writes:
 There are still something bothering me: EXPLAIN is a mixed output with
 original text, rows for RelOptInfo, rows for Paths and possible others
 added later. So we have to use 't as text' to receive each line. To do the
 insertion, we have to further decompose each text line into fields, and
 then do the insertion - seems quite involved with plpgsql programming.

Well, that's only true if we don't expend some effort to make it better.

I could imagine, for instance, that if you specify the EXPLAIN option that
turns on this additional output, the output is no longer just a single
text column but is multiple columns.  Perhaps one column could be a
key indicating what the other column(s) contain.

Another point is that we decided a long time ago that EXPLAIN's plain-text
output format is not intended to be machine-parsable, and so objecting to
a design on the grounds that it makes machine parsing harder is pretty
wrongheaded.  I'd think there is plenty of room for dropping in additional
output data in the non-text output formats.  That needs some careful
document-schema design effort, for sure, but it doesn't seem like it would
be particularly hard.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-28 Thread Qingqing Zhou
On Tue, Jul 28, 2015 at 2:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 You can do something like that in plpgsql, for example

 declare t text;

 for t in EXPLAIN SELECT ...
 loop
insert into whatever values(t);
 end loop;


I see - this is cool.

There are still something bothering me: EXPLAIN is a mixed output with
original text, rows for RelOptInfo, rows for Paths and possible others
added later. So we have to use 't as text' to receive each line. To do the
insertion, we have to further decompose each text line into fields, and
then do the insertion - seems quite involved with plpgsql programming. So
to simplify user's task, we may end up introduce some function to do this,
like this:

/* EXPLAIN target query and dump records to target tables */
select pg_dump_searchspace('target_table_for_rel',
   'target_table_for_paths', 'select ... /* target query */');

Is this something we want?

Regards,
Qingqing


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-28 Thread Alvaro Herrera
Qingqing Zhou wrote:

 Attached is a draft patch implementing the idea. To play with it, you
 shall create the follow two foreign tables:
 CREATE EXTENSION file_fdw;
 CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;
 create foreign table pg_planner_rels(rel text, content text)server
 pglog options(filename 'your_install/data/debug_planner_relopt.csv',
 format 'csv');
 create foreign table pg_planner_paths(rel text, path text, replacedby
 text, reason int, startupcost float, totalcost float, cheapest text,
 innerp text, outerp text, content text) server pglog options(filename
 'your_install/data/debug_planner_paths.csv', format 'csv');

I think this is a pretty neat idea, but I'm not sure this user interface
is a good one.  Why not have a new option for EXPLAIN, so you would call
EXPLAIN (planner_stuff=on) and it returns this as a resultset?  This
idea of creating random CSV files seems odd and inconvenient in the long
run.  For instance it fails if you have two sessions doing it
simultaneously; you could tack the process ID at the end of the file
name to prevent that problem, but then the foreign table breaks each
time.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-28 Thread Alvaro Herrera
Qingqing Zhou wrote:
 On Mon, Jul 27, 2015 at 8:20 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 
  I think this is a pretty neat idea, but I'm not sure this user interface
  is a good one.  Why not have a new option for EXPLAIN, so you would call
  EXPLAIN (planner_stuff=on) and it returns this as a resultset?
 
 Thank you for the feedback.
 
 Yeah, I agree piggy back on EXPLAIN sounds a better interface. A good
 thing about GUC is that it is global, so deep in planner we can see it.

Um, okay, I gather that GUC is convenient to use for this purpose.  I
don't see it as a good choice; I think a bare separate global variable
at the C level is more appropriate.

  This idea of creating random CSV files seems odd and inconvenient in the
  long run.  For instance it fails if you have two sessions doing it
  simultaneously; you could tack the process ID at the end of the file
  name to prevent that problem, but then the foreign table breaks each
  time.
 
 The reason to use CSV file is a kinda of balance. We do have other
 options, like pass data to pgstat, or persist in some shared memory/heap,
 but they all have their own issues. Any suggestion here?

I would have a tuplestore, and the planner code would push tuples to it.
After the planning is done, EXPLAIN can read and return tuples from the
store to the user.

 The file name is not random, it is fixed so we can create foreign table
 once and use it afterwards - I actually want to push them into
 system_views.sql.

Got that.  That seems fragile and not very convenient; I don't think
forcing retries until no concurrent writers were using the same file is
convenient at all.  When you need this facility the most, which is
during slow planner runs, it is more likely that somebody else will
overwrite your file.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-28 Thread Qingqing Zhou
On Mon, Jul 27, 2015 at 8:20 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:

 I think this is a pretty neat idea, but I'm not sure this user interface
 is a good one.  Why not have a new option for EXPLAIN, so you would call
 EXPLAIN (planner_stuff=on) and it returns this as a resultset?

Thank you for the feedback.

Yeah, I agree piggy back on EXPLAIN sounds a better interface. A good
thing about GUC is that it is global, so deep in planner we can see it.
For example, in add_path(), where we add tracking of discarded paths. If
we do EXPLAIN, we may either have to borrow another global variable, or
add a flag on several planner data structures to make sure the flag can
penetrate deep.

Another thing is that with a GUC, we can mark it internal (PGC_INTERNAL),
which compatibility maintenance might be relaxed, especially for this
advanced option.

 This idea of creating random CSV files seems odd and inconvenient in the
 long run.  For instance it fails if you have two sessions doing it
 simultaneously; you could tack the process ID at the end of the file
 name to prevent that problem, but then the foreign table breaks each
 time.

The reason to use CSV file is a kinda of balance. We do have other
options, like pass data to pgstat, or persist in some shared memory/heap,
but they all have their own issues. Any suggestion here?

The file name is not random, it is fixed so we can create foreign table
once and use it afterwards - I actually want to push them into
system_views.sql. The file is opened with O_APPEND, the same way as log
files, so concurrent writes are serialized. Read could be problematic
though as no atomic guarantee between read/write. This is however a
general issue of file_fdw, as the file is out of control of the core. We
shall expect query returning format errors with concurrent read/write, and
retry shall resolve the issue.

Thanks,
Qingqing


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-28 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Qingqing Zhou wrote:
 The file name is not random, it is fixed so we can create foreign table
 once and use it afterwards - I actually want to push them into
 system_views.sql.

 Got that.  That seems fragile and not very convenient; I don't think
 forcing retries until no concurrent writers were using the same file is
 convenient at all.  When you need this facility the most, which is
 during slow planner runs, it is more likely that somebody else will
 overwrite your file.

FWIW, I would be very much against anything that requires going through
the filesystem for this.  That will create security/privilege issues that
we should not want to introduce, quite aside from the usability problems
Alvaro points out.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-28 Thread Qingqing Zhou
On Tue, Jul 28, 2015 at 12:08 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I would have a tuplestore, and the planner code would push tuples to it.
 After the planning is done, EXPLAIN can read and return tuples from the
 store to the user.


Not sure if I got it: so EXPLAIN will return tuples to libpq client. But
how do we store these returned tuples (RelOptInfo, Path etc) so we can
throw queries against them later?

Something like this:
INSERT INTO my_space SELECT (EXPLAIN SELECT ...); -- won't get parsed

Regards,
Qingqing


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-28 Thread Tom Lane
Qingqing Zhou zhouqq.postg...@gmail.com writes:
 Not sure if I got it: so EXPLAIN will return tuples to libpq client. But
 how do we store these returned tuples (RelOptInfo, Path etc) so we can
 throw queries against them later?

 Something like this:
 INSERT INTO my_space SELECT (EXPLAIN SELECT ...); -- won't get parsed

You can do something like that in plpgsql, for example

declare t text;

for t in EXPLAIN SELECT ...
loop
   insert into whatever values(t);
end loop;

There's an example of doing this sort of thing in the brin.sql regression
test, where it's used to verify that we're getting the plan type we
expect.

I don't feel a strong need to invent additional forms of that wheel.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-27 Thread Qingqing Zhou
On Thu, Jul 23, 2015 at 4:11 PM, Tatsuo Ishii is...@postgresql.org wrote:
 Sounds like a great feature!


Thanks!

Attached is a draft patch implementing the idea. To play with it, you
shall create the follow two foreign tables:
CREATE EXTENSION file_fdw;
CREATE SERVER pglog FOREIGN DATA WRAPPER file_fdw;
create foreign table pg_planner_rels(rel text, content text)server
pglog options(filename 'your_install/data/debug_planner_relopt.csv',
format 'csv');
create foreign table pg_planner_paths(rel text, path text, replacedby
text, reason int, startupcost float, totalcost float, cheapest text,
innerp text, outerp text, content text) server pglog options(filename
'your_install/data/debug_planner_paths.csv', format 'csv');
Example output attached.

Questions:
1. Which document shall we update? This is more than existing
debug_print_ knobs.
2. GEQO is not supported yet. I would suggest we do that with a
separate check in.
3. Where do we want to put the csv files? Currently I just put them under /data.
4. Do we want to push these two foreign tables into system_view.sql?
One problem is that foreign table needs a absolute path. Any way to
handle this?
5. As the output is csv file: I wrap strings with '' but not sure
within the string itself if there any. Do we have any guarantee here?

Thanks,
Qingqing

---

postgres=# select p.rel, p.path, p.replacedby, p.reason,
p.startupcost, p.totalcost, p.cheapest, p.innerp, p.outerp,
substr(p.content, 1,30),r.content from pg_planner_paths p join
pg_planner_rels r on p.rel=r.rel;
rel|   path| replacedby | reason | startupcost | totalcost
|   cheapest   |  innerp   |  outerp   | substr
 |content
---+---+++-+---+--+---+---++
 0x2791a10 | 0x279d4b0 |||   0 |  40.1
| +total+startup+param |   |   | ForeignScan(1)
rows=301 cost=0 | RELOPTINFO (1): rows=301 width=244
 0x279f998 | 0x27a2238 |||   0 |   1.1
| +total+startup+param |   |   | ForeignScan(1) rows=1
cost=0.0 | RELOPTINFO (1): rows=1 width=244
 0x279fbd0 | 0x27a28b8 |||   0 |   1.1
| +total+startup+param |   |   | ForeignScan(2) rows=1
cost=0.0 | RELOPTINFO (2): rows=1 width=64
 0x27a2ab0 | 0x27a3c68 |||   0 |  2.21
| +total+startup+param | 0x27a28b8 | 0x27a2238 | NestLoop(1 2) rows=1
cost=0.00 | RELOPTINFO (1 2): rows=1 width=308
 0x27a2ab0 | 0x27a4608 | 0x27a4608  |  2 |1.11 |  2.23
|  | 0x27a2238 | 0x27a28b8 | HashJoin(1 2) rows=1
cost=1.11 | RELOPTINFO (1 2): rows=1 width=308
 0x27a2ab0 | 0x27a4498 | 0x27a4498  |  0 |   0 |  2.22
|  | 0x27a4330 | 0x27a28b8 | NestLoop(1 2) rows=1
cost=0.00 | RELOPTINFO (1 2): rows=1 width=308
 0x27a2ab0 | 0x27a4388 | 0x27a4388  |  0 |   0 |  2.21
|  | 0x27a2238 | 0x27a28b8 | NestLoop(1 2) rows=1
cost=0.00 | RELOPTINFO (1 2): rows=1 width=308
 0x27a2ab0 | 0x27a4220 | 0x27a4220  |  2 |2.22 |  2.25
|  | 0x27a2238 | 0x27a28b8 | MergeJoin(1 2) rows=1
cost=2.2 | RELOPTINFO (1 2): rows=1 width=308
 0x27a2ab0 | 0x27a3f90 | 0x27a3f90  |  2 |1.11 |  2.23
|  | 0x27a28b8 | 0x27a2238 | HashJoin(1 2) rows=1
cost=1.11 | RELOPTINFO (1 2): rows=1 width=308
 0x27a2ab0 | 0x27a3e20 | 0x27a3e20  |  0 |   0 |  2.22
|  | 0x27a3c10 | 0x27a2238 | NestLoop(1 2) rows=1
cost=0.00 | RELOPTINFO (1 2): rows=1 width=308
 0x27a2ab0 | 0x27a3b18 | 0x27a3c68  |  1 |2.22 |  2.25
|  | 0x27a28b8 | 0x27a2238 | MergeJoin(1 2) rows=1
cost=2.2 | RELOPTINFO (1 2): rows=1 width=308


0002-local-change.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-23 Thread Tatsuo Ishii
 Here is a proposal introducing some debugging infrastructure into the
 core. The basic idea is to allow us to query the planner search space.
 To so do, we can dump related information to csv files and use foreign
 table to query them. So here I propose two foreign tables:

Sounds like a great feature!

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planner debug views

2015-07-23 Thread Qingqing Zhou
On Tue, Jul 21, 2015 at 5:15 PM, Qingqing Zhou
zhouqq.postg...@gmail.com wrote:
 Here is a proposal introducing some debugging infrastructure into the
 core. The basic idea is to allow us to query the planner search space.
 To so do, we can dump related information to csv files and use foreign
 table to query them.

FWIW, this is to tackle(+) optimizer TODO item:

https://wiki.postgresql.org/wiki/Todo#Optimizer_.2F_Executor
Improve ability to display optimizer analysis using OPTIMIZER_DEBUG

Regards,
Qingqing


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers