[HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Pavel Stehule
Hello

last year we are spoke about reusing pretty print view code for some queries.

Here is patch:

this patch is really short - it is nice. But - it works only with
known database objects (probably we would it) and it doesn't format
subqueries well


postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
z  where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
z.a)', true, false);
 pg_pretty_query
--
  SELECT x.a, z.a+
FROM foo, foo x, x z +
   WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
FROM foo +
   WHERE foo.a = z.a))
(1 row)

Regards

Pavel


pg_pretty_query.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Thom Brown
On 7 August 2012 15:14, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 last year we are spoke about reusing pretty print view code for some queries.

 Here is patch:

 this patch is really short - it is nice. But - it works only with
 known database objects (probably we would it) and it doesn't format
 subqueries well


 postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
 z  where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
 z.a)', true, false);
  pg_pretty_query
 --
   SELECT x.a, z.a+
 FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
 FROM foo +
WHERE foo.a = z.a))
 (1 row)

This looks odd:

postgres=# SELECT pg_pretty_query('SELECT 1, (SELECT max(a.x) +
greatest(2,3) FROM generate_series(4,10,2) a(x)) FROM
generate_series(1,100) GROUP BY 1 ORDER BY 1, 2 USING  NULLS FIRST',
true, false);
 pg_pretty_query
--
  SELECT 1,  +
 ( SELECT max(a.x) + GREATEST(2, 3)  +
FROM generate_series(4, 10, 2) a(x)) +
FROM generate_series(1, 100) generate_series(generate_series)+
   GROUP BY 1::integer   +
   ORDER BY 1::integer, ( SELECT max(a.x) + GREATEST(2, 3)   +
FROM generate_series(4, 10, 2) a(x)) NULLS FIRST
(1 row)

USING  is removed completely (or if I used DESC, NULLS FIRST is then
removed instead), 2 in the order by is expanded to its full query,
and generate_series when used in FROM is repeated with its own name as
a parameter.  I'm also not sure about the spacing before each line.
SELECT, FROM and GROUP BY all appear out of alignment from one
another.

Plus it would be nice if we could support something like the following style:

SELECT
field_one,
field_two + field_three
FROM
my_table
INNER JOIN
another_table
ON
my_table.field_one = another_table.another_field
AND
another_table.valid = true
WHERE
field_one  3
AND
field_two  10;

But that's just a nice-to-have.
-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Pavel Stehule
2012/8/7 Thom Brown t...@linux.com:
 On 7 August 2012 15:14, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 last year we are spoke about reusing pretty print view code for some queries.

 Here is patch:

 this patch is really short - it is nice. But - it works only with
 known database objects (probably we would it) and it doesn't format
 subqueries well


 postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
 z  where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
 z.a)', true, false);
  pg_pretty_query
 --
   SELECT x.a, z.a+
 FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
 FROM foo +
WHERE foo.a = z.a))
 (1 row)

 This looks odd:

 postgres=# SELECT pg_pretty_query('SELECT 1, (SELECT max(a.x) +
 greatest(2,3) FROM generate_series(4,10,2) a(x)) FROM
 generate_series(1,100) GROUP BY 1 ORDER BY 1, 2 USING  NULLS FIRST',
 true, false);
  pg_pretty_query
 --
   SELECT 1,  +
  ( SELECT max(a.x) + GREATEST(2, 3)  +
 FROM generate_series(4, 10, 2) a(x)) +
 FROM generate_series(1, 100) generate_series(generate_series)+
GROUP BY 1::integer   +
ORDER BY 1::integer, ( SELECT max(a.x) + GREATEST(2, 3)   +
 FROM generate_series(4, 10, 2) a(x)) NULLS FIRST
 (1 row)

 USING  is removed completely (or if I used DESC, NULLS FIRST is then
 removed instead), 2 in the order by is expanded to its full query,
 and generate_series when used in FROM is repeated with its own name as
 a parameter.  I'm also not sure about the spacing before each line.
 SELECT, FROM and GROUP BY all appear out of alignment from one
 another.

it is issue - probably we can start deserialization just from parser
stage, not from rewriter stage - but then code will be significantly
longer and we cannot reuse current code for pretty print view.


 Plus it would be nice if we could support something like the following style:

 SELECT
 field_one,
 field_two + field_three
 FROM
 my_table
 INNER JOIN
 another_table
 ON
 my_table.field_one = another_table.another_field
 AND
 another_table.valid = true
 WHERE
 field_one  3
 AND
 field_two  10;


it is second issue - probably there are more lovely styles - CELKO,
your and other. I am not sure if we can support more styles in core
(contrib should be better maybe).

Regards

Pavel

 But that's just a nice-to-have.
 --
 Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Bruce Momjian
On Tue, Aug  7, 2012 at 04:14:34PM +0200, Pavel Stehule wrote:
 Hello
 
 last year we are spoke about reusing pretty print view code for some queries.
 
 Here is patch:
 
 this patch is really short - it is nice. But - it works only with
 known database objects (probably we would it) and it doesn't format
 subqueries well
 
 
 postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
 z  where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
 z.a)', true, false);
  pg_pretty_query
 --
   SELECT x.a, z.a+
 FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
 FROM foo +
WHERE foo.a = z.a))
 (1 row)

I can see this as very useful for people reporting badly-formatted
queries to our email lists.  Great!

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Andrew Dunstan


On 08/07/2012 10:14 AM, Pavel Stehule wrote:

Hello

last year we are spoke about reusing pretty print view code for some queries.

Here is patch:

this patch is really short - it is nice. But - it works only with
known database objects (probably we would it) and it doesn't format
subqueries well


postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
z  where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
z.a)', true, false);
  pg_pretty_query
--
   SELECT x.a, z.a+
 FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
 FROM foo +
WHERE foo.a = z.a))
(1 row)


Good stuff. That's one less item on my TODO list :-)

I think we should have a version that lets you specify the wrap column, 
like pg_get_viewdef does. Possibly for this case we should even default 
it to 0 (wrap after each item) instead of 79 which it is for views.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread David Fetter
On Tue, Aug 07, 2012 at 04:54:12PM +0200, Pavel Stehule wrote:
 2012/8/7 Thom Brown t...@linux.com:
  On 7 August 2012 15:14, Pavel Stehule pavel.steh...@gmail.com wrote:
  Hello
 
  last year we are spoke about reusing pretty print view code for some 
  queries.
 
  Here is patch:
 
  this patch is really short - it is nice. But - it works only with
  known database objects (probably we would it) and it doesn't format
  subqueries well
 
 
  postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
  z  where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
  z.a)', true, false);
   pg_pretty_query
  --
SELECT x.a, z.a+
  FROM foo, foo x, x z +
 WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
  FROM foo +
 WHERE foo.a = z.a))
  (1 row)
 
  This looks odd:
 
  postgres=# SELECT pg_pretty_query('SELECT 1, (SELECT max(a.x) +
  greatest(2,3) FROM generate_series(4,10,2) a(x)) FROM
  generate_series(1,100) GROUP BY 1 ORDER BY 1, 2 USING  NULLS FIRST',
  true, false);
   pg_pretty_query
  --
SELECT 1,  +
   ( SELECT max(a.x) + GREATEST(2, 3)  +
  FROM generate_series(4, 10, 2) a(x)) +
  FROM generate_series(1, 100) generate_series(generate_series)+
 GROUP BY 1::integer   +
 ORDER BY 1::integer, ( SELECT max(a.x) + GREATEST(2, 3)   +
  FROM generate_series(4, 10, 2) a(x)) NULLS FIRST
  (1 row)
 
  USING  is removed completely (or if I used DESC, NULLS FIRST is then
  removed instead), 2 in the order by is expanded to its full query,
  and generate_series when used in FROM is repeated with its own name as
  a parameter.  I'm also not sure about the spacing before each line.
  SELECT, FROM and GROUP BY all appear out of alignment from one
  another.
 
 it is issue - probably we can start deserialization just from parser
 stage, not from rewriter stage - but then code will be significantly
 longer and we cannot reuse current code for pretty print view.
 
 
  Plus it would be nice if we could support something like the following 
  style:
 
  SELECT
  field_one,
  field_two + field_three
  FROM
  my_table
  INNER JOIN
  another_table
  ON
  my_table.field_one = another_table.another_field
  AND
  another_table.valid = true
  WHERE
  field_one  3
  AND
  field_two  10;
 
 
 it is second issue - probably there are more lovely styles - CELKO,
 your and other. I am not sure if we can support more styles in core
 (contrib should be better maybe).

Would it be better to have output plugins and not privilege one?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Tue, Aug  7, 2012 at 04:14:34PM +0200, Pavel Stehule wrote:
 last year we are spoke about reusing pretty print view code for some queries.
 
 Here is patch:

 I can see this as very useful for people reporting badly-formatted
 queries to our email lists.  Great!

Allow me to express a contrary opinion: I think this is a bad idea.

* First off, packaging it as a SQL function that takes and returns text
seems rather awkward to use.  A lot of places where you might wish for
a SQL pretty-printer aren't going to have a database connection handy
(think emacs SQL mode).

* The functionality provided is not merely a pretty-printer but sort
of a query validator as well: the function will fail if the query refers
to any tables, columns, functions, etc you don't have in your database.
For some applications that's fine, but others not so much --- in
particular I suspect it's nigh useless for the use-case you mention of
quickly turning an emailed query into something legible.  And there's
no way to separate out the reformatting functionality from that.

* As per some of the complaints already registered in this thread,
ruleutils.c is not designed with the goal of being a pretty-printer.
Its primary charter is to support pg_dump by regurgitating rules/views
in an unambiguous form, which does not necessarily look very similar to
what was entered.  An example of a transformation that probably nobody
would want in a typical pretty-printing context is expansion of
SELECT * lists.  But again, there is really no way to turn that off.
Another aspect that seems pretty undesirable for pretty-printing is
loss of any comments embedded in the query text.

I'm very much not in favor of trying to make ruleutils serve two
masters, but that's the game we will be playing if we accept this patch.

In short, the only redeeming value of this patch is that it's short.
The functionality it provides is not something that anyone would come
up with in a green-field design for a pretty-printer, and if we take
it we are going to be faced with a whole lot of redesign requests that
will be painful to implement and will carry heavy risks of breaking
pg_dump and/or EXPLAIN.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Andrew Dunstan


On 08/07/2012 02:14 PM, Tom Lane wrote:


* As per some of the complaints already registered in this thread,
ruleutils.c is not designed with the goal of being a pretty-printer.
Its primary charter is to support pg_dump by regurgitating rules/views
in an unambiguous form, which does not necessarily look very similar to
what was entered.  An example of a transformation that probably nobody
would want in a typical pretty-printing context is expansion of
SELECT * lists.  But again, there is really no way to turn that off.
Another aspect that seems pretty undesirable for pretty-printing is
loss of any comments embedded in the query text.

I'm very much not in favor of trying to make ruleutils serve two
masters, but that's the game we will be playing if we accept this patch.


I think this horse has probably bolted. If you wanted to segregate off 
this functionality we shouldn't have used things like pg_get_viewdef in 
psql, ISTM.





In short, the only redeeming value of this patch is that it's short.
The functionality it provides is not something that anyone would come
up with in a green-field design for a pretty-printer, and if we take
it we are going to be faced with a whole lot of redesign requests that
will be painful to implement and will carry heavy risks of breaking
pg_dump and/or EXPLAIN.




One of the challenges is to have a pretty printer that is kept in sync 
with the dialect that's supported. Anything that doesn't use the 
backend's parser seems to me to be guaranteed to get out of sync very 
quickly.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 08/07/2012 02:14 PM, Tom Lane wrote:
 In short, the only redeeming value of this patch is that it's short.

 One of the challenges is to have a pretty printer that is kept in sync 
 with the dialect that's supported. Anything that doesn't use the 
 backend's parser seems to me to be guaranteed to get out of sync very 
 quickly.

Sure.  I think if we wanted an actually engineered solution, rather than
a half-baked one, ecpg provides a good source of inspiration.  One could
imagine a standalone program that reads a query on stdin and emits a
pretty-printed version to stdout, using a parser that is automatically
generated from the backend's grammar with much the same technology used
in recent ecpg releases.  I think that would address most of the
complaints I raised: it would be relatively painless to make use of from
contexts that don't have a live database connection, it wouldn't impose
any constraints related to having suitable database content available,
it wouldn't apply any of the multitude of implementation-dependent
transformations that the backend's parser does, and it could be built
(I think) to do something more with comments than just throw them away.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Peter Geoghegan
On 7 August 2012 20:01, Andrew Dunstan and...@dunslane.net wrote:
 On 08/07/2012 02:14 PM, Tom Lane wrote:
 * As per some of the complaints already registered in this thread,
 ruleutils.c is not designed with the goal of being a pretty-printer.
 Its primary charter is to support pg_dump by regurgitating rules/views
 in an unambiguous form, which does not necessarily look very similar to
 what was entered.  An example of a transformation that probably nobody
 would want in a typical pretty-printing context is expansion of
 SELECT * lists.  But again, there is really no way to turn that off.
 Another aspect that seems pretty undesirable for pretty-printing is
 loss of any comments embedded in the query text.

 I'm very much not in favor of trying to make ruleutils serve two
 masters, but that's the game we will be playing if we accept this patch.

+1. A pretty-printer that makes the query to be cleaned-up actually
undergo parse-analysis seems misconceived to me. This is a tool that
begs to be written in something like Python, as a satellite project,
with much greater flexibility in the format of the SQL that it
outputs.

 One of the challenges is to have a pretty printer that is kept in sync with
 the dialect that's supported. Anything that doesn't use the backend's parser
 seems to me to be guaranteed to get out of sync very quickly.

I'm not convinced of that. Consider the example of cscope, a popular
tool for browsing C code. Its parser was written to be fuzzy, so
it's actually perfectly usable for C++ and Java, even though that
isn't actually supported, IIRC. Now, I'll grant you that that isn't a
perfectly analogous situation, but it is similar in some ways. If we
add a new clause to select and that bit is generically pretty-printed,
is that really so bad? I have a hard time believing that a well
written fuzzy pretty-printer for Postgres wouldn't deliver the vast
majority of the benefits of an in-core approach, at a small fraction
of the effort. You'd also be able to pretty-print plpgsql function
definitions (a particularly compelling case for this kind of tool),
which any approach based on the backends grammar will never be able to
do (except, perhaps, if you were to do something even more
complicated).

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 On 7 August 2012 20:01, Andrew Dunstan and...@dunslane.net wrote:
 One of the challenges is to have a pretty printer that is kept in sync with
 the dialect that's supported. Anything that doesn't use the backend's parser
 seems to me to be guaranteed to get out of sync very quickly.

 I'm not convinced of that. Consider the example of cscope, a popular
 tool for browsing C code. Its parser was written to be fuzzy, so
 it's actually perfectly usable for C++ and Java, even though that
 isn't actually supported, IIRC. Now, I'll grant you that that isn't a
 perfectly analogous situation, but it is similar in some ways.

Yeah.  A related question here is whether you want a pretty printer that
is entirely unforgiving of (what it thinks are) syntax errors in the
input.  It might be a lot more useful if it didn't spit up on that, but
just did the best it could.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Pavel Stehule
2012/8/7 Tom Lane t...@sss.pgh.pa.us:
 Andrew Dunstan and...@dunslane.net writes:
 On 08/07/2012 02:14 PM, Tom Lane wrote:
 In short, the only redeeming value of this patch is that it's short.

 One of the challenges is to have a pretty printer that is kept in sync
 with the dialect that's supported. Anything that doesn't use the
 backend's parser seems to me to be guaranteed to get out of sync very
 quickly.

 Sure.  I think if we wanted an actually engineered solution, rather than
 a half-baked one, ecpg provides a good source of inspiration.  One could
 imagine a standalone program that reads a query on stdin and emits a
 pretty-printed version to stdout, using a parser that is automatically
 generated from the backend's grammar with much the same technology used
 in recent ecpg releases.  I think that would address most of the
 complaints I raised: it would be relatively painless to make use of from
 contexts that don't have a live database connection, it wouldn't impose
 any constraints related to having suitable database content available,
 it wouldn't apply any of the multitude of implementation-dependent
 transformations that the backend's parser does, and it could be built
 (I think) to do something more with comments than just throw them away.

+1

it is better, and it is allow more space for possible styling

Pavel



 regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: pg_pretty_query

2012-08-07 Thread Pavel Stehule
2012/8/7 Peter Geoghegan pe...@2ndquadrant.com:
 On 7 August 2012 20:01, Andrew Dunstan and...@dunslane.net wrote:
 On 08/07/2012 02:14 PM, Tom Lane wrote:
 * As per some of the complaints already registered in this thread,
 ruleutils.c is not designed with the goal of being a pretty-printer.
 Its primary charter is to support pg_dump by regurgitating rules/views
 in an unambiguous form, which does not necessarily look very similar to
 what was entered.  An example of a transformation that probably nobody
 would want in a typical pretty-printing context is expansion of
 SELECT * lists.  But again, there is really no way to turn that off.
 Another aspect that seems pretty undesirable for pretty-printing is
 loss of any comments embedded in the query text.

 I'm very much not in favor of trying to make ruleutils serve two
 masters, but that's the game we will be playing if we accept this patch.

 +1. A pretty-printer that makes the query to be cleaned-up actually
 undergo parse-analysis seems misconceived to me. This is a tool that
 begs to be written in something like Python, as a satellite project,
 with much greater flexibility in the format of the SQL that it
 outputs.

 One of the challenges is to have a pretty printer that is kept in sync with
 the dialect that's supported. Anything that doesn't use the backend's parser
 seems to me to be guaranteed to get out of sync very quickly.

 I'm not convinced of that. Consider the example of cscope, a popular
 tool for browsing C code. Its parser was written to be fuzzy, so
 it's actually perfectly usable for C++ and Java, even though that
 isn't actually supported, IIRC. Now, I'll grant you that that isn't a
 perfectly analogous situation, but it is similar in some ways. If we
 add a new clause to select and that bit is generically pretty-printed,
 is that really so bad? I have a hard time believing that a well
 written fuzzy pretty-printer for Postgres wouldn't deliver the vast
 majority of the benefits of an in-core approach, at a small fraction
 of the effort. You'd also be able to pretty-print plpgsql function
 definitions (a particularly compelling case for this kind of tool),
 which any approach based on the backends grammar will never be able to
 do (except, perhaps, if you were to do something even more
 complicated).

I disagree - simply pretty printer based on technique that we know
from ecpg can be very cheep and it cannot be fuzzy because it
integrate PostgreSQL parser.

Pavel


 --
 Peter Geoghegan   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers