Re: [GENERAL] question on parsing postgres sql queries

2016-07-27 Thread Alvaro Herrera
Kevin Grittner wrote:

> On the other hand, try connecting to a database with
> psql and typing:
> 
> \h create index
> 
> ... (or any other command name).  The help you get there is fished
> out of the docs.

BTW I noticed a few days ago that we don't have a "where BLAH can be one
of" section for the window_definition replaceable term in the help for
SELECT.  We omit these sections for trivial clauses, but I think WINDOW
is elaborate enough that it should have one.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] question on parsing postgres sql queries

2016-07-27 Thread Kevin Grittner
On Tue, Jul 26, 2016 at 4:20 AM, Jerome Wagner wrote:

> I am doing some research on postgres sql query parsing.

> I was wondering what people think of the conformance with regards to the
> real parser of the documentation on
>  - https://www.postgresql.org/docs/current/static/sql-select.html
>  - https://www.postgresql.org/docs/current/static/sql-copy.html
> ... and more generally the SGML files in
> https://github.com/postgres/postgres/tree/master/doc/src/sgml/ref
>
> Would it make sense to use these SGML synopses as some kind of source of
> truth, parse them, and automatically generate a parser for a specific
> language?

It might be interesting to do as an academic exercise or to audit
the accuracy of the synopses, but I don't think it's practical for
generating production-quality parsers -- at least in the short
term.  Besides issues mentioned by others (e.g., parser support for
legacy syntax we don't want to document or encourage), we sometimes
allow things through the parser so that we can code more
user-friendly messages off of the parse tree than a generated
parser would provide.

I also don't remember seeing anyone mention the problems with
forward references and metadata from system catalogs.  These either
need to be handled by a "rewind and try again" approach or (better
IMO) an additional pass or two walking the parse tree to emit a
version where generic "place-holders" are replaced by something
more specific.  See the "parse analysis" and "rewrite" steps in
PostgreSQL for how that is currently handled.  Before working in
the PostgreSQL source I had helped develop a SQL parser in ANTLR,
where the same basic parser generator is used for the lexer, parser,
and tree-walker phases (using pretty much the same grammar
specification for all of them), just taking characters, tokens, or
parse tree nodes as input.  Automatic generation of the "main" parser
might be feasible in such an environment (possibly with some sort
of annotations or a hand-written light initial parsing phase), but I
think the later tree walkers would need to be hand-coded.
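
To make the "additional pass" idea concrete, here is a minimal sketch in Python. It is not PostgreSQL's parse analysis code. The node classes, the CATALOG dictionary, and the analyze function are all hypothetical names chosen for illustration. The raw parser is assumed to emit generic ColumnRef place-holders, and a later tree walk replaces each one with a more specific node using catalog metadata.

```python
from dataclasses import dataclass

# Hypothetical stand-in for system catalog metadata: table -> column -> type.
CATALOG = {"orders": {"id": "integer", "total": "numeric"}}


@dataclass
class ColumnRef:
    """Generic place-holder from the raw parser: names only, no type."""
    table: str
    column: str


@dataclass
class TypedColumn:
    """More specific node produced by the later analysis pass."""
    table: str
    column: str
    type_name: str


def analyze(node, catalog):
    """Walk the tree and replace each ColumnRef with a TypedColumn."""
    if isinstance(node, ColumnRef):
        try:
            type_name = catalog[node.table][node.column]
        except KeyError:
            raise LookupError(
                f'column "{node.table}.{node.column}" does not exist'
            ) from None
        return TypedColumn(node.table, node.column, type_name)
    if isinstance(node, list):
        return [analyze(child, catalog) for child in node]
    return node  # literals and other leaves pass through unchanged


raw_tree = [ColumnRef("orders", "id"), ColumnRef("orders", "total"), 42]
analyzed = analyze(raw_tree, CATALOG)
```

Because the catalog lookup happens in a separate walk, the first pass can accept the syntax without knowing whether the referenced objects exist, and the user-facing error is raised from the tree rather than from the grammar.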

> I feel like the conformance level of the documentation is high and that the
> SGML synopses seem to be nearly programmatically sufficient to create
> parsers.
>
> What do you think?

Nearly.

> Could the parser committers shed some light on how the documentation
> process interacts with the parser commits?

There is no automated interaction there -- it depends on human
attention.  On the other hand, try connecting to a database with
psql and typing:

\h create index

... (or any other command name).  The help you get there is fished
out of the docs.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] question on parsing postgres sql queries

2016-07-27 Thread Jerome Wagner
> What problem are you trying to solve here?  To wit, not everything that
> can be parsed is documented - usually intentionally.

I am trying to see whether we could use the documentation as a kind of
formal specification of the language, but I understand that the devil is
in the details and that even formal specifications can lead to incompatible
implementations.

I would have liked the project's clean documentation to be usable as a
meta-grammar sufficient to generate the grammar, but I will have to dig
further into the Bison grammar files.

The project I mentioned, which isolates the parser from the PostgreSQL
binary as a re-usable library, is probably the closest you can currently
get to a parser matching the real engine.

Otherwise, yes, parsing the synopses could provide a sanity check that
the documentation is in line with the grammar. This could lead to
warnings or help uncover unexpected corner cases not mentioned in
the documentation.

Thanks for your answers
Jerome


On Tue, Jul 26, 2016 at 9:52 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Jul 26, 2016 at 3:20 PM, Jerome Wagner wrote:
>
>>
>> Would it make sense to use these SGML synopses as some kind of source of
>> truth, parse them, and automatically generate a parser for a specific
>> language?
>>
>
> What problem are you trying to solve here?  To wit, not everything that
> can be parsed is documented - usually intentionally.
>
>
>> Could the parser committers shed some light on how the documentation
>> process interacts with the parser commits?
>>
>>
> Commits that modify the parser are expected to have manual modifications
> to the relevant documentation as well.
>
> David J.
>
>


Re: [GENERAL] question on parsing postgres sql queries

2016-07-26 Thread David G. Johnston
On Tue, Jul 26, 2016 at 3:20 PM, Jerome Wagner wrote:

>
> Would it make sense to use these SGML synopses as some kind of source of
> truth, parse them, and automatically generate a parser for a specific
> language?
>

What problem are you trying to solve here?  To wit, not everything that
can be parsed is documented - usually intentionally.


> Could the parser committers shed some light on how the documentation
> process interacts with the parser commits?
>
>
Commits that modify the parser are expected to have manual modifications
to the relevant documentation as well.

David J.


Re: [GENERAL] question on parsing postgres sql queries

2016-07-26 Thread Tom Lane
Jerome Wagner writes:
> Would it make sense to use these SGML synopses as some kind of source of
> truth, parse them, and automatically generate a parser for a specific
> language?

Probably not.  First, it is not uncommon for corner cases (such as
legacy syntaxes) to go unmentioned in the documentation.  Second, the
implementation is often encrusted with details we'd just as soon not
expose to users.  An example here is the need to be very specific in
the Bison grammar about whether extra parens in a "foo IN ((SELECT ...))"
construct belong to the SELECT or the IN.

It might be nice to have some sort of tool that could check compatibility
of the doc synopses with the actual grammar.  But I doubt that trying to
auto-generate either one from the other would be a win.
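
As a rough illustration of what the simplest form of such a checking tool might look like, the Python sketch below compares only keyword sets, which is far weaker than comparing grammars. Everything in it is an assumption made for illustration: the function names are hypothetical, the synopsis string is an abbreviated example, and the keyword set is hand-written rather than extracted from the Bison grammar.

```python
import re


def synopsis_keywords(synopsis):
    """Collect the upper-case words, which synopses use for literal keywords."""
    return set(re.findall(r"\b[A-Z][A-Z_]+\b", synopsis))


def compare(grammar_keywords, synopsis):
    """Return (keywords missing from the docs, keywords missing from the grammar)."""
    documented = synopsis_keywords(synopsis)
    grammar = set(grammar_keywords)
    return grammar - documented, documented - grammar


# Abbreviated, illustrative synopsis text (not the full reference page).
SYNOPSIS = "CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] name ON table_name"

# Hypothetical keyword list standing in for what a grammar rule accepts.
GRAMMAR_KEYWORDS = {"CREATE", "UNIQUE", "INDEX", "CONCURRENTLY", "ON", "USING"}

undocumented, unknown = compare(GRAMMAR_KEYWORDS, SYNOPSIS)
```

With these inputs the first set flags USING, which the hand-written grammar list contains and the abbreviated synopsis does not. A real tool would have to compare structure as well, and would need a way to whitelist syntax that is left undocumented on purpose.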

regards, tom lane




[GENERAL] question on parsing postgres sql queries

2016-07-26 Thread Jerome Wagner
Hello,

I am doing some research on postgres sql query parsing.

I have found the https://github.com/lfittl/libpg_query project, which
manages to re-use the native postgres server parser. To use it, you
need to accept an external dependency on a library compiled from the
postgres source.

I was wondering what people think of the conformance with regards to the
real parser of the documentation on
 - https://www.postgresql.org/docs/current/static/sql-select.html
 - https://www.postgresql.org/docs/current/static/sql-copy.html
... and more generally the SGML files in
https://github.com/postgres/postgres/tree/master/doc/src/sgml/ref

Would it make sense to use these SGML synopses as some kind of source of
truth, parse them, and automatically generate a parser for a specific
language?

This could enable the creation of parsers for different languages using
parser generators based on the synopsis.
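
As a minimal sketch of the first step, assuming the synopsis has already been extracted from the SGML as plain text with brackets separated by spaces, the Python below turns the bracket notation into a nested structure. The function name and the node labels are hypothetical, and the sketch handles only optional groups, not alternatives ("|"), repetition ("[, ...]"), or braces.

```python
def parse_synopsis(text):
    """Parse a space-separated synopsis into nested (kind, value) items."""
    tokens = text.split()
    pos = 0

    def parse_sequence(closing):
        nonlocal pos
        items = []
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if tok == "[":
                # An optional group: parse until the matching "]".
                items.append(("optional", parse_sequence("]")))
            elif tok == closing:
                return items
            elif tok == "]":
                raise ValueError("unbalanced ']'")
            elif tok.isupper():
                items.append(("keyword", tok))
            else:
                items.append(("placeholder", tok))
        if closing is not None:
            raise ValueError("missing ']'")
        return items

    return parse_sequence(None)


tree = parse_synopsis("CREATE [ UNIQUE ] INDEX name")
```

A structure like this could then be fed to a parser generator for the target language, subject to the caveats raised in the replies about undocumented syntax and later analysis passes.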

I feel like the conformance level of the documentation is high and that the
SGML synopses seem to be nearly programmatically sufficient to create
parsers.

What do you think?

Could the parser committers shed some light on how the documentation
process interacts with the parser commits?

Thanks,
Jerome