Re: [HACKERS] Some surprising precedence behavior in PG's grammar

2011-05-05 Thread Tom Lane
I wrote:
 Andrew Dunstan and...@dunslane.net writes:
 If we do need a precedence setting for NULL_P, then I think it should 
 probably be on its own and not sharing one with IS.

 Yeah, I was thinking that too.  If we put %prec on the IS [NOT] NULL
 productions then there is no need for NULL_P to have exactly its current
 precedence; anything above POSTFIXOP would preserve the current behavior
 in the DEFAULT ... NULL case.  (And if we decided we wanted to flip that
 behavior, anything below POSTFIXOP would do that.)

On reflection I decided that the best quick-fix is to put NULL into the
list of keywords that are already precedence-grouped with IDENT.  That
at least makes sure that it has precedence behavior equivalent to any
plain old non-keyword.  If you can find a better fix, maybe we could
apply it to the other cases mentioned there as well.

 BTW, I wonder why NOTNULL and ISNULL have their own precedence levels,
 rather than being made to act exactly like IS [NOT] NULL ...

Is anybody up for changing that, or should we leave well enough alone?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Some surprising precedence behavior in PG's grammar

2011-05-04 Thread Tom Lane
While looking at the grammar's operator-precedence declarations in
connection with a recent pgsql-docs question, it struck me that this
declaration is a foot-gun waiting to go off:

%nonassoc   IS NULL_P TRUE_P FALSE_P UNKNOWN /* sets precedence for IS 
NULL, etc */

The only terminal that we actually need to attach precedence to for
IS NULL and related productions is IS.  The others are just listed
there to save attaching explicit %prec declarations to those productions.
This seems like a bad idea, because attaching a precedence to a terminal
symbol that doesn't absolutely have to have one is just asking for
trouble: it can cause bison to accept grammars that would better have
provoked a shift/reduce error, and to silently resolve the ambiguity in
a way that you maybe didn't expect.

So I thought to myself that it'd be better to remove the unnecessary
precedence markings, and tried, with the attached patch.  And behold,
I got a shift/reduce conflict:

state 2788

  1569 b_expr: b_expr qual_Op . b_expr
  1571   | b_expr qual_Op .

NULL_P shift, and go to state 1010
NULL_P[reduce using rule 1571 (b_expr)]

So the god of unintended consequences has been here already.  What this
is telling us is that in examples such as

CREATE TABLE foo (f1 int DEFAULT 10 %% NULL);

it is not immediately clear to bison whether to shift upon seeing the
NULL (which leads to a parse tree that says %% is an infix operator with
arguments 10 and NULL), or to reduce (which leads to a parse tree that
says that %% is a postfix operator with argument 10, and NULL is a
column declaration constraint separate from the DEFAULT expression).

If you try the experiment, you find out that the first interpretation is
preferred by the current grammar:

ERROR:  operator does not exist: integer %% unknown

Now, this is probably a good thing, because NULL as a column declaration
constraint is not SQL standard (only NOT NULL is), so we're resolving
the ambiguity in a way that's more likely to be SQL-compatible.  But it
could be surprising to somebody who expected the other behavior,
especially since this seemingly-closely-related command is parsed the
other way:

CREATE TABLE foo (f1 int DEFAULT 10 %% NOT NULL);
ERROR:  operator does not exist: integer %%

And the reason for that difference in behavior is that NOT has a
declared precedence lower than POSTFIXOP, whereas NULL has a declared
precedence that's higher.  That comparison determines how bison resolves
the shift/reduce conflict.

The fact that this behavior falls out of a precedence declaration that's
seemingly quite unrelated, and was surely not designed with this case in
mind, is a perfect example of why I say that precedence declarations are
hazardous.

So I'd still like to get rid of the precedence markings for TRUE_P,
FALSE_P, and UNKNOWN, and will do so unless somebody has a really good
reason not to.  (It looks like we could avoid marking ZONE, too.)  But
I would be happier if we could also not mark NULL, because that's surely
used in a lot of other places, and could easily bite us a lot harder
than this.  Can anyone think of an alternative way to resolve this
particular conflict without the blunt instrument of a precedence marking?

regards, tom lane

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 933a1a2..2fb0418 100644
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
*** static void SplitColQualList(List *qualL
*** 614,620 
  %left		Op OPERATOR		/* multi-character ops and user-defined operators */
  %nonassoc	NOTNULL
  %nonassoc	ISNULL
! %nonassoc	IS NULL_P TRUE_P FALSE_P UNKNOWN /* sets precedence for IS NULL, etc */
  %left		'+' '-'
  %left		'*' '/' '%'
  %left		'^'
--- 614,620 
  %left		Op OPERATOR		/* multi-character ops and user-defined operators */
  %nonassoc	NOTNULL
  %nonassoc	ISNULL
! %nonassoc	IS/* sets precedence for IS NULL, etc */
  %left		'+' '-'
  %left		'*' '/' '%'
  %left		'^'
*** a_expr:		c_expr	{ $$ = $1; }
*** 9887,9893 
  			 *	a ISNULL
  			 *	a NOTNULL
  			 */
! 			| a_expr IS NULL_P
  {
  	NullTest *n = makeNode(NullTest);
  	n-arg = (Expr *) $1;
--- 9887,9893 
  			 *	a ISNULL
  			 *	a NOTNULL
  			 */
! 			| a_expr IS NULL_P			%prec IS
  {
  	NullTest *n = makeNode(NullTest);
  	n-arg = (Expr *) $1;
*** a_expr:		c_expr	{ $$ = $1; }
*** 9901,9907 
  	n-nulltesttype = IS_NULL;
  	$$ = (Node *)n;
  }
! 			| a_expr IS NOT NULL_P
  {
  	NullTest *n = makeNode(NullTest);
  	n-arg = (Expr *) $1;
--- 9901,9907 
  	n-nulltesttype = IS_NULL;
  	$$ = (Node *)n;
  }
! 			| a_expr IS NOT NULL_P		%prec IS
  {
  	NullTest *n = makeNode(NullTest);
  	n-arg = (Expr *) $1;
*** a_expr:		c_expr	{ $$ = $1; }
*** 9919,9960 
  {
  	$$ = (Node 

Re: [HACKERS] Some surprising precedence behavior in PG's grammar

2011-05-04 Thread Andrew Dunstan



On 05/04/2011 07:39 PM, Tom Lane wrote:

While looking at the grammar's operator-precedence declarations in
connection with a recent pgsql-docs question, it struck me that this
declaration is a foot-gun waiting to go off:

%nonassoc   IS NULL_P TRUE_P FALSE_P UNKNOWN /* sets precedence for IS 
NULL, etc */

The only terminal that we actually need to attach precedence to for
IS NULL and related productions is IS.  The others are just listed
there to save attaching explicit %prec declarations to those productions.
This seems like a bad idea, because attaching a precedence to a terminal
symbol that doesn't absolutely have to have one is just asking for
trouble: it can cause bison to accept grammars that would better have
provoked a shift/reduce error, and to silently resolve the ambiguity in
a way that you maybe didn't expect.

So I thought to myself that it'd be better to remove the unnecessary
precedence markings, and tried, with the attached patch.  And behold,
I got a shift/reduce conflict:

state 2788

   1569 b_expr: b_expr qual_Op . b_expr
   1571   | b_expr qual_Op .

 NULL_P shift, and go to state 1010
 NULL_P[reduce using rule 1571 (b_expr)]

So the god of unintended consequences has been here already.  What this
is telling us is that in examples such as

CREATE TABLE foo (f1 int DEFAULT 10 %% NULL);

it is not immediately clear to bison whether to shift upon seeing the
NULL (which leads to a parse tree that says %% is an infix operator with
arguments 10 and NULL), or to reduce (which leads to a parse tree that
says that %% is a postfix operator with argument 10, and NULL is a
column declaration constraint separate from the DEFAULT expression).

If you try the experiment, you find out that the first interpretation is
preferred by the current grammar:

ERROR:  operator does not exist: integer %% unknown


Yeah, IIRC the default action for a shift/reduce conflict is to shift, 
as it's usually the right thing to do.



Now, this is probably a good thing, because NULL as a column declaration
constraint is not SQL standard (only NOT NULL is), so we're resolving
the ambiguity in a way that's more likely to be SQL-compatible.  But it
could be surprising to somebody who expected the other behavior,
especially since this seemingly-closely-related command is parsed the
other way:

CREATE TABLE foo (f1 int DEFAULT 10 %% NOT NULL);
ERROR:  operator does not exist: integer %%

And the reason for that difference in behavior is that NOT has a
declared precedence lower than POSTFIXOP, whereas NULL has a declared
precedence that's higher.  That comparison determines how bison resolves
the shift/reduce conflict.

The fact that this behavior falls out of a precedence declaration that's
seemingly quite unrelated, and was surely not designed with this case in
mind, is a perfect example of why I say that precedence declarations are
hazardous.

So I'd still like to get rid of the precedence markings for TRUE_P,
FALSE_P, and UNKNOWN, and will do so unless somebody has a really good
reason not to.  (It looks like we could avoid marking ZONE, too.)  But
I would be happier if we could also not mark NULL, because that's surely
used in a lot of other places, and could easily bite us a lot harder
than this.  Can anyone think of an alternative way to resolve this
particular conflict without the blunt instrument of a precedence marking?




My bison-fu is a bit rusty, but years ago I could do this stuff in my 
sleep. I'll be surprised if there isn't a way.


If we do need a precedence setting for NULL_P, then I think it should 
probably be on its own and not sharing one with IS.


If you don't solve it soon I'll take a look after I clear a couple of 
higher priority items from my list.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some surprising precedence behavior in PG's grammar

2011-05-04 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 05/04/2011 07:39 PM, Tom Lane wrote:
 If you try the experiment, you find out that the first interpretation is
 preferred by the current grammar:
 ERROR:  operator does not exist: integer %% unknown

 Yeah, IIRC the default action for a shift/reduce conflict is to shift, 
 as it's usually the right thing to do.

Well, there's nothing default about it: we've got precedence
declarations that specifically tell it to do that.  I'm just disturbed
because this isn't what those precedences were meant to do.

 I would be happier if we could also not mark NULL, because that's surely
 used in a lot of other places, and could easily bite us a lot harder
 than this.  Can anyone think of an alternative way to resolve this
 particular conflict without the blunt instrument of a precedence marking?

 My bison-fu is a bit rusty, but years ago I could do this stuff in my 
 sleep. I'll be surprised if there isn't a way.

 If we do need a precedence setting for NULL_P, then I think it should 
 probably be on its own and not sharing one with IS.

Yeah, I was thinking that too.  If we put %prec on the IS [NOT] NULL
productions then there is no need for NULL_P to have exactly its current
precedence; anything above POSTFIXOP would preserve the current behavior
in the DEFAULT ... NULL case.  (And if we decided we wanted to flip that
behavior, anything below POSTFIXOP would do that.)

But it would still make life safer for future grammar hacking if it
didn't have precedence at all.

BTW, I wonder why NOTNULL and ISNULL have their own precedence levels,
rather than being made to act exactly like IS [NOT] NULL ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some surprising precedence behavior in PG's grammar

2011-05-04 Thread Greg Stark
On Thu, May 5, 2011 at 12:39 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 So I'd still like to get rid of the precedence markings for TRUE_P,
 FALSE_P, and UNKNOWN, and will do so unless somebody has a really good
 reason not to.  (It looks like we could avoid marking ZONE, too.)  But
 I would be happier if we could also not mark NULL, because that's surely
 used in a lot of other places, and could easily bite us a lot harder
 than this.  Can anyone think of an alternative way to resolve this
 particular conflict without the blunt instrument of a precedence marking?


Isn't there already some gadget which forces postfix operators to be
discouraged compared to some other interpretation in other cases? That
would be the opposite of the current interpretation though which you
said you preferred.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some surprising precedence behavior in PG's grammar

2011-05-04 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 Isn't there already some gadget which forces postfix operators to be
 discouraged compared to some other interpretation in other cases?

Yeah.  I'm not unhappy with the current grammar's behavior in this case.
What's bothering me is that the implementation seems likely to create
surprising/unexpected behaviors after future grammar changes.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some surprising precedence behavior in PG's grammar

2011-05-04 Thread Greg Stark
On Thu, May 5, 2011 at 4:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark gsst...@mit.edu writes:
 Isn't there already some gadget which forces postfix operators to be
 discouraged compared to some other interpretation in other cases?

 Yeah.  I'm not unhappy with the current grammar's behavior in this case.
 What's bothering me is that the implementation seems likely to create
 surprising/unexpected behaviors after future grammar changes.

I do wonder how much we really gain from having postfix operators.
Other than ! I've never seen one and of course anyone who wanted to
use one could just as easily use a prefix operator. In practice I
think most unary operators are just special cases of binary operators
anyways and often once you have the binary operator it's clearer to
just use that anyways.

A *lot* of grammar conflicts we've had to worry about end up going
away if we didn't have postfix operators.
-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers