Dear Tom,

> In particular you can't any longer tell the difference between BOOLEAN
> and "boolean" (with quotes), which are not the same thing --- a quoted
> string is never a keyword, per spec. [...]

OK, so you mean that for -boolean- the lexer returns a BOOLEAN_P token, but
for -"boolean"- it returns an IDENT with -boolean- as its lval. Indeed, in
such a case I cannot simply tell boolean from "boolean" if they are both
idents that look the same.

As a matter of fact, this can also be fixed with some post-filtering. Say,
all quoted idents could be returned with a leading " to show that they were
double-quoted, and the IDENT rules in the parser could remove it once it is
no longer needed to distinguish the two cases.
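
To make the idea concrete, here is a minimal stand-alone sketch in plain C.
The names mark_dquoted, was_dquoted and strip_dquote_mark are hypothetical
and exist only for this illustration; malloc stands in for the backend's
palloc-based allocation, and none of this is the actual scan.l or gram.y code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Lexer side (hypothetical helper): tag a double-quoted identifier by
 * prepending a '"' so that later stages can tell it from a bare word.
 * Returns a freshly allocated string, or NULL if allocation fails. */
char *mark_dquoted(const char *ident)
{
    size_t len;
    char *out;

    assert(ident != NULL);
    len = strlen(ident);
    out = malloc(len + 2);              /* marker + text + terminator */
    if (out == NULL)
        return NULL;
    out[0] = '"';
    memcpy(out + 1, ident, len + 1);    /* copies the terminator too */
    return out;
}

/* Parser side: was this identifier written with double quotes? */
int was_dquoted(const char *name)
{
    return name[0] == '"';
}

/* Parser side: drop the marker once the distinction is no longer needed.
 * Returns a pointer into the same buffer, so nothing new is allocated. */
const char *strip_dquote_mark(const char *name)
{
    return was_dquoted(name) ? name + 1 : name;
}
```

With these helpers, mark_dquoted("boolean") yields the five-letter word
preceded by a double quote, while an unquoted boolean reaches the parser
unchanged, so the two stay distinguishable until the marker is stripped.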

Not beautiful, I agree, but my point is that the current number of tokens,
the number of states and the automaton size are not inherent to SQL; they
come from the way lexing and parsing are performed in PostgreSQL.

> The basic point here is that eliminating tokens as you propose will
> result in small changes in behavior, none of which are good or per spec.
> Making the parser automaton smaller would be nice, but not at that
> price.

OK. I don't want to change the spec. I still maintain that it can be done,
although some more tweaking is required. It was just a "proof of concept",
not a patch submission. Well, a "proof of concept" must still be a proof ;-)

I attach a small patch that solves the boolean vs "boolean" issue, still as
a proof of concept that it is 'doable' to preserve semantics with a
different lexer/parser balance. I don't claim that it should be applied, I
just claim that the automaton size could be smaller, especially by
shortening the "unreserved_keyword" list.
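
As a rough illustration of where the decision moves to, the following
stand-alone C sketch mimics the type-name lookup that the patch places in
the GenericType rule. The function name resolve_type_name and the plain
strings it returns are assumptions made for this example; the patch itself
builds parse nodes with SystemTypeName() and makeTypeName().

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>    /* strcasecmp (POSIX) */

/* Hypothetical helper: map an identifier coming from the lexer to the
 * name of the type it designates.  Double-quoted identifiers are assumed
 * to carry a leading '"' marker, so they can never match a built-in
 * spelling and always fall through to the user-type branch. */
const char *resolve_type_name(const char *name)
{
    static const struct {
        const char *sql_name;
        const char *internal_name;
    } builtin[] = {
        { "boolean",  "bool"   },
        { "bigint",   "int8"   },
        { "integer",  "int4"   },
        { "int",      "int4"   },
        { "smallint", "int2"   },
        { "real",     "float4" },
    };
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof builtin / sizeof builtin[0]; i++)
    {
        if (strcasecmp(name, builtin[i].sql_name) == 0)
            return builtin[i].internal_name;
    }
    /* Not a built-in spelling: strip the marker, if any, and keep the
     * identifier as a user-defined type name. */
    return (name[0] == '"') ? name + 1 : name;
}
```

Here an unquoted boolean resolves to bool, whereas the marked form of
"boolean" resolves to a user type named boolean, which is the distinction
the spec requires.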

> You have not proven that you can have the same result.

Well, the patch passes the regression tests, but that indeed does not prove
anything, because these issues are not tested at all.

Maybe you could consider adding the "regression" part of the attached
patch, which creates a user "boolean" type.

Anyway, my motivation is about "hints" and "advice", and this does not
help much with those issues.

-- 
Fabien.
*** ./src/backend/parser/gram.y.orig    Tue Apr  6 18:15:39 2004
--- ./src/backend/parser/gram.y Tue Apr  6 17:56:46 2004
***************
*** 95,100 ****
--- 95,102 ----
  static Node *doNegate(Node *n);
  static void doNegateFloat(Value *v);
  
+ #define clean_dqname(n) (((*(n))!='"')? (n): pstrdup((n)+1))
+ 
  %}
  
  
***************
*** 336,343 ****
        AGGREGATE ALL ALSO ALTER ANALYSE ANALYZE AND ANY ARRAY AS ASC
        ASSERTION ASSIGNMENT AT AUTHORIZATION
  
!       BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
!       BOOLEAN_P BOTH BY
  
        CACHE CALLED CASCADE CASE CAST CHAIN CHAR_P
        CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
--- 338,345 ----
        AGGREGATE ALL ALSO ALTER ANALYSE ANALYZE AND ANY ARRAY AS ASC
        ASSERTION ASSIGNMENT AT AUTHORIZATION
  
!       BACKWARD BEFORE BEGIN_P BETWEEN BINARY BIT
!       BOTH BY
  
        CACHE CALLED CASCADE CASE CAST CHAIN CHAR_P
        CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
***************
*** 362,368 ****
  
        ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P INCLUDING INCREMENT
        INDEX INHERITS INITIALLY INNER_P INOUT INPUT_P
!       INSENSITIVE INSERT INSTEAD INT_P INTEGER INTERSECT
        INTERVAL INTO INVOKER IS ISNULL ISOLATION
  
        JOIN
--- 364,370 ----
  
        ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P INCLUDING INCREMENT
        INDEX INHERITS INITIALLY INNER_P INOUT INPUT_P
!       INSENSITIVE INSERT INSTEAD INTERSECT
        INTERVAL INTO INVOKER IS ISNULL ISOLATION
  
        JOIN
***************
*** 386,398 ****
        PRECISION PRESERVE PREPARE PRIMARY 
        PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
!       READ REAL RECHECK REFERENCES REINDEX RELATIVE_P RENAME REPEATABLE REPLACE
        RESET RESTART RESTRICT RETURNS REVOKE RIGHT ROLLBACK ROW ROWS
        RULE
  
        SCHEMA SCROLL SECOND_P SECURITY SELECT SEQUENCE
        SERIALIZABLE SESSION SESSION_USER SET SETOF SHARE
!       SHOW SIMILAR SIMPLE SMALLINT SOME STABLE START STATEMENT
        STATISTICS STDIN STDOUT STORAGE STRICT_P SUBSTRING SYSID
  
        TABLE TEMP TEMPLATE TEMPORARY THEN TIME TIMESTAMP
--- 388,400 ----
        PRECISION PRESERVE PREPARE PRIMARY 
        PRIOR PRIVILEGES PROCEDURAL PROCEDURE
  
!       READ RECHECK REFERENCES REINDEX RELATIVE_P RENAME REPEATABLE REPLACE
        RESET RESTART RESTRICT RETURNS REVOKE RIGHT ROLLBACK ROW ROWS
        RULE
  
        SCHEMA SCROLL SECOND_P SECURITY SELECT SEQUENCE
        SERIALIZABLE SESSION SESSION_USER SET SETOF SHARE
!       SHOW SIMILAR SIMPLE SOME STABLE START STATEMENT
        STATISTICS STDIN STDOUT STORAGE STRICT_P SUBSTRING SYSID
  
        TABLE TEMP TEMPLATE TEMPORARY THEN TIME TIMESTAMP
***************
*** 959,965 ****
                                }
                        | IDENT
                                {
!                                       $$ = makeStringConst($1, NULL);
                                }
                        | ConstInterval Sconst opt_interval
                                {
--- 961,967 ----
                                }
                        | IDENT
                                {
!                                       $$ = makeStringConst(clean_dqname($1), NULL);
                                }
                        | ConstInterval Sconst opt_interval
                                {
***************
*** 3196,3202 ****
                        | type_name attrs '%' TYPE_P
                                {
                                        $$ = makeNode(TypeName);
!                                       $$->names = lcons(makeString($1), $2);
                                        $$->pct_type = true;
                                        $$->typmod = -1;
                                }
--- 3198,3204 ----
                        | type_name attrs '%' TYPE_P
                                {
                                        $$ = makeNode(TypeName);
!                                       $$->names = lcons(makeString(clean_dqname($1)), $2);
                                        $$->pct_type = true;
                                        $$->typmod = -1;
                                }
***************
*** 5191,5197 ****
                        | type_name attrs
                                {
                                        $$ = makeNode(TypeName);
!                                       $$->names = lcons(makeString($1), $2);
                                        $$->typmod = -1;
                                }
                ;
--- 5193,5199 ----
                        | type_name attrs
                                {
                                        $$ = makeNode(TypeName);
!                                       $$->names = lcons(makeString(clean_dqname($1)), $2);
                                        $$->typmod = -1;
                                }
                ;
***************
*** 5216,5222 ****
  GenericType:
                        type_name
                                {
!                                       $$ = makeTypeName($1);
                                }
                ;
  
--- 5218,5236 ----
  GenericType:
                        type_name
                                {
!                                       if (strcasecmp($1,"boolean")==0)
!                                               $$ = SystemTypeName("bool");
!                                       else if (strcasecmp($1, "bigint")==0)
!                                               $$ = SystemTypeName("int8");
!                                       else if (strcasecmp($1, "integer")==0 ||
!                                                        strcasecmp($1, "int")==0)
!                                               $$ = SystemTypeName("int4");
!                                       else if (strcasecmp($1, "smallint")==0)
!                                               $$ = SystemTypeName("int2");
!                                       else if (strcasecmp($1, "real")==0)
!                                               $$ = SystemTypeName("float4");
!                                       else
!                                               $$ = makeTypeName(clean_dqname($1));
                                }
                ;
  
***************
*** 5225,5251 ****
   * - thomas 1997-09-18
   * Provide real DECIMAL() and NUMERIC() implementations now - Jan 1998-12-30
   */
! Numeric:      INT_P
!                               {
!                                       $$ = SystemTypeName("int4");
!                               }
!                       | INTEGER
!                               {
!                                       $$ = SystemTypeName("int4");
!                               }
!                       | SMALLINT
!                               {
!                                       $$ = SystemTypeName("int2");
!                               }
!                       | BIGINT
!                               {
!                                       $$ = SystemTypeName("int8");
!                               }
!                       | REAL
!                               {
!                                       $$ = SystemTypeName("float4");
!                               }
!                       | FLOAT_P opt_float
                                {
                                        $$ = $2;
                                }
--- 5239,5246 ----
   * - thomas 1997-09-18
   * Provide real DECIMAL() and NUMERIC() implementations now - Jan 1998-12-30
   */
! Numeric:
!                        FLOAT_P opt_float
                                {
                                        $$ = $2;
                                }
***************
*** 5268,5277 ****
                                        $$ = SystemTypeName("numeric");
                                        $$->typmod = $2;
                                }
-                       | BOOLEAN_P
-                               {
-                                       $$ = SystemTypeName("bool");
-                               }
                ;
  
  opt_float:    '(' Iconst ')'
--- 5263,5268 ----
***************
*** 6861,6867 ****
   */
  
  extract_arg:
!                       IDENT                                   { $$ = $1; }
                        | YEAR_P                                { $$ = "year"; }
                        | MONTH_P                               { $$ = "month"; }
                        | DAY_P                                 { $$ = "day"; }
--- 6852,6858 ----
   */
  
  extract_arg:
!                       IDENT                                   { $$ = clean_dqname($1); }
                        | YEAR_P                                { $$ = "year"; }
                        | MONTH_P                               { $$ = "month"; }
                        | DAY_P                                 { $$ = "day"; }
***************
*** 7346,7352 ****
  
  /* Column identifier --- names that can be column, table, etc names.
   */
! ColId:                IDENT                                   { $$ = $1; }
                        | unreserved_keyword                    { $$ = pstrdup($1); }
                        | col_name_keyword                      { $$ = pstrdup($1); }
                ;
--- 7337,7343 ----
  
  /* Column identifier --- names that can be column, table, etc names.
   */
! ColId:                IDENT                                   { $$ = clean_dqname($1); }
                        | unreserved_keyword                    { $$ = pstrdup($1); }
                        | col_name_keyword                      { $$ = pstrdup($1); }
                ;
***************
*** 7360,7366 ****
  /* Function identifier --- names that can be function names.
   */
  function_name:
!                       IDENT                                   { $$ = $1; }
                        | unreserved_keyword                    { $$ = pstrdup($1); }
                        | func_name_keyword                     { $$ = pstrdup($1); }
                ;
--- 7351,7357 ----
  /* Function identifier --- names that can be function names.
   */
  function_name:
!                       IDENT                                   { $$ = clean_dqname($1); }
                        | unreserved_keyword                    { $$ = pstrdup($1); }
                        | func_name_keyword                     { $$ = pstrdup($1); }
                ;
***************
*** 7368,7374 ****
  /* Column label --- allowed labels in "AS" clauses.
   * This presently includes *all* Postgres keywords.
   */
! ColLabel:     IDENT                                   { $$ = $1; }
                        | unreserved_keyword            { $$ = pstrdup($1); }
                        | col_name_keyword              { $$ = pstrdup($1); }
                        | func_name_keyword             { $$ = pstrdup($1); }
--- 7359,7365 ----
  /* Column label --- allowed labels in "AS" clauses.
   * This presently includes *all* Postgres keywords.
   */
! ColLabel:     IDENT                                   { $$ = clean_dqname($1); }
                        | unreserved_keyword            { $$ = pstrdup($1); }
                        | col_name_keyword              { $$ = pstrdup($1); }
                        | func_name_keyword             { $$ = pstrdup($1); }
***************
*** 7585,7593 ****
   * looks too much like a function call for an LR(1) parser.
   */
  col_name_keyword:
!                         BIGINT
!                       | BIT
!                       | BOOLEAN_P
                        | CHAR_P
                        | CHARACTER
                        | COALESCE
--- 7576,7582 ----
   * looks too much like a function call for an LR(1) parser.
   */
  col_name_keyword:
!                         BIT
                        | CHAR_P
                        | CHARACTER
                        | COALESCE
***************
*** 7598,7605 ****
                        | EXTRACT
                        | FLOAT_P
                        | INOUT
-                       | INT_P
-                       | INTEGER
                        | INTERVAL
                        | NATIONAL
                        | NCHAR
--- 7587,7592 ----
***************
*** 7610,7619 ****
                        | OVERLAY
                        | POSITION
                        | PRECISION
-                       | REAL
                        | ROW
                        | SETOF
-                       | SMALLINT
                        | SUBSTRING
                        | TIME
                        | TIMESTAMP
--- 7597,7604 ----
*** ./src/backend/parser/scan.l.orig    Tue Apr  6 18:15:09 2004
--- ./src/backend/parser/scan.l Tue Apr  6 17:36:12 2004
***************
*** 448,453 ****
--- 448,454 ----
                                        token_start = yytext;
                                        BEGIN(xd);
                                        startlit();
+                                       addlitchar('"');
                                }
  <xd>{xdstop}  {
                                        char               *ident;
*** ./src/test/regress/expected/domain.out.orig Tue Sep 30 00:06:40 2003
--- ./src/test/regress/expected/domain.out      Tue Apr  6 18:10:03 2004
***************
*** 300,302 ****
--- 300,325 ----
  drop domain ddef3 restrict;
  drop domain ddef4 restrict;
  drop domain ddef5 restrict;
+ --
+ -- a user "boolean" type
+ -- 
+ CREATE DOMAIN "boolean" AS 
+   TEXT DEFAULT 'yes' 
+   CHECK(VALUE = ANY (ARRAY['yes','no']));
+ CREATE TABLE calvin(i "boolean");
+ INSERT INTO calvin(i) VALUES(TRUE); -- fail
+ ERROR:  column "i" is of type boolean but expression is of type boolean
+ HINT:  You will need to rewrite or cast the expression.
+ INSERT INTO calvin(i) VALUES(FALSE); -- fail
+ ERROR:  column "i" is of type boolean but expression is of type boolean
+ HINT:  You will need to rewrite or cast the expression.
+ INSERT INTO calvin(i) VALUES('yes'); -- ok
+ INSERT INTO calvin(i) VALUES('no'); -- ok
+ SELECT COUNT(*)=2 FROM calvin; -- true
+  ?column? 
+ ----------
+  t
+ (1 row)
+ 
+ DROP TABLE calvin;
+ DROP DOMAIN "boolean";
*** ./src/test/regress/sql/domain.sql.orig      Mon Sep 15 02:26:31 2003
--- ./src/test/regress/sql/domain.sql   Tue Apr  6 18:08:47 2004
***************
*** 244,246 ****
--- 244,265 ----
  drop domain ddef3 restrict;
  drop domain ddef4 restrict;
  drop domain ddef5 restrict;
+ 
+ --
+ -- a user "boolean" type
+ -- 
+ CREATE DOMAIN "boolean" AS 
+   TEXT DEFAULT 'yes' 
+   CHECK(VALUE = ANY (ARRAY['yes','no']));
+ 
+ CREATE TABLE calvin(i "boolean");
+ 
+ INSERT INTO calvin(i) VALUES(TRUE); -- fail
+ INSERT INTO calvin(i) VALUES(FALSE); -- fail
+ INSERT INTO calvin(i) VALUES('yes'); -- ok
+ INSERT INTO calvin(i) VALUES('no'); -- ok
+ 
+ SELECT COUNT(*)=2 FROM calvin; -- true
+ 
+ DROP TABLE calvin;
+ DROP DOMAIN "boolean";