This patch makes CREATE TABLE AS conform more closely to SQL:2003 by
adding support for the WITH [ NO ] DATA clause (per section 11.3). The
standard says that this clause is mandatory, but I think it should be
optional in PG (partly for backward compatibility, and partly because I
think that is sane behavior). I also added regression tests and updated
the docs. Gavin Sherry helped me solve a parser problem (elaborated on
below) -- much thanks, Gavin.

This is 8.1 material so this patch won't be applied any time soon, but I
thought I'd post it here to get some feedback. Issues/notes on the
patch:

(1) The standard specifies that WITH [ NO ] DATA should follow the
subquery in CREATE TABLE AS. This causes problems for bison: if we make
the modifications to the grammar naively, it results in about 16
shift/reduce conflicts because the "WITH" keyword can appear in various
places in SelectStmt (e.g. WITH TIME ZONE). Gavin fixed that by doing
some manual lookahead in parser.c, like was previously being done for
UNION JOIN. Is this the best solution?

(2) I've modified the parser to transform the SELECT query to have a
"LIMIT 0" clause if WITH NO DATA is specified. This is ugly and
inefficient, but it was the cleanest way I could see to implement it --
we represent CREATE TABLE AS internally as a SelectStmt, and it seemed
cleaner to do this than to add more fields to SelectStmt. Is there a
cleaner way to do this? Perhaps the better solution is to make CREATE
TABLE AS its own Node (as I think has been raised in the past).

(3) A related question is: if the subquery's SelectStmt already has a
LIMIT, should I pfree() the limitCount field before overwriting it? My
guess was that parse trees are sometimes constructed in sufficiently
long-lived memory contexts that the pfree() is wise, but I'm not sure if
that's correct.

(Speaking of which, do we have any docs about when it's necessary to
pfree() each allocation and when it's not? IMHO this can sometimes be
difficult to know...)

(4) I haven't implemented support for WITH [ NO ] DATA in CREATE TABLE
AS / EXECUTE (which is depressingly implemented completely separately
from CREATE TABLE AS / SELECT). I think if we restructure CREATE TABLE
AS to be its own Node, we can clean this up as well.

(5) I called the DATA keyword "DATA_P" because I guessed Win32 might
have issues with it otherwise (see OBJECT_P and so on). I'll double
check whether that's necessary before committing anything -- if anyone
happens to know, please speak up.

Comments?

-Neil

P.S. I'm beginning to think that rather than applying this patch as-is
when we branch for 8.1, it might be a better idea to just bite the
bullet and restructure CREATE TABLE AS as suggested above. Thoughts?
Index: doc/src/sgml/ref/create_table_as.sgml
===================================================================
RCS file: /home/neilc/private-cvsroot/pgsql-server/doc/src/sgml/ref/create_table_as.sgml,v
retrieving revision 1.23
diff -c -r1.23 create_table_as.sgml
*** doc/src/sgml/ref/create_table_as.sgml	24 Aug 2004 00:06:51 -0000	1.23
--- doc/src/sgml/ref/create_table_as.sgml	22 Sep 2004 08:41:44 -0000
***************
*** 21,27 ****
   <refsynopsisdiv>
  <synopsis>
  CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } ] TABLE <replaceable>table_name</replaceable> [ (<replaceable>column_name</replaceable> [, ...] ) ] [ [ WITH | WITHOUT ] OIDS ]
!     AS <replaceable>query</replaceable>
  </synopsis>
   </refsynopsisdiv>
    
--- 21,27 ----
   <refsynopsisdiv>
  <synopsis>
  CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } ] TABLE <replaceable>table_name</replaceable> [ (<replaceable>column_name</replaceable> [, ...] ) ] [ [ WITH | WITHOUT ] OIDS ]
!     AS <replaceable>query</replaceable> [ WITH [ NO ] DATA ]
  </synopsis>
   </refsynopsisdiv>
    
***************
*** 29,37 ****
    <title>Description</title>
  
    <para>
!    <command>CREATE TABLE AS</command> creates a table and fills it
!    with data computed by a <command>SELECT</command> command or an
!    <command>EXECUTE</command> that runs a prepared
     <command>SELECT</command> command.  The table columns have the
     names and data types associated with the output columns of the
     <command>SELECT</command> (except that you can override the column
--- 29,37 ----
    <title>Description</title>
  
    <para>
!    <command>CREATE TABLE AS</command> creates a table and optionally
!    fills it with data computed by a <command>SELECT</command> command
!    or an <command>EXECUTE</command> that runs a prepared
     <command>SELECT</command> command.  The table columns have the
     names and data types associated with the output columns of the
     <command>SELECT</command> (except that you can override the column
***************
*** 125,130 ****
--- 125,145 ----
       </para>
      </listitem>
     </varlistentry>
+ 
+    <varlistentry>
+     <term><literal>WITH DATA</literal></term>
+     <term><literal>WITH NO DATA</literal></term>
+      <listitem>
+       <para>
+        This optional clause specifies whether the table created by
+        <command>CREATE TABLE AS</command> should contain any data. If
+        <literal>WITH NO DATA</literal> is specified, the schema of the
+        new table is created, but no rows are inserted into it. If this
+        clause is not specified, <literal>WITH DATA</literal> is the
+        default.
+       </para>
+      </listitem>
+    </varlistentry>
    </variablelist>
   </refsect1>
  
***************
*** 166,176 ****
    <title>Compatibility</title>
  
    <para>
!    This command is modeled after an <productname>Oracle</productname>
!    feature.  There is no command with equivalent functionality in
!    the SQL standard.  However, a combination of <literal>CREATE
!    TABLE</literal> and <literal>INSERT ... SELECT</literal> can
!    accomplish the same thing with little more effort.
    </para>
   </refsect1>
  
--- 181,214 ----
    <title>Compatibility</title>
  
    <para>
!    <command>CREATE TABLE AS</command> is specified by the SQL2003
!    standard. There are some small differences between the definition
!    of the command in SQL2003 and its implementation in
!    <productname>PostgreSQL</>:
! 
!    <itemizedlist spacing="compact">
!     <listitem>
!      <para>
!       The <literal>WITH DATA</literal> clause is mandatory in the
!       standard, whereas it is optional in <productname>PostgreSQL</>.
!      </para>
!     </listitem>
! 
!     <listitem>
!      <para>
!       The standard requires parentheses around the subquery clause; in
!       <productname>PostgreSQL</productname>, these parentheses are
!       optional.
!      </para>
!     </listitem>
! 
!     <listitem>
!      <para>
!       The standard defines an <literal>ON COMMIT</literal> clause;
!       this is not currently implemented by <productname>PostgreSQL</>.
!      </para>
!     </listitem>
!    </itemizedlist>
    </para>
   </refsect1>
  
Index: src/backend/parser/gram.y
===================================================================
RCS file: /home/neilc/private-cvsroot/pgsql-server/src/backend/parser/gram.y,v
retrieving revision 2.475
diff -c -r2.475 gram.y
*** src/backend/parser/gram.y	29 Aug 2004 04:12:35 -0000	2.475
--- src/backend/parser/gram.y	22 Sep 2004 08:41:48 -0000
***************
*** 321,327 ****
  %type <list>	constraints_set_list
  %type <boolean> constraints_set_mode
  %type <str>		OptTableSpace OptConsTableSpace OptTableSpaceOwner
! 
  
  /*
   * If you make any token changes, update the keyword table in
--- 321,327 ----
  %type <list>	constraints_set_list
  %type <boolean> constraints_set_mode
  %type <str>		OptTableSpace OptConsTableSpace OptTableSpaceOwner
! %type <boolean> OptWithData
  
  /*
   * If you make any token changes, update the keyword table in
***************
*** 344,350 ****
  	CREATEUSER CROSS CSV CURRENT_DATE CURRENT_TIME
  	CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
  
! 	DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
  	DEFERRABLE DEFERRED DEFINER DELETE_P DELIMITER DELIMITERS
  	DESC DISTINCT DO DOMAIN_P DOUBLE_P DROP
  
--- 344,350 ----
  	CREATEUSER CROSS CSV CURRENT_DATE CURRENT_TIME
  	CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
  
! 	DATA_P DATABASE DAY_P DEALLOCATE DEC DECIMAL_P DECLARE DEFAULT DEFAULTS
  	DEFERRABLE DEFERRED DEFINER DELETE_P DELIMITER DELIMITERS
  	DESC DISTINCT DO DOMAIN_P DOUBLE_P DROP
  
***************
*** 415,421 ****
   * list and so can never be entered directly.  The filter in parser.c
   * creates these tokens when required.
   */
! %token			UNIONJOIN
  
  /* Special token types, not actually keywords - see the "lex" file */
  %token <str>	IDENT FCONST SCONST BCONST XCONST Op
--- 415,421 ----
   * list and so can never be entered directly.  The filter in parser.c
   * creates these tokens when required.
   */
! %token			UNIONJOIN WITH_DATA WITH_NO
  
  /* Special token types, not actually keywords - see the "lex" file */
  %token <str>	IDENT FCONST SCONST BCONST XCONST Op
***************
*** 1961,1967 ****
   */
  
  CreateAsStmt:
! 			CREATE OptTemp TABLE qualified_name OptCreateAs WithOidsAs SelectStmt
  				{
  					/*
  					 * When the SelectStmt is a set-operation tree, we must
--- 1961,1967 ----
   */
  
  CreateAsStmt:
! 			CREATE OptTemp TABLE qualified_name OptCreateAs WithOidsAs SelectStmt OptWithData
  				{
  					/*
  					 * When the SelectStmt is a set-operation tree, we must
***************
*** 1979,1984 ****
--- 1979,1998 ----
  					n->into = $4;
  					n->intoColNames = $5;
  					n->intoHasOids = $6;
+ 					/*
+ 					 * If "WITH NO DATA" was specified, just create
+ 					 * the table's schema. The kludge we use to
+ 					 * achieve this is to add a "LIMIT 0" clause to
+ 					 * the query.
+ 					 */
+ 					if ($8 == FALSE)
+ 					{
+ 						SelectStmt *stmt = (SelectStmt *) $7;
+ 						if (stmt->limitCount)
+ 							pfree(stmt->limitCount);
+ 						stmt->limitCount = makeIntConst(0);
+ 					}
+ 
  					$$ = $7;
  				}
  		;
***************
*** 1995,2000 ****
--- 2009,2025 ----
  			| AS 									{ $$ = DEFAULT_OIDS; }
  			;
  
+ /*
+  * In order to avoid shift/reduce conflicts, we need to play some
+  * games in the lexer (see parser.c) to merge "WITH DATA" and "WITH
+  * NO" into single tokens.
+  */
+ OptWithData:
+ 			WITH_DATA								{ $$ = TRUE; }
+ 			| WITH_NO DATA_P						{ $$ = FALSE; }
+ 			| /* EMPTY */							{ $$ = TRUE; }
+ 			;
+ 
  OptCreateAs:
  			'(' CreateAsList ')'					{ $$ = $2; }
  			| /*EMPTY*/								{ $$ = NIL; }
***************
*** 7681,7686 ****
--- 7706,7712 ----
  			| CSV
  			| CURSOR
  			| CYCLE
+ 			| DATA_P
  			| DATABASE
  			| DAY_P
  			| DEALLOCATE
Index: src/backend/parser/keywords.c
===================================================================
RCS file: /home/neilc/private-cvsroot/pgsql-server/src/backend/parser/keywords.c,v
retrieving revision 1.153
diff -c -r1.153 keywords.c
*** src/backend/parser/keywords.c	29 Aug 2004 04:12:40 -0000	1.153
--- src/backend/parser/keywords.c	22 Sep 2004 08:41:48 -0000
***************
*** 97,102 ****
--- 97,103 ----
  	{"current_user", CURRENT_USER},
  	{"cursor", CURSOR},
  	{"cycle", CYCLE},
+ 	{"data", DATA_P},
  	{"database", DATABASE},
  	{"day", DAY_P},
  	{"deallocate", DEALLOCATE},
Index: src/backend/parser/parser.c
===================================================================
RCS file: /home/neilc/private-cvsroot/pgsql-server/src/backend/parser/parser.c,v
retrieving revision 1.62
diff -c -r1.62 parser.c
*** src/backend/parser/parser.c	29 Aug 2004 04:12:42 -0000	1.62
--- src/backend/parser/parser.c	22 Sep 2004 08:41:48 -0000
***************
*** 98,104 ****
  			else
  				have_lookahead = true;
  			break;
! 
  		default:
  			break;
  	}
--- 98,112 ----
  			else
  				have_lookahead = true;
  			break;
! 		case WITH:
! 			/* Look ahead to see if we are doing WITH DATA or WITH NO DATA */
! 			lookahead_token = base_yylex();
! 			if(lookahead_token == DATA_P)
! 				cur_token = WITH_DATA;
! 			else if(lookahead_token == NO)
! 				cur_token = WITH_NO;
! 			else
! 				have_lookahead = true;
  		default:
  			break;
  	}
Index: src/test/regress/expected/select_into.out
===================================================================
RCS file: /home/neilc/private-cvsroot/pgsql-server/src/test/regress/expected/select_into.out,v
retrieving revision 1.5
diff -c -r1.5 select_into.out
*** src/test/regress/expected/select_into.out	4 Jun 2000 17:52:53 -0000	1.5
--- src/test/regress/expected/select_into.out	22 Sep 2004 08:41:49 -0000
***************
*** 11,13 ****
--- 11,45 ----
     FROM onek2
     WHERE onek2.unique1 < 2;
  DROP TABLE tmp1;
+ --
+ -- CREATE TABLE AS (tested here since it is closely related
+ -- to SELECT INTO)
+ --
+ CREATE TABLE with_data AS SELECT * FROM onek WITH DATA;
+ CREATE TABLE with_no_data AS SELECT * FROM onek WITH NO DATA;
+ SELECT COUNT(*) FROM with_data;
+  count 
+ -------
+   1000
+ (1 row)
+ 
+ SELECT COUNT(*) FROM with_no_data;
+  count 
+ -------
+      0
+ (1 row)
+ 
+ DROP TABLE with_data;
+ DROP TABLE with_no_data;
+ CREATE TABLE with_no_data AS
+     SELECT * FROM onek UNION
+     SELECT * FROM onek2
+     ORDER BY 1 LIMIT 100
+     WITH NO DATA;
+ SELECT COUNT(*) FROM with_no_data;
+  count 
+ -------
+      0
+ (1 row)
+ 
+ DROP TABLE with_no_data;
Index: src/test/regress/sql/select_into.sql
===================================================================
RCS file: /home/neilc/private-cvsroot/pgsql-server/src/test/regress/sql/select_into.sql,v
retrieving revision 1.4
diff -c -r1.4 select_into.sql
*** src/test/regress/sql/select_into.sql	4 Jun 2000 17:52:54 -0000	1.4
--- src/test/regress/sql/select_into.sql	22 Sep 2004 08:41:49 -0000
***************
*** 16,18 ****
--- 16,39 ----
  
  DROP TABLE tmp1;
  
+ --
+ -- CREATE TABLE AS (tested here since it is closely related
+ -- to SELECT INTO)
+ --
+ CREATE TABLE with_data AS SELECT * FROM onek WITH DATA;
+ CREATE TABLE with_no_data AS SELECT * FROM onek WITH NO DATA;
+ 
+ SELECT COUNT(*) FROM with_data;
+ SELECT COUNT(*) FROM with_no_data;
+ 
+ DROP TABLE with_data;
+ DROP TABLE with_no_data;
+ 
+ CREATE TABLE with_no_data AS
+     SELECT * FROM onek UNION
+     SELECT * FROM onek2
+     ORDER BY 1 LIMIT 100
+     WITH NO DATA;
+ 
+ SELECT COUNT(*) FROM with_no_data;
+ DROP TABLE with_no_data;
\ No newline at end of file
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Reply via email to