Re: [HACKERS] [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords

2007-06-26 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 For the moment, lie about WITH's status in the table so it will still get
 quoted --- this is because of the expectation that WITH will become reserved
 when the SQL recursive-queries patch gets done.

Out of curiosity I'm checking what consequences some other future grammer
changes might have for us. Today I checked out full spec compliant GROUP BY
syntax including ROLLUP, CUBE, and GROUPING SETS.

There are two conclusions of note:

1) ROLLUP and CUBE would have to be col_name_keyword keywords. 

   That could be an annoyance for the cube contrib package because it defines
   a few constructors like cube(float8[]). You could still have a type named
   cube but the functions would have to be renamed. Personally I always
   found cube a strange name anyways, I think of this data type more as a
   n-space vector than a cube anyways.

   quote_identifier would start quoting cube and rollup everywhere. My
   first inclination was that it's probably not necessary to start
   preemptively quoting them this release because people are more likely to
   use them as column names than function names anyways. But perhaps that's
   not true given the contrib module.

2) Assuming we keep our extension of allowing arbitrary expressions in GROUP
   BY lists then there is a conflict between our undecorated row constructor
   '(' expr_list ')' and the spec which allows parenthesized sublists in the
   grouping list.

   I'm not sure this is a real problem though. As near as I can tell the
   semantics of grouping by a ROW(a,b) and grouping by columns (a,b) as a
   grouping set element are basically the same anyways. So I think we can just
   accept any arbitrary expression including row constructors as what the spec
   calls an ordinary grouping set.




For what it's worth here's the grammar I get by basically copying the grammar
straight out of the spec and then cleaning up the conflicts including making
ordinary_grouping_set a straight expr_list as described above:

opt_group_set_clause:
DISTINCT
| ALL
| /*EMPTY*/
;

group_clause:
GROUP_P BY opt_group_set_clause grouping_element_list
| /*EMPTY*/
;

grouping_element_list:
grouping_element
| grouping_element_list ',' grouping_element
;

grouping_element:
a_expr
| rollup_list
| cube_list
| grouping_sets_specification
| empty_grouping_set
;

rollup_list:
ROLLUP '(' expr_list ')'
;

cube_list:
CUBE '(' expr_list ')'
;

grouping_sets_specification:
GROUPING SETS '(' grouping_set_list ')'
;

grouping_set_list:
grouping_set
| grouping_set_list ',' grouping_set
;

grouping_set:
a_expr
| rollup_list
| cube_list
| grouping_sets_specification
| empty_grouping_set
;

empty_grouping_set: '(' ')'
;

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords

2007-06-26 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes:

 Tom Lane [EMAIL PROTECTED] writes:

 For the moment, lie about WITH's status in the table so it will still get
 quoted --- this is because of the expectation that WITH will become reserved
 when the SQL recursive-queries patch gets done.

 Out of curiosity I'm checking what consequences some other future grammer
 changes might have for us.

And checking OLAP window functions it at first glance at least it seems we
would have to make ROWS, WINDOW, RANGE, and PARTITION reserved keywords.

Again, I just sucked in the spec's grammar and fixed it up to fit in our
grammar. I did make PARTITION BY and ORDER BY take arbitrary expressions as is
our general approach.

It's possible there may be clever ways around these conflicts, in particular
it seems like it might be possible to make PARTITION not a keyword given the
nearby BY. However it appears in parentheses which looks like it makes things
a bit tricky.

Here's the relevant grammar snippet, and I attached the diff for gram.y to
parse both window clause and rollup/cube/grouping sets.

window_clause: 
WINDOW window_definition_list
| /* EMPTY */
;

window_definition_list:
window_definition
| window_definition ',' window_definition
;

window_definition: 
ColId AS window_specification
;

window_specification:
'(' window_specification_details ')'
;

window_specification_details:
opt_window_name
window_partition_clause
window_order_clause
window_frame_clause
;

opt_window_name: 
ColId 
| /*EMPTY*/ ;
;

window_partition_clause:
PARTITION BY expr_list
| /* EMPTY */
;

window_order_clause:
ORDER BY sortby_list
| /* EMPTY */
;

window_frame_clause:
window_frame_units window_frame_extent opt_window_frame_exclusion
| /* EMPTY */
;

window_frame_units:
ROWS
| RANGE
;

window_frame_extent:
window_frame_start
| window_frame_between
;

window_frame_start:
UNBOUNDED PRECEDING
| window_frame_preceding
| CURRENT_P ROW
;

window_frame_preceding: 
Iconst PRECEDING
;

window_frame_between: BETWEEN window_frame_bound AND window_frame_bound
;

window_frame_bound:
window_frame_start
| UNBOUNDED FOLLOWING
| window_frame_following
;

window_frame_following: 
Iconst FOLLOWING
;

opt_window_frame_exclusion:
/*EMPTY*/
| EXCLUDE CURRENT_P ROW
| EXCLUDE GROUP_P
| EXCLUDE TIES
| EXCLUDE NO OTHERS
;

Index: gram.y
===
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.596
diff -u -r2.596 gram.y
--- gram.y	23 Jun 2007 22:12:51 -	2.596
+++ gram.y	26 Jun 2007 13:38:25 -
@@ -248,7 +248,7 @@
 TableFuncElementList opt_type_modifiers
 prep_type_clause
 execute_param_clause using_clause returning_clause
-enum_val_list
+enum_val_list 
 
 %type range	OptTempTableName
 %type into	into_clause create_as_target
@@ -298,7 +298,7 @@
 %type defelt	def_elem old_aggr_elem
 %type node	def_arg columnElem where_clause where_or_current_clause
 a_expr b_expr c_expr func_expr AexprConst indirection_el
-columnref in_expr having_clause func_table array_expr
+columnref in_expr having_clause func_table array_expr 
 %type list	row type_list array_expr_list
 %type node	case_expr case_arg when_clause case_default
 %type list	when_clause_list
@@ -451,6 +451,10 @@
 
 	ZONE
 
+	CUBE ROLLUP GROUPING SETS
+
+PRECEDING WINDOW FOLLOWING RANGE OTHERS UNBOUNDED TIES PARTITION EXCLUDE
+
 /* The grammar thinks these are keywords, but they are not in the keywords.c
  * list and so can never be entered directly.  The filter in parser.c
  * creates these tokens when required.
@@ -5989,7 +5993,7 @@
 simple_select:
 			SELECT opt_distinct target_list
 			into_clause from_clause where_clause
-			group_clause having_clause
+			group_clause having_clause window_clause
 {
 	SelectStmt *n = makeNode(SelectStmt);
 	n-distinctClause = $2;
@@ -6169,16 +6173,149 @@
 			a_expr	{ $$ = $1; }
 		;
 
+opt_group_set_clause:
+			DISTINCT
+			| ALL
+			| /*EMPTY*/
+		;
+
 group_clause:
-			GROUP_P BY expr_list	{ $$ = $3; }
+			GROUP_P BY opt_group_set_clause grouping_element_list { $$ = NIL; }
 			| /*EMPTY*/{ $$ = NIL; }
 		;
 
+grouping_element_list:
+	grouping_element
+	| grouping_element_list ',' grouping_element
+	;
+
+grouping_element:
+	a_expr
+	| rollup_list
+	| cube_list
+	| grouping_sets_specification
+	| empty_grouping_set
+	;
+
+rollup_list:
+	ROLLUP '(' expr_list ')'
+	;
+
+cube_list:
+	CUBE '(' expr_list ')'
+	;
+
+grouping_sets_specification:
+	GROUPING SETS '(' grouping_set_list ')'
+	;
+
+grouping_set_list:
+	

Re: [HACKERS] [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords

2007-06-26 Thread Gregory Stark
Gregory Stark [EMAIL PROTECTED] writes:

 Tom Lane [EMAIL PROTECTED] writes:

 For the moment, lie about WITH's status in the table so it will still get
 quoted --- this is because of the expectation that WITH will become reserved
 when the SQL recursive-queries patch gets done.

 Out of curiosity I'm checking what consequences some other future grammer
 changes might have for us. 

Oh, and I forgot OVER (assuming we want to actually use the windows, not just
define them). It looks like OVER and also PARTITION can get by as
type_func_name_keyword.

So that gives us:

col_name_keyword:
CUBE
ROLLUP

type_func_name_keyword:
OVER
PARTITION

reserved_keyword:
ROWS
WINDOW
RANGE

unreserved_keyword:
GROUPING
SETS
PRECEDING
FOLLOWING
OTHERS
UNBOUNDED
TIES
EXCLUDE


I'm thinking it may make sense to lie about all of these in quote_identifier
so that someone who upgrades from 8.2 to 8.3 can then upgrade to 8.4. If we
add reserved words in one step then there's no way to upgrade except to rename
things before dumping. Any comments?

Does anyone else have any pet features from the spec which introduce new
syntax they would like to see in Postgres soon?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords

2007-06-26 Thread Andrew Dunstan



Gregory Stark wrote:

I'm thinking it may make sense to lie about all of these in quote_identifier
so that someone who upgrades from 8.2 to 8.3 can then upgrade to 8.4. If we
add reserved words in one step then there's no way to upgrade except to rename
things before dumping. Any comments?


  


I'm confused. Won't pg_dump quote the keywords it knows its postgres 
version will need quoted? We can't expect it to have knowledge of future 
requirements for quoting, unless someone really does invent time travel 
(in which case someone could just go to the future and bring back 
version 107.6 and save us all the trouble).


cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords

2007-06-26 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 I'm confused. Won't pg_dump quote the keywords it knows its postgres 
 version will need quoted? We can't expect it to have knowledge of future 
 requirements for quoting, unless someone really does invent time travel 
 (in which case someone could just go to the future and bring back 
 version 107.6 and save us all the trouble).

Yeah.  I'm disinclined to pre-emptively quote things for pie-in-the-sky
patches.  WITH is already a grammar keyword, so it's not a big deal to
tweak things to quote it, but adding a dozen keywords that have zero
functionality in the grammar is another thing entirely.

Also, the fact that this particular form of the grammar requires
reserving the keywords does not prove that there is no way to have the
features without that.  I'd want to see us try a little harder first.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords

2007-06-26 Thread Gregory Stark
Andrew Dunstan [EMAIL PROTECTED] writes:

 Gregory Stark wrote:
 I'm thinking it may make sense to lie about all of these in quote_identifier
 so that someone who upgrades from 8.2 to 8.3 can then upgrade to 8.4. If we
 add reserved words in one step then there's no way to upgrade except to 
 rename
 things before dumping. Any comments?

 I'm confused. Won't pg_dump quote the keywords it knows its postgres version
 will need quoted? We can't expect it to have knowledge of future requirements
 for quoting

Well as far as standard syntax from the spec we can. But you're right, as
Heikki pointed out offline, the approved upgrade path is to use the pg_dump
from the target version to dump the old database.

It may still be worth telling one version of Postgres about anticipated
keywords prior to a release which adds them so that people who don't follow
instructions and try a dump generated with an old pg_dump have a fighting
chance. Besides, what's a person to do if they have a dump lying around?
Restore it in an old database so they can re-dump it with the new pg_dump?

That said, this would only bite someone who is using columns named cube or
rows *and* uses the old pg_dump or tries to restore a dump they already
have without re-dumping with the new pg_dump. It may just not be worth
worrying about.

  unless someone really does invent time travel (in which case someone could
 just go to the future and bring back version 107.6 and save us all the
 trouble).

That would revolutionize the IP law field

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords

2007-06-26 Thread Andrew Dunstan



Gregory Stark wrote:

It may still be worth telling one version of Postgres about anticipated
keywords prior to a release which adds them so that people who don't follow
instructions and try a dump generated with an old pg_dump have a fighting
chance. Besides, what's a person to do if they have a dump lying around?
Restore it in an old database so they can re-dump it with the new pg_dump?

  



That is in fact the only thing we fairly much guarantee to work.

cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords

2007-06-26 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Yeah.  I'm disinclined to pre-emptively quote things for pie-in-the-sky
 patches.  WITH is already a grammar keyword, so it's not a big deal to
 tweak things to quote it, but adding a dozen keywords that have zero
 functionality in the grammar is another thing entirely.

Sure.

 Also, the fact that this particular form of the grammar requires
 reserving the keywords does not prove that there is no way to have the
 features without that.  I'd want to see us try a little harder first.

At least some of them are unavoidable conflicts:

select a,b,count(*) from tab GROUP BY cube(a,b)
select a,b,count(*) from tab GROUP BY rollup(a,b)

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [COMMITTERS] pgsql: Arrange for quote_identifier() and pg_dump to not quote keywords

2007-06-26 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Also, the fact that this particular form of the grammar requires
 reserving the keywords does not prove that there is no way to have the
 features without that.  I'd want to see us try a little harder first.

 At least some of them are unavoidable conflicts:

 select a,b,count(*) from tab GROUP BY cube(a,b)
 select a,b,count(*) from tab GROUP BY rollup(a,b)

In principle this case could be handled by having parse analysis
treat a FuncCall node at the top level of GROUP BY specially when
the function name is CUBE etc.

I'm not saying that might not be uglier than having the grammar
recognize it, just pointing out that there's usually more than
one way to skin a cat.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate