I blogged about GROUP BY expression support across dbs a while back (http://tech.puredanger.com/2005/03/02/fun-with-expressions-in-a-group-b y/ and http://tech.puredanger.com/2005/04/01/update-on-expressions-in-group-by/ ). Pretty much all the major vendors support expressions in a GROUP BY. I implemented this functionality a while back for the MetaMatrix query engine and blogged about using MetaMatrix Query to "add" this functionality over Derby in http://devcentral.metamatrix.com/blog/alex/2006/02/28/Enhancing-Apache-D erby-with-MetaMatrix.
Positional parameters don't make as much sense for GROUP BY as they do for ORDER BY. Logically, you're executing the clauses in the order FROM-WHERE-GROUP BY-HAVING-SELECT-ORDER BY. The SELECT clause is the point at which the output columns of the query are effectively named and ordered. Positional parameters make sense in ORDER BY because they refer to the output columns of the previous phase, which are defined as part of the query, but may have no well-defined name to refer to them with. In the case of GROUP BY, you would be referring backwards from the SELECT clause to the GROUP BY clause, so that seems kind of goofy. The only reason to do this would be to avoid referencing a complex unnamed expression. This is, of course, exactly the sort of thing that db vendors bend the rules about to make SQL more usable. In fact, you can use GROUP BY positional parameters in MySQL and Postgres but not in any major commercial db that I've tried (Oracle, DB2, SQL Server, Sybase). Alex Miller Chief Architect MetaMatrix -----Original Message----- From: Satheesh Bandaram (JIRA) [mailto:[EMAIL PROTECTED] Sent: Wednesday, May 17, 2006 1:14 PM To: [email protected] Subject: [jira] Commented: (DERBY-883) Enhance GROUP BY clause to support expressions instead of just column references. [ http://issues.apache.org/jira/browse/DERBY-883?page=comments#action_1241 2227 ] Satheesh Bandaram commented on DERBY-883: ----------------------------------------- Thanks for making progress on this important new functionality, Manish. > 1. In terms of syntax, do we allow expressions in the group by list or positional parameters, or both? > > select tomonth(creationdt), toyear(creationdt), count(*) from bugs > group by 1, 2; I have seen positional parameters for ORDER BY expressions, not typically used in GROUP BY. Looking at both DB2 and Oracle documentation, it seems neither support positional parameters. > An implementation question on this note-- does the language code have > a way of looking at two expressions (ValueNode?) and checking to see > if they are equivalent? We'll need some way of doing this to match an > expression in the group by list to an expression in the select list right? Correct. Don't think there is any existing expression matching to compare two expressions. DB2 docs discuss how group by expressions are matched in SQL reference manual. (Page 484: ftp://ftp.software.ibm.com/ps/products/db2/info/vr82/pdf/en_US/db2s1e81. pdf) > 2. I assume that an expression in a group by list must appear in the > select list without aggregation right? Is this an error? > > select x+1, x+2, sum(y) > from test > group by x NO... This is a valid query. See the reference I provided above. > 3. What do we do with duplicates? i.e. > > select x+1, x+1, sum(y) > from test > group by x+1, x+1; > > Is this an error? The current implementation throws an error if the > same column occurs more than once in the group by list. I am not sure why Derby currently considers this an error... Looking at the code, it seems it may be looking for ambiguous column references (like 'x' being part of two different tables in from_list), which makes sense, but not sure why duplicate references should be prevented. > Is there a standard somewhere which I should consult before trying to nail down the functionality? Unfortunately, NO.... SQL 2003 seems to allow only column references in GROUP BY clause. But both DB2 and Oracle allow expressions in GROUP BY list and likely allowed by other database vendors too. You could use either DB2 or Oracle docs to understand how this functionality is defined there. Much easier to read these docs than confusing SQL 2003 spec. > Enhance GROUP BY clause to support expressions instead of just column references. > ---------------------------------------------------------------------- > ----------- > > Key: DERBY-883 > URL: http://issues.apache.org/jira/browse/DERBY-883 > Project: Derby > Type: New Feature > Components: SQL > Versions: 10.1.2.1 > Environment: JDK 1.5.0_05 > Reporter: Lluis Turro > Assignee: Manish Khettry > > This select would return an error syntax on finding "(" after month if group by clause: > select idissue, month(creation), year(creation), count(distinct > idissue) where > .... > group by idissue, month(creation), year(creation) -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
