[ 
https://issues.apache.org/jira/browse/DERBY-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dag H. Wanvik updated DERBY-4909:
---------------------------------

    Description: 
Presently, Derby rejects column references in a SELECT .. GROUP BY if the 
column reference isn't a grouping column.  This was correct SQL as per SQL 
1992, but later versions of the standard has loosened this to the present 
wording (e.g. SQL 2003, section 7.12 <query specification>, SR 15:

 "If T is a grouped table, then let G be the set of grouping columns of T. In 
each <value expression> contained in <select list> , each column reference that 
references a column of T shall reference some column C that is functionally 
dependent on G or shall be contained in an aggregated argument of a <set 
function specification> whose aggregation query is QS."

This can be useful in certain queries, cf. the example below culled from 
http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html

SELECT    f.film_id,
          f.title,
          COUNT(fa.actor_id)
FROM      film        f
LEFT JOIN film_actor  fa
ON        f.film_id = fa.film_id
GROUP BY  f.film_id;

In this case, f.title is functionally dependent on f.film_id (primary key), so 
the query is correct according to SQL 2003 and later.

Derby requires that f.title also be specified as a grouping column, which is 
more verbose, but can also have performance implications (although I didn't 
attempt to measure how this could impact Derby yet), at least if the functional 
dependency analysis is not performed to eliminate the extra grouping column. Do 
we do any such analysis?

Another example, a generated column would also be functionally dependent on the 
columns used to compute it, cf. section 4.8.3 "Known functional dependencies in 
a base table".


  was:
Presently, Derby rejects column references in a SELECT .. GROUP BY if the 
column reference isn't a grouping column.  This was correct SQL as per SQL 
1992, but later versions of the standard has loosened this to the present 
wording (e.g. SQL 2003, section 7.12 <query specification>, SR 15:

 "If T is a grouped table, then let G be the set of grouping columns of T. In 
each <value expression> contained in <select list> , each column reference that 
references a column of T shall reference some column C that is functionally 
dependent on G or shall be contained in an aggregated argument of a <set 
function specification> whose aggregation query is QS."

This can be useful in certain queries, cf. the example below culled from 
http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html

SELECT    f.film_id,
          f.title,
          COUNT(fa.actor_id)
FROM      film        f
LEFT JOIN film_actor  fa
ON        f.film_id = fa.film_id
GROUP BY  f.film_id;

In this case, f.title is functionally dependent on f.film_id (primary key), so 
the query is correct according to SQL 2003 and later.

Derby requires that f.title also be specified as a grouping column, which is 
more verbose, but can also have performance implications (although I didn't 
attempt to measure how this could impact Derby yet), at least if the functional 
dependency analysis is not performed to eliminate the extra grouping column. Do 
we do any such analysis?

Another example, a generated column would also be functionally dependent on the 
columns used to compute it.



> Allow SELECT list column references not mentioned in GROUP BY if CR is 
> functionally dependent
> ---------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4909
>                 URL: https://issues.apache.org/jira/browse/DERBY-4909
>             Project: Derby
>          Issue Type: Improvement
>            Reporter: Dag H. Wanvik
>
> Presently, Derby rejects column references in a SELECT .. GROUP BY if the 
> column reference isn't a grouping column.  This was correct SQL as per SQL 
> 1992, but later versions of the standard has loosened this to the present 
> wording (e.g. SQL 2003, section 7.12 <query specification>, SR 15:
>  "If T is a grouped table, then let G be the set of grouping columns of T. In 
> each <value expression> contained in <select list> , each column reference 
> that references a column of T shall reference some column C that is 
> functionally dependent on G or shall be contained in an aggregated argument 
> of a <set function specification> whose aggregation query is QS."
> This can be useful in certain queries, cf. the example below culled from 
> http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html
> SELECT    f.film_id,
>           f.title,
>           COUNT(fa.actor_id)
> FROM      film        f
> LEFT JOIN film_actor  fa
> ON        f.film_id = fa.film_id
> GROUP BY  f.film_id;
> In this case, f.title is functionally dependent on f.film_id (primary key), 
> so the query is correct according to SQL 2003 and later.
> Derby requires that f.title also be specified as a grouping column, which is 
> more verbose, but can also have performance implications (although I didn't 
> attempt to measure how this could impact Derby yet), at least if the 
> functional dependency analysis is not performed to eliminate the extra 
> grouping column. Do we do any such analysis?
> Another example, a generated column would also be functionally dependent on 
> the columns used to compute it, cf. section 4.8.3 "Known functional 
> dependencies in a base table".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to