[
https://issues.apache.org/jira/browse/DERBY-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dag H. Wanvik updated DERBY-4909:
---------------------------------
Description:
Presently, Derby rejects column references in a SELECT .. GROUP BY if the
column reference isn't a grouping column. This was correct SQL as per SQL
1992, but later versions of the standard has loosened this to the present
wording (e.g. SQL 2003, section 7.12 <query specification>, SR 15:
"If T is a grouped table, then let G be the set of grouping columns of T. In
each <value expression> contained in <select list> , each column reference that
references a column of T shall reference some column C that is functionally
dependent on G or shall be contained in an aggregated argument of a <set
function specification> whose aggregation query is QS."
This can be useful in certain queries, cf. the example below culled from
http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html
SELECT f.film_id,
f.title,
COUNT(fa.actor_id)
FROM film f
LEFT JOIN film_actor fa
ON f.film_id = fa.film_id
GROUP BY f.film_id;
In this case, f.title is functionally dependent on f.film_id (primary key), so
the query is correct according to SQL 2003 and later.
Derby requires that f.title also be specified as a grouping column, which is
more verbose, but can also have performance implications (although I didn't
attempt to measure how this could impact Derby yet), at least if the functional
dependency analysis is not performed to eliminate the extra grouping column. Do
we do any such analysis?
Another example, a generated column would also be functionally dependent on the
columns used to compute it, cf. section 4.8.3 "Known functional dependencies in
a base table".
was:
Presently, Derby rejects column references in a SELECT .. GROUP BY if the
column reference isn't a grouping column. This was correct SQL as per SQL
1992, but later versions of the standard has loosened this to the present
wording (e.g. SQL 2003, section 7.12 <query specification>, SR 15:
"If T is a grouped table, then let G be the set of grouping columns of T. In
each <value expression> contained in <select list> , each column reference that
references a column of T shall reference some column C that is functionally
dependent on G or shall be contained in an aggregated argument of a <set
function specification> whose aggregation query is QS."
This can be useful in certain queries, cf. the example below culled from
http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html
SELECT f.film_id,
f.title,
COUNT(fa.actor_id)
FROM film f
LEFT JOIN film_actor fa
ON f.film_id = fa.film_id
GROUP BY f.film_id;
In this case, f.title is functionally dependent on f.film_id (primary key), so
the query is correct according to SQL 2003 and later.
Derby requires that f.title also be specified as a grouping column, which is
more verbose, but can also have performance implications (although I didn't
attempt to measure how this could impact Derby yet), at least if the functional
dependency analysis is not performed to eliminate the extra grouping column. Do
we do any such analysis?
Another example, a generated column would also be functionally dependent on the
columns used to compute it.
> Allow SELECT list column references not mentioned in GROUP BY if CR is
> functionally dependent
> ---------------------------------------------------------------------------------------------
>
> Key: DERBY-4909
> URL: https://issues.apache.org/jira/browse/DERBY-4909
> Project: Derby
> Issue Type: Improvement
> Reporter: Dag H. Wanvik
>
> Presently, Derby rejects column references in a SELECT .. GROUP BY if the
> column reference isn't a grouping column. This was correct SQL as per SQL
> 1992, but later versions of the standard has loosened this to the present
> wording (e.g. SQL 2003, section 7.12 <query specification>, SR 15:
> "If T is a grouped table, then let G be the set of grouping columns of T. In
> each <value expression> contained in <select list> , each column reference
> that references a column of T shall reference some column C that is
> functionally dependent on G or shall be contained in an aggregated argument
> of a <set function specification> whose aggregation query is QS."
> This can be useful in certain queries, cf. the example below culled from
> http://rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html
> SELECT f.film_id,
> f.title,
> COUNT(fa.actor_id)
> FROM film f
> LEFT JOIN film_actor fa
> ON f.film_id = fa.film_id
> GROUP BY f.film_id;
> In this case, f.title is functionally dependent on f.film_id (primary key),
> so the query is correct according to SQL 2003 and later.
> Derby requires that f.title also be specified as a grouping column, which is
> more verbose, but can also have performance implications (although I didn't
> attempt to measure how this could impact Derby yet), at least if the
> functional dependency analysis is not performed to eliminate the extra
> grouping column. Do we do any such analysis?
> Another example, a generated column would also be functionally dependent on
> the columns used to compute it, cf. section 4.8.3 "Known functional
> dependencies in a base table".
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.