Re: [HACKERS] SQL feature requests

Michael Glaesemann Wed, 22 Aug 2007 20:55:49 -0700


On Aug 22, 2007, at 20:36 , Ben Tilly wrote:

On 8/22/07, Michael Glaesemann <[EMAIL PROTECTED]> wrote:


On Aug 22, 2007, at 18:45 , Ben Tilly wrote:

1. Just a minor annoyance, but why must subqueries in FROM clauses
have an alias?


It's required by the SQL standard, AIUI. I wonder what EnterpriseDB
does?


I can well believe that the standard says that you must accept
subqueries with aliases.  But does it say you must reject subqueries
without aliases?  I strongly doubt that.

If I'm reading my draft copy of the SQL:2003 spec right (and there'sa good chance that I'm not, as it's not the easiest document for meto parse), aliases *are* required.


From 5WD-02-Foundation-2003-09 (the fifth working draft)

7.5 <from clause>

<from clause> ::= FROM <table reference list>
<table reference list> ::=
    <table reference> [ { <comma> <table reference> }... ]

7.6 <table reference>

<table reference> ::=
    <table factor>
  | <joined table>

<table factor> ::= <table primary> [ <sample clause> ]

<table primary> ::=
    <table or query name> [ [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ] ]
  | <derived table> [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ]
  | <lateral derived table> [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ]
  | <collection derived table> [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ]
  | <table function derived table> [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ]
  | <only spec> [ [ AS ] <correlation name>
    [ <left paren> <derived column list><right paren> ] ]
  | <left paren> <joined table> <right paren>

<derived table> ::= <table subquery>

<correlation name> I believe is what we commonly refer to as analias. I think <table or query name> is a table or view name, and<derived table> is a subquery. <correlation name> is optional for<table or query name> but not for <derived table>. The fact that thestandard clearly makes it optional in some cases and required inothers is pretty clear that <derived table> without a <correlationname> is to be rejected, don't you think?

I have no clue what EnterpriseDB does.

In case it wasn't clear, the reason I bring it up is thatEnterpriseDB, while working from a PostgreSQL base, strives forOracle compatibility.

that case, couldn't you use whatever method works in PostgreSQL in
Oracle? I haven't checked the SQL standard, but it seems unlikely
it'd allow something like

GROUP BY , , , ;

That's not what Oracle accepts that postgres does not. WhatOracle accepts is:


  ...
  GROUP BY 'foo';

Thanks for clarifying. If it wasn't obvious by now, you know I'm notfamiliar with Oracle :)

AIUI, Integers are only allowed because the SQL standard explicitly
allows you to refer to columns by the order they appear in the SELECT
list. Otherwise the GROUP BY items need to be column names.


Need to be?

The SQL-92 standard is clear that you must accept a list of column
names.  It is also clear that a column name must be be of the form
field or table.field.


The 2003 draft (same as above) seems to agree with the SQL92 standard:

7.9 <group by clause>

Format
<group by clause> ::=
    GROUPBY [ <set quantifier> ] <grouping element list>
<grouping element list> ::=
    <grouping element> [ { <comma> <grouping element> }... ]
<grouping element> ::=
    <ordinary grouping set>
  | <rollup list>
  | <cube list>
  | <grouping sets specification>
  | <empty grouping set>
<ordinary grouping set> ::=
    <grouping column reference>
  | <left paren> <grouping column reference list><right paren>
<grouping column reference> ::=
    <column reference> [ <collate clause> ]
<grouping column reference list> ::=

<grouping column reference> [ { <comma><grouping columnreference> }... ]

<rollup list> ::=
    ROLLUP<left paren> <ordinary grouping set list><right paren>
<ordinary grouping set list> ::=
    <ordinary grouping set> [ { <comma> <ordinary grouping set> }... ]
<cube list> ::=
    CUBE<left paren> <ordinary grouping set list><right paren>
<grouping sets specification> ::=
    GROUPINGSETS <left paren> <grouping set list><right paren>
<grouping set list> ::=
    <grouping set> [ { <comma> <grouping set> }... ]
<grouping set> ::=
    <ordinary grouping set>
  | <rollup list>
  | <cube list>
  | <grouping sets specification>
  | <empty grouping set>

<empty grouping set> ::= <left paren><right paren>

6.7 <column reference>

Format
<column reference> ::=
    <basic identifier chain>
  | MODULE<period> <qualified identifier><period> <column name>

There'd have to be a pretty strong reason to extend this, more thanjust a convenience, I should think.

  In no way, shape or form does that allow having
terms like trim(foo.bar) in a group by.
But every reasonable database that I know - including postgres -allows that.

Can you give an example of something like this working in PostgreSQL?I get an error when I try to use a text value in a GROUP BY clause.(Or are you referring specifically to the CASE expression corner case?)

test=# select record_id as foo, count(observation_id) as bar fromobservation group by trim(observation.record_id::text);ERROR: column "observation.record_id" must appear in the GROUP BYclause or be used in an aggregate functiontest=# select record_id as foo, count(observation_id) as bar fromobservation group by trim(observation.record_id);ERROR: column "observation.record_id" must appear in the GROUP BYclause or be used in an aggregate function

Postgres explicitly disallows a constant character expression.  But it
allows the constant case expression that I gave.  It would be nice for
me to not have to remember that very obscure and convoluted case.

I agree, and would move that it should be disallowed if there isn't areason for it to be maintained, for exactly the reason you give:there shouldn't be such convoluted, and obscure corner case.

test=# select record_id as foo, count(observation_id) as bar fromobservation group by record_id, true;

ERROR:  non-integer constant in GROUP BY

test=# select record_id as foo, count(observation_id) as bar fromobservation group by record_id, case when true then true end;

What I'm trying to say is that it would be convenient for me to be
able to write:

  select bar as "baz"
    , count(*) as "some count"
  from foo
  group by bar as "baz"
That's not allowed right now because as is not allowed in a groupby statement.


Ah, I see. Thanks for clarifying.

Which is to assume that
a query without a group by clause, but with an aggregate function in

the select, should have an implicit group by clause where yougroup by

all non-aggregate functions in the select.

For example

  SELECT foo, count(*)
  FROM bar

would be processed as:

  SELECT foo, count(*)
  FROM bar
  GROUP BY foo


It's been discussed before. I don't believe it's been rejected out of
hand (though you can check the archives), just that no one's gotten
around to it. (Don't know what the SQL-spec says on this point.)


I don't know what the SQL spec says, but I know (having talked to
other developers) that many people would find it very nice.

Since I had the spec open, I tried to look at this as well, though Imust admit I found it very tough going.


I think this is the key section:

10.9 <aggregate function>

...

Syntax Rules

...

4) The argument source of an <aggregate function> is
Case:
a) If AF is immediately contained in a <set functionspecification>, then a table or group of a grouped table asspecified in Subclause 7.10, “<having clause>”, and Subclause 7.12,“<query specification>”.
b) Otherwise, the collection of rows in the current row's windowframe defined by the window structure descriptor identified by the<window function> that simply contains AF, as defined in Subclause7.11, “<window clause>”.

Now the <query specification> stuff is pretty dense, and the <windowclause> is currently impenetrable for me, so I just looked at the<having clause> and <query specification> sections referenced above.

7.10 <having clause>

...

Syntax Rules
1) Let HC be the <having clause>. Let TE be the <table expression>that immediately contains HC. If TE does not immediately contain a<group by clause>, then “GROUPBY ()” is implicit. Let T be thedescriptor of the table defined by the <group by clause>GBCimmediately contained in TE and let R be the result of GBC.

So it looks like if there's no explicit GROUP BY, we're to assumeGROUP BY (), at least in the case of HAVING.

7.12 <query specification>

...

Syntax Rules

...
14) If <table expression> does not immediately contain a <group byclause> and <table expression> is simply contained in a <queryexpression> that is the aggregation query of some <set functionspecification>, then GROUP BY () is implicit.

Again, assume GROUP BY () if no explicit GROUP BY with a <setfunction specification>, which I understand to be some form ofaggregation.

15) If T is a grouped table, then let G be the set of groupingcolumns of T. In each <value expression> contained in <selectlist> , each column reference that references a column of T shallreference some column C that is functionally dependent on G orshall be contained in an aggregated argument of a <set functionspecification> whose aggregation query is QS.


And a definition of grouped table is found:

4.14.2 Types of tables

...
A grouped table is a set of groups derived during the evaluation ofa <group by clause>. A group G is a collection of rows in which,for every grouping column GC, if the value of GC in some row is notdistinct from GV, then the value of GC in every row is GV;moreover, if R1 is a row in group G1 of grouped table GT and R2 isa row in GT such that for every grouping column GC the value of GCin R1 is not distinct from the value of GC in R2, then R2 is in G1.Every row in GT is in exactly one group. A group may be consideredas a table. Set functions operate on groups.

So we're already talking about an explicit GROUP BY clause, and itdoesn't seem to leave much wiggle room grouping columns that aren'texplicitly listed. I'll readily admit I very easily could be readingthis wrong or have missed a section of the spec that discusses thisin more detail. (And that goes for any of my interpretations of thespec!)

One place you and I differ is that I think you'd be in favor of manymore extensions in cases where the SQL spec doesn't explicitly say"Thou shalt not do X". I'm a bit more conservative here: in myopinion, if the SQL speaks to a particular issue (for example, aWHERE clause), then you should pretty much keep to the letter of whatit says. Extensions are more for areas where the standard is silent.For example, full text search will be added to PostgreSQL in 8.3.AFAIK, the SQL spec doesn't talk about full text search at all, sothis is an area where extensions are a natural fit. Others, such asPostgreSQL's DISTINCT ON, is a much more difficult issue, and I'dsuspect if it were proposed as a new feature now it would berejected. However, it's been a part of PostgreSQL for a long time, sofor historical reasons it remains. Likewise, I doubt "text" as asynonym for "varchar" would be accepted.

Anyway, enough excursions into SpecLand for me. I'd welcome anyoneelse taking a crack at it.


Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Re: [HACKERS] SQL feature requests

Reply via email to