At 11:41 PM -0800 1/14/08, Joe Wilson wrote:
In sqlite, assuming there's at least one row, an aggregate SELECT
with no GROUP BY clause is conceptually the same as an equivalent
SELECT with GROUP BY NULL - i.e., the group of all rows.
(I say 'conceptually' because GROUP BY NULL is much slower).

I actually thought of raising this issue too, but then thought it would complicate the discussion.

One could conceive a SQL SELECT, if it has no explicit GROUP BY but has explicit aggregate functions in the select list, as if it had an explicit GROUP BY but an empty column list, that is, a group per distinct source sub-rows of zero columns rather than per distinct source sub-rows of 1..M columns, and so a source rowset of 1..N rows would turn into a result rowset of exactly 1 row.

However, unless I'm mistaken about SQL behaviour, I see this analogy not holding true when there are zero source rows.

Normal SQL will return exactly 1 row when using aggregate functions and no GROUP BY clause, which is actually good when using things like COUNT or SUM.

However, any GROUP BY, whether over zero columns or 1..N columns, would return zero rows if there were zero input rows. That is the only way it can work if its behaviour is intended to be consistent.

Of course, that's not to say that there is any overall logical inconsistency, IF you consider that the native environment for aggregate functions is NOT with a GROUP BY.

So, use an aggregate on any rowset of 0..N rows, you get 1 row back.

If you conceive GROUP BY as actually just creating a table some of whose row field values are themselves tables (the columns being grouped by are outside of the inner tables, those not being grouped by are inside them), then using aggregate functions together with a GROUP BY is treating each inner table like the only table as far as the aggregates are concerned, and so applying the aggregates to inner tables to convert them to inner tables of one row each, then typically each of those is merged with its containing single outer row again.

On that note, a group-by of zero columns would then produce a table having a single row and single field whose value is the original table.

Now smarter relational DBMSs that support table-valued-fields could then let you use a GROUP BY in isolation, since if you keep any fields not being grouped by, they form rows of inner tables. Less capable DBMSs don't let you directly use the actual result of a relational group, and require you to do the additional step of either discarding non-grouped-by columns or using aggregates on them.

I don't know if SQL has provisions for a relational operator that results in the intermediate value I mentioned (table of tables), but even if it doesn't, a truly relational DBMS would have it.

-- Darren Duncan

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to