On 2016/06/20 5:37 PM, Harmen de Jong - CoachR Group B.V. wrote:
I have a query with a subquery (using an aggregate function) in the SELECT 
statement. Will this subquery be executed before or after the GROUP BY?
For example:
SELECT a.id, (SELECT max(orderno) FROM b WHERE b.c_id=c.id) as maxorderno
FROM a INNER JOIN c on c.a_id=a.id
GROUP BY a.id

This is a tricky one to answer due to a peculiarity in SQLite in that it allows you to not put all aggregates in an aggregated query (GROUP BY) in aggregate functions. The SQL standard prescribes different and you will find some other DB engines adhere to the rule where every field/value NOT explicitly in the GROUP BY clause MUST be contained in its own aggregate function.

The short version is: For your query to be correct semantically and in result, it has to specify both MAX() for the inner query and the maxorderno resulting value, or join the b table too.

The long version with example data and queries:

In this next script, 3 tables are made with schemata that will resemble in principle what you might typically have in your tables (or at least what would make a query like that a sensible endeavour). Note that only the very last 2 queries are fully correct versions of your query, even though some others yield the correct results, some of them have results that show why it is dangerous to not follow the every-value-in-its-own-aggregate rule. I hope it's self-explanatory, but please ask if anything is unclear or you have any other questions.


-- SQLite version 3.9.2 [ Release: 2015-11-02 ] on SQLitespeed version 2.0.2.4.

  -- Script Items: 10         Parameter Count: 0
-- 2016-06-20 21:08:42.768 | [Info] Script Initialized, Started executing... -- ================================================================================================

CREATE TABLE a(
  id INTEGER PRIMARY KEY,
  supplier TEXT
);

CREATE TABLE b(
  id INTEGER PRIMARY KEY,
  c_id INT,
  orderno TEXT
);

CREATE TABLE c(
  id INTEGER PRIMARY KEY,
  a_id INT,
  payref TEXT
);

INSERT INTO a VALUES
(1, 'Wallcart'),
(2, 'Pro-Shop'),
(3, 'Pizza-for-All');

INSERT INTO b VALUES
(1, 1, 'ORD001'), (2, 1, 'ORD002'), (3, 1, 'ORD003'),
(4, 2, 'ORD004'), (5, 2, 'ORD005'),
(6, 3, 'ORD006'), (7, 3, 'ORD007'), (8, 3, 'ORD008'),
(9, 4, 'ORD009');

INSERT INTO c VALUES
(1, 1, 'P001'), (2, 1, 'P002'),
(3, 2, 'P003'), (4, 2, 'P004'),
(5, 3, 'P005');

SELECT a.id, (SELECT max(orderno) FROM b WHERE b.c_id=c.id) as maxorderno
  FROM a
  INNER JOIN c on c.a_id = a.id
 GROUP BY a.id;


  --              | maxorder-   --- Yields correct results, but unsafely.
  --      id      | no
  -- ------------ | ---------
  --       1      | ORD005
  --       2      | ORD009
  --       3      | NULL

SELECT a.id, MAX((SELECT orderno FROM b WHERE b.c_id=c.id)) as maxorderno
  FROM a
  INNER JOIN c on c.a_id = a.id
 GROUP BY a.id;


-- | maxorder- --- These results are wrong because MAX is operating -- id | no --- only on one inner set which happened to be picked
  -- ------------ | ---------   --- last - a Random function at best.
  --       1      | ORD004
  --       2      | ORD009
  --       3      | NULL

SELECT a.id, (SELECT MAX(orderno) FROM b WHERE b.c_id=c.id) as maxorderno
  FROM a
  INNER JOIN c on c.a_id = a.id
 GROUP BY a.id, maxorderno;


  --     | maxorder-   --- By adding maxorderno tot he group, this shows
  --  id | no          --- the Query planner's thinking
  -- --- | ---------
  --  1  | ORD003
  --  1  | ORD005
  --  2  | ORD008
  --  2  | ORD009
  --  3  | NULL

SELECT a.id, MAX((SELECT MAX(orderno) FROM b WHERE b.c_id=c.id)) as maxorderno
  FROM a
  INNER JOIN c on c.a_id = a.id
 GROUP BY a.id;


-- | maxorder- --- This is the only fully correct version of your
  --      id      | no          --- original query
  -- ------------ | ---------
  --       1      | ORD005
  --       2      | ORD009
  --       3      | NULL


SELECT a.id, MAX(b.orderno) as maxorderno
  FROM a
  INNER JOIN c on c.a_id = a.id
  INNER JOIN b ON b.c_id = c.id
GROUP BY a.id;


-- | maxorder- --- This is what I think you might actually want, -- id | no --- it precludes NULLs and is sure to pick the
  -- ------------ | ---------   --- correct max when joined to b too.
  --       1      | ORD005
  --       2      | ORD009


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to