[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?
All of the responses I've seen to this question so far seem logically wrong, or at the very least are different than I would do it, and my proposal is one that should work reliably on any DBMS. You use a subquery in the FROM clause. select currency, price from ( select currency, day, max(time) as time from prices group by currency, day ) as filter inner join prices using (currency, day, time) The issue here is you want to return other details, the price, associated with the latest time per currency-day, and you can't do that in SQL without having a select query nested in another one; the inner determines the latest time per currency-day and the outer one looks up other info related to it. The above example should also perform very efficiently, besides being reliably correct rather than just accidentally correct. -- Darren Duncan
[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?
On Tue, 29 Sep 2015 15:05:42 +0800 Rowan Worth wrote: > Imagine a DB storing a history of currency exchange rate info. > > CREATE TABLE Prices ( > id INTEGER PRIMARY KEY, > day INTEGER, > currency TEXT, > price FLOAT); Better for your purpose would be: CREATE TABLE Prices ( day INTEGER not NULL, currency TEXT not NULL, price FLOAT not NULL, PRIMARY KEY(day, currency) ); You'll note that your queries don't use the id column, and nothing in your table definition ensures there's only one price per currency per day. If it were me, I'd make day TEXT and add a constraint that requires the month portion be between 1 and 12. > if you want to know the latest prices, is this legal? > > SELECT currency, price FROM Prices GROUP BY currency HAVING time = > MAX(time); What you want is called existential quantification. The standard SQL for it is a correlated subquery, viz: SELECT * FROM Prices as p where exists ( select 1 from Prices where currency = p.currency GROUP BY currency HAVING p.time = MAX(time) ); That version might be slower in SQLite than LIMIT 1, although it shouldn't be. You'll have to test it. The advantage to you is that it's standard SQL. It will work on any SQL DBMS, and will help you think about the issue in terms of sets instead of having to rely on a crutch like LIMIT. --jkl
[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?
Hi guys, Imagine a DB storing a history of currency exchange rate info. CREATE TABLE Prices ( id INTEGER PRIMARY KEY, day INTEGER, currency TEXT, price FLOAT); Assume 'day' increases monotonically and there is at most one price recorded per currency per day - if you want to know the latest prices, is this legal? SELECT currency, price FROM Prices GROUP BY currency HAVING time = MAX(time); I've seen this approach used and my reading of https://www.sqlite.org/lang_select.html#resultset hasn't left me confident that it is correct SQL, but in practice it seems to work fine... eg: insert into Prices (day, currency, price) values (1, 'EUR', 1.86); insert into Prices (day, currency, price) values (1, 'USD', 1.20); insert into Prices (day, currency, price) values (2, 'EUR', 1.92); insert into Prices (day, currency, price) values (2, 'USD', 1.18); insert into Prices (day, currency, price) values (3, 'USD', 1.24); /* price for EUR on day 3 intentionally left blank */ sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day = MAX(day); EUR|1.92 USD|1.24 MIN also seems to work: sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day = MIN(day); EUR|1.86 USD|1.2 As an aside, the query definitely doesn't work to get prices for a specific day: sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day = 2; EUR|1.92 I think the following explains the last result: > If a HAVING clause is a non-aggregate expression, it is evaluated with > respect to an arbitrarily selected row from the group. And the arbitrary row for EUR just happened to be day=2 here. But the documentation doesn't mention non-aggregate queries with a group by clause... > If the HAVING clause is an aggregate expression, it is evaluated across all > rows in the group. I'm not sure how to interpret this in the context of the "day = MAX(day)" expression though - does "evaluated across all rows" mean that rows excluded by the clause are discarded? If not then I suspect this query is not legit... but I don't feel I've understood everything so any clarification is appreciated! Cheers, -Rowan
[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?
On Tue, Sep 29, 2015 at 6:16 AM, Simon Slavin wrote: > I don't like using sub-SELECT and I would love to see another reader > rephrase this using 'WITH' or a VIEW. > with sub as (select currency, price, max(day) from prices group by currency) select currency, price from sub;
[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?
On 29 Sep 2015, at 8:05am, Rowan Worth wrote: > CREATE TABLE Prices ( >id INTEGER PRIMARY KEY, >day INTEGER, >currency TEXT, >price FLOAT); > > Assume 'day' increases monotonically and there is at most one price > recorded per currency per day - if you want to know the latest prices, is > this legal? > > SELECT currency, price FROM Prices GROUP BY currency HAVING time = > MAX(time); I see others have already told you not to use this, but I'd like to work towards a better command. To find the latest price for any one currency do this: SELECT currency,price FROM Prices WHERE currency='EUR' ORDER BY day DESC LIMIT 1; This finds all the rows for the right currency, then sorts them so that the latest is at the top, then returns just the top row. If SQL has an index like the following: CREATE INDEX Prices_cd ON Prices (currency, day); then it will use this index to jump straight to the correct row to return, and it will return the result very quickly. Even better, this index will speed up the other SELECT involved in this. So to use the above command with every currency you need another SELECT which will pick out each currency you have data for. You can do that like this: SELECT DISTINCT currency FROM Prices; So you should be able to combine the two: SELECT currency AS thisCCy,price FROM Prices WHERE price = (SELECT price FROM Prices WHERE currency = thisCCy ORDER BY day DESC LIMIT 1); or, because we have an index which will make this faster, SELECT currency AS thisCCy,price FROM Prices WHERE day = (SELECT day FROM Prices WHERE currency = thisCCy ORDER BY day DESC LIMIT 1); I don't like using sub-SELECT and I would love to see another reader rephrase this using 'WITH' or a VIEW. As an incidental note, I advise you to store the time-of-day of each price in your Prices file even if you store only one price per day. Any auditor of your system will find their job easier if they can easily figure out which price (from those throughout the day) you decided to use. It may not make things easier in the day-to-day running of the system but it will make things easier if it's ever audited, or if you every have to discuss the data with a customer.
[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?
Rowan Worth wrote: > if you want to know the latest prices, is this legal? > > SELECT currency, price FROM Prices GROUP BY currency HAVING time = MAX(time); It is not legal ANSI SQL, and most other databases will complain. While the WHERE clause allows to filter out rows from the table, the HAVING clause is evaluated _after_ the groups have already been created, and allows to filter out entire groups (i.e., output rows). In this particular query, "HAVING time = MAX(time)" throws away any group where the "time" value does not happen to be the maximum value in the group. What makes this query actually work is that SQLite has a (somewhat undocumented) extension where using a single MAX() or MIN() determines the row in a group where unaggregated expressions come from (instead of using some random row). But this is just a side effect of the expression in the HAVING clause; using HAVING like this is an extremely misleading way to write this query. Better write the query like this: SELECT currency, price, MAX(time) FROM Prices GROUP BY currency; (Same extension, so this is not legal ANSI SQL either.) Neither your original query nor this query work if there is more than one MIN()/MAX(), or if some other aggregation function is used. Regards, Clemens
[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?
The main difference between HAVING and WHERE ist that WHERE operates on the input set and HAVING operates on the output set. If your condition requires computing an aggregate, then HAVING is a viable method of avoiding a subquery that needs to repeatedly scan the input table: SELECT currency,price FROM Prices p WHERE day = (SELECT max(day) from Prices where currency=p.currency); explain query plan SELECT currency,price FROM Prices p WHERE day = (SELECT max(day) from Prices where currency=p.currency); sele order from deta - 0 0 0 SCAN TABLE Prices AS p (~50 rows) 0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1 1 0 0 SEARCH TABLE Prices USING AUTOMATIC COVERING INDEX (currency=?) (~1 rows) explain query plan SELECT currency, price FROM Prices GROUP BY currency HAVING day = max(day); sele order from deta - 0 0 0 SCAN TABLE Prices (~100 rows) 0 0 0 USE TEMP B-TREE FOR GROUP BY If your condition does not require computing an aggregate, just use WHERE. SELECT currency,price from Prices WHERE day=2; To find the latest price on or before day 3: SELECT currency,price from Prices WHERE day <= 3 GROUP by currency HAVING day = max(day); explain query plan SELECT currency,price from Prices WHERE day <= 3 GROUP by currency HAVING day = max(day); sele order from deta - 0 0 0 SCAN TABLE Prices (~33 rows) 0 0 0 USE TEMP B-TREE FOR GROUP BY Note that this is without any defined indexes. Plans may change depending on indexes added, and the cardinality of constrained fields. -Urspr?ngliche Nachricht- Von: Rowan Worth [mailto:rowanw at dugeo.com] Gesendet: Dienstag, 29. September 2015 09:06 An: General Discussion of SQLite Database Betreff: [sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause? Hi guys, Imagine a DB storing a history of currency exchange rate info. CREATE TABLE Prices ( id INTEGER PRIMARY KEY, day INTEGER, currency TEXT, price FLOAT); Assume 'day' increases monotonically and there is at most one price recorded per currency per day - if you want to know the latest prices, is this legal? SELECT currency, price FROM Prices GROUP BY currency HAVING time = MAX(time); I've seen this approach used and my reading of https://www.sqlite.org/lang_select.html#resultset hasn't left me confident that it is correct SQL, but in practice it seems to work fine... eg: insert into Prices (day, currency, price) values (1, 'EUR', 1.86); insert into Prices (day, currency, price) values (1, 'USD', 1.20); insert into Prices (day, currency, price) values (2, 'EUR', 1.92); insert into Prices (day, currency, price) values (2, 'USD', 1.18); insert into Prices (day, currency, price) values (3, 'USD', 1.24); /* price for EUR on day 3 intentionally left blank */ sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day sqlite> = MAX(day); EUR|1.92 USD|1.24 MIN also seems to work: sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day sqlite> = MIN(day); EUR|1.86 USD|1.2 As an aside, the query definitely doesn't work to get prices for a specific day: sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day sqlite> = 2; EUR|1.92 I think the following explains the last result: > If a HAVING clause is a non-aggregate expression, it is evaluated with > respect to an arbitrarily selected row from the group. And the arbitrary row for EUR just happened to be day=2 here. But the documentation doesn't mention non-aggregate queries with a group by clause... > If the HAVING clause is an aggregate expression, it is evaluated > across all > rows in the group. I'm not sure how to interpret this in the context of the "day = MAX(day)" expression though - does "evaluated across all rows" mean that rows excluded by the clause are discarded? If not then I suspect this query is not legit... but I don't feel I've understood everything so any clarification is appreciated! Cheers, -Rowan ___ sqlite-users mailing list sqlite-users at mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ Gunter Hick Software Engineer Scientific Games International GmbH FN 157284 a, HG Wien Klitschgasse 2-4, A-1130 Vienna, Austria Tel: +43 1 80100 0 E-Mail: hick at scigames.at This communication (including any attachments) is intended for the use of the intended recipient(s) only and may contain information that is confidential, privileged or legally protected. Any unauthorized use or dissemination of this communication is strictly prohibited. If you have received this communication in error, please immediately notify the sender by