[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?

2015-09-30 Thread Darren Duncan
All of the responses I've seen to this question so far seem logically wrong, or 
at the very least are different than I would do it, and my proposal is one that 
should work reliably on any DBMS.  You use a subquery in the FROM clause.

select currency, price
from (
   select currency, day, max(time) as time
   from prices
   group by currency, day
) as filter
   inner join prices using (currency, day, time)

The issue here is you want to return other details, the price, associated with 
the latest time per currency-day, and you can't do that in SQL without having a 
select query nested in another one; the inner determines the latest time per 
currency-day and the outer one looks up other info related to it.

The above example should also perform very efficiently, besides being reliably 
correct rather than just accidentally correct.

-- Darren Duncan



[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?

2015-09-29 Thread James K. Lowden
On Tue, 29 Sep 2015 15:05:42 +0800
Rowan Worth  wrote:

> Imagine a DB storing a history of currency exchange rate info.
> 
> CREATE TABLE Prices (
> id INTEGER PRIMARY KEY,
> day INTEGER,
> currency TEXT,
> price FLOAT);

Better for your purpose would be: 

CREATE TABLE Prices (
day INTEGER not NULL,
currency TEXT not NULL,
price FLOAT not NULL,
PRIMARY KEY(day, currency)
);

You'll note that your queries don't use the id column, and nothing in
your table definition ensures there's only one price per currency per
day.  

If it were me, I'd make day TEXT and add a constraint that requires the
month portion be between 1 and 12.

> if you want to know the latest prices, is this legal?
> 
> SELECT currency, price FROM Prices GROUP BY currency HAVING time =
> MAX(time);

What you want is called existential quantification.  The standard SQL
for it is a correlated subquery, viz:

SELECT *
FROM Prices as p 
where exists (
select 1 from Prices
where currency = p.currency
GROUP BY currency
HAVING p.time = MAX(time)
);

That version might be slower in SQLite than LIMIT 1, although it
shouldn't be.  You'll have to test it.  The advantage to you is that
it's standard SQL.  It will work on any SQL DBMS, and will help you
think about the issue in terms of sets instead of having to rely on
a crutch like LIMIT.  

--jkl


[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?

2015-09-29 Thread Rowan Worth
Hi guys,

Imagine a DB storing a history of currency exchange rate info.

CREATE TABLE Prices (
id INTEGER PRIMARY KEY,
day INTEGER,
currency TEXT,
price FLOAT);

Assume 'day' increases monotonically and there is at most one price
recorded per currency per day - if you want to know the latest prices, is
this legal?

SELECT currency, price FROM Prices GROUP BY currency HAVING time =
MAX(time);

I've seen this approach used and my reading of
https://www.sqlite.org/lang_select.html#resultset hasn't left me confident
that it is correct SQL, but in practice it seems to work fine... eg:

insert into Prices (day, currency, price) values (1, 'EUR', 1.86);
insert into Prices (day, currency, price) values (1, 'USD', 1.20);
insert into Prices (day, currency, price) values (2, 'EUR', 1.92);
insert into Prices (day, currency, price) values (2, 'USD', 1.18);
insert into Prices (day, currency, price) values (3, 'USD', 1.24);
/* price for EUR on day 3 intentionally left blank */

sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day =
MAX(day);
EUR|1.92
USD|1.24

MIN also seems to work:

sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day =
MIN(day);
EUR|1.86
USD|1.2

As an aside, the query definitely doesn't work to get prices for a specific
day:

sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day = 2;
EUR|1.92

I think the following explains the last result:

> If a HAVING clause is a non-aggregate expression, it is evaluated with
> respect to an arbitrarily selected row from the group.

And the arbitrary row for EUR just happened to be day=2 here. But the
documentation doesn't mention non-aggregate queries with a group by
clause...


> If the HAVING clause is an aggregate expression, it is evaluated across
all
> rows in the group.

I'm not sure how to interpret this in the context of the "day = MAX(day)"
expression though - does "evaluated across all rows" mean that rows
excluded by the clause are discarded?

If not then I suspect this query is not legit... but I don't feel I've
understood everything so any clarification is appreciated!

Cheers,
-Rowan


[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?

2015-09-29 Thread Gabor Grothendieck
On Tue, Sep 29, 2015 at 6:16 AM, Simon Slavin  wrote:

> I don't like using sub-SELECT and I would love to see another reader
> rephrase this using 'WITH' or a VIEW.
>

with sub as (select currency, price, max(day) from prices group by currency)
 select currency, price from sub;


[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?

2015-09-29 Thread Simon Slavin

On 29 Sep 2015, at 8:05am, Rowan Worth  wrote:

> CREATE TABLE Prices (
>id INTEGER PRIMARY KEY,
>day INTEGER,
>currency TEXT,
>price FLOAT);
> 
> Assume 'day' increases monotonically and there is at most one price
> recorded per currency per day - if you want to know the latest prices, is
> this legal?
> 
> SELECT currency, price FROM Prices GROUP BY currency HAVING time =
> MAX(time);

I see others have already told you not to use this, but I'd like to work 
towards a better command.  To find the latest price for any one currency do 
this:

SELECT currency,price FROM Prices
WHERE currency='EUR'
ORDER BY day DESC
LIMIT 1;

This finds all the rows for the right currency, then sorts them so that the 
latest is at the top, then returns just the top row.  If SQL has an index like 
the following:

CREATE INDEX Prices_cd ON Prices (currency, day);

then it will use this index to jump straight to the correct row to return, and 
it will return the result very quickly.  Even better, this index will speed up 
the other SELECT involved in this.

So to use the above command with every currency you need another SELECT which 
will pick out each currency you have data for.  You can do that like this:

SELECT DISTINCT currency FROM Prices;

So you should be able to combine the two:

SELECT currency AS thisCCy,price FROM Prices
WHERE price = (SELECT price FROM Prices
WHERE currency = thisCCy
ORDER BY day DESC
LIMIT 1);

or, because we have an index which will make this faster,

SELECT currency AS thisCCy,price FROM Prices
WHERE day = (SELECT day FROM Prices
WHERE currency = thisCCy
ORDER BY day DESC
LIMIT 1);

I don't like using sub-SELECT and I would love to see another reader rephrase 
this using 'WITH' or a VIEW.

As an incidental note, I advise you to store the time-of-day of each price in 
your Prices file even if you store only one price per day.  Any auditor of your 
system will find their job easier if they can easily figure out which price 
(from those throughout the day) you decided to use.  It may not make things 
easier in the day-to-day running of the system but it will make things easier 
if it's ever audited, or if you every have to discuss the data with a customer.


[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?

2015-09-29 Thread Clemens Ladisch
Rowan Worth wrote:
> if you want to know the latest prices, is this legal?
>
> SELECT currency, price FROM Prices GROUP BY currency HAVING time = MAX(time);

It is not legal ANSI SQL, and most other databases will complain.

While the WHERE clause allows to filter out rows from the table, the
HAVING clause is evaluated _after_ the groups have already been created,
and allows to filter out entire groups (i.e., output rows).

In this particular query, "HAVING time = MAX(time)" throws away any
group where the "time" value does not happen to be the maximum value
in the group.  What makes this query actually work is that SQLite has
a (somewhat undocumented) extension where using a single MAX() or MIN()
determines the row in a group where unaggregated expressions come from
(instead of using some random row).  But this is just a side effect of
the expression in the HAVING clause; using HAVING like this is an
extremely misleading way to write this query.  Better write the query
like this:

  SELECT currency, price, MAX(time) FROM Prices GROUP BY currency;

(Same extension, so this is not legal ANSI SQL either.)

Neither your original query nor this query work if there is more than
one MIN()/MAX(), or if some other aggregation function is used.


Regards,
Clemens


[sqlite] Is it legal SQL to select a specific row of a group via a HAVING clause?

2015-09-29 Thread Hick Gunter
The main difference between HAVING and WHERE ist that WHERE operates on the 
input set and HAVING operates on the output set.

If your condition requires computing an aggregate, then HAVING is a viable 
method of avoiding a subquery that needs to repeatedly scan the input table:

SELECT currency,price FROM Prices p WHERE day = (SELECT max(day) from Prices 
where currency=p.currency);

explain query plan SELECT currency,price FROM Prices p WHERE day = (SELECT 
max(day) from Prices where currency=p.currency);
sele  order  from  deta
  -    
0 0  0 SCAN TABLE Prices AS p (~50 rows)
0 0  0 EXECUTE CORRELATED SCALAR SUBQUERY 1
1 0  0 SEARCH TABLE Prices USING AUTOMATIC COVERING INDEX 
(currency=?) (~1 rows)

explain query plan SELECT currency, price FROM Prices GROUP BY currency HAVING 
day = max(day);
sele  order  from  deta
  -    
0 0  0 SCAN TABLE Prices (~100 rows)
0 0  0 USE TEMP B-TREE FOR GROUP BY

If your condition does not require computing an aggregate, just use WHERE.

SELECT currency,price from Prices WHERE day=2;

To find the latest price on or before day 3:

SELECT currency,price from Prices WHERE day <= 3 GROUP by currency HAVING day = 
max(day);

explain query plan SELECT currency,price from Prices WHERE day <= 3 GROUP by 
currency HAVING day = max(day);
sele  order  from  deta
  -    
0 0  0 SCAN TABLE Prices (~33 rows)
0 0  0 USE TEMP B-TREE FOR GROUP BY

Note that this is without any defined indexes. Plans may change depending on 
indexes added, and the cardinality of constrained fields.




-Urspr?ngliche Nachricht-
Von: Rowan Worth [mailto:rowanw at dugeo.com]
Gesendet: Dienstag, 29. September 2015 09:06
An: General Discussion of SQLite Database
Betreff: [sqlite] Is it legal SQL to select a specific row of a group via a 
HAVING clause?

Hi guys,

Imagine a DB storing a history of currency exchange rate info.

CREATE TABLE Prices (
id INTEGER PRIMARY KEY,
day INTEGER,
currency TEXT,
price FLOAT);

Assume 'day' increases monotonically and there is at most one price recorded 
per currency per day - if you want to know the latest prices, is this legal?

SELECT currency, price FROM Prices GROUP BY currency HAVING time = MAX(time);

I've seen this approach used and my reading of 
https://www.sqlite.org/lang_select.html#resultset hasn't left me confident that 
it is correct SQL, but in practice it seems to work fine... eg:

insert into Prices (day, currency, price) values (1, 'EUR', 1.86); insert into 
Prices (day, currency, price) values (1, 'USD', 1.20); insert into Prices (day, 
currency, price) values (2, 'EUR', 1.92); insert into Prices (day, currency, 
price) values (2, 'USD', 1.18); insert into Prices (day, currency, price) 
values (3, 'USD', 1.24);
/* price for EUR on day 3 intentionally left blank */

sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day
sqlite> =
MAX(day);
EUR|1.92
USD|1.24

MIN also seems to work:

sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day
sqlite> =
MIN(day);
EUR|1.86
USD|1.2

As an aside, the query definitely doesn't work to get prices for a specific
day:

sqlite> SELECT currency, price FROM Prices GROUP BY currency HAVING day
sqlite> = 2;
EUR|1.92

I think the following explains the last result:

> If a HAVING clause is a non-aggregate expression, it is evaluated with
> respect to an arbitrarily selected row from the group.

And the arbitrary row for EUR just happened to be day=2 here. But the 
documentation doesn't mention non-aggregate queries with a group by clause...


> If the HAVING clause is an aggregate expression, it is evaluated
> across
all
> rows in the group.

I'm not sure how to interpret this in the context of the "day = MAX(day)"
expression though - does "evaluated across all rows" mean that rows excluded by 
the clause are discarded?

If not then I suspect this query is not legit... but I don't feel I've 
understood everything so any clarification is appreciated!

Cheers,
-Rowan
___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: hick at scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by