On 2018/06/24 9:30 PM, Gert Van Assche wrote:
whaw... I could never come up with solutions like this. Will this work with
SQLite?
Where can I read more about this?
I've only shown that query since Dan asked about it - it's not an SQLite
query.
To answer/comment on your questions:
1. Sure you can come up with it if you have read about it and understand
what it is intended for - There is nothing you can't know, only things
you don't know yet. :)
2. This will not work in SQLite currently - it's part of a group of
functions called "Windowing" functions built into the bigger DB engines
which do not care about their "Liteness".
"Windowing functions" is basically a set of functions that treats the
current ROW and its immediate preceding or acceding rows, or set of
grouped rows within a bigger query (grouping by whatever common aspect
you choose) as a unit and can return localized aggregates over them.
The ranking/row position functions prove especially handy when wanting
queries to easily produce row orders, ranks, etc. - but I promise you,
they all are very possible in SQLite without Windowing functions, just
not quite as obvious (as this thread has shown). More difficult to do
are things like Percentile or Running-Total, though those too can be
accomplished with a self-join or correlated sub-query in SQLite.
The example in the original mail is from MSSQL, but similar queries work
in Postgres, Oracle, etc.
3. You can read more by Googling "RDBMS Windowing functions" - or just
click here: http://google.com/?q=RDBMS+windowing+functions
Quick intro to Windowing in SQL (Apologies to all, the following is NOT
for SQLite [yet]):
Say I have a list of students in three classrooms and their test scores,
and I would like to show the listing ranked by their scores, per class.
Sorting makes part of the problem easy:
SELECT class, student, score
FROM students
ORDER BY class, student, score DESC
; -- This works in SQLite
This might produce a list like:
class, student, score
A5, Ann, 94
A5, Able, 88
A5, Adrian, 67
B5, James, 92
B5, Jenny, 88
B5, John, 87
B5, Joan, 74
C5, Sloan, 98
C5, Sean, 79
Now a common thing is to want to put a rank integer number in front of
every classmate so it is easy to see position, like this (spaced for
legibility only):
rank, class, student, score
1, A5, Ann, 94
2, A5, Able, 88
3, A5, Adrian, 67
1, B5, James, 92
2, B5, Jenny, 88
3, B5, John, 87
4, B5, Joan, 74
1, C5, Sloan, 98
2, C5, Sean, 79
And for that the query will have to produce the whole list (no overall
grouping) but then group these students by "class" to figure out the
rank within every class "window" of the larger query.
You could say in English: We need to show the RANK *over* every window
(aka "set of records") that is *partitioned by* the "class" field, where
the rank within each "class" is determined by the value of the "score"
field in a *descending order*. That's easy enough to understand (I hope!).
All that remains is to translate that to SQL. Looking at the English
phrase above and the SQL below should quickly clarify how it all fits
together:
SELECT RANK() OVER (PARTITION BY class ORDER BY score DESC) as rank,
class, student, score
FROM students
ORDER BY class, student, score DESC
; -- This example is MSSQL format
If you don't care about the rank, and simply wish to number the rows in
some order (student name for example), you could opt for something like:
SELECT ROW_NUMBER() OVER (PARTITION BY class ORDER BY student) as
student_no, class, student, score
FROM students
ORDER BY class, student
; -- MSSQL again
which will output this:
student_no, class, student, score
1, A5, Able, 88
2, A5, Adrian, 67
3, A5, Ann, 94
1, B5, James, 92
2, B5, Jenny, 88
3, B5, Joan, 74
4, B5, John, 87
1, C5, Sean, 79
2, C5, Sloan, 98
Of course the aggregate doesn't need to be a rank or row number, it can
simply be a count() or Avg() or any aggregate function you fancy.
Lastly, if I now want to only see the top 2 ranked students of every
class (which is like your original question in this thread), I can
simply have the RANK query above as an inner query and in the outer
query add "... WHERE rank <= 2; "[1].
rank, class, student, score
1, A5, Ann, 94
2, A5, Able, 88
1, B5, James, 92
2, B5, Jenny, 88
1, C5, Sloan, 98
2, C5, Sean, 79
Now you know. :)
Cheers,
Ryan
[1] - Rank is only useful as a value if the order-by is of a unique
field (or combination of fields). Consider if the top 5 students had
scores like 95, 88, 88, 88 and 75. The RANK() function over them would
produce ranks 1, 2, 2, 2, 5 - because technically the 3 middle scores
are all the same and so all ranked equally 2nd, while the 5th score is
3rd highest and will be ranked either 5th or 3rd based on whether you
use RANK() [5th] or DENSE_RANK() [3rd] functions.
To change this to a solid numbered list, it