On 2018/06/24 9:30 PM, Gert Van Assche wrote:
whaw... I could never come up with solutions like this. Will this work with
SQLite?
Where can I read more about this?

I've only shown that query since Dan asked about it - it's not an SQLite query.

To answer/comment on your questions:

1. Sure you can come up with it if you have read about it and understand what it is intended for - There is nothing you can't know, only things you don't know yet. :)

2. This will not work in SQLite currently - it's part of a group of functions called "Windowing" functions built into the bigger DB engines which do not care about their "Liteness". "Windowing functions" is basically a set of functions that treats the current ROW and its immediate preceding or acceding rows, or set of grouped rows within a bigger query (grouping by whatever common aspect you choose) as a unit and can return localized aggregates over them. The ranking/row position functions prove especially handy when wanting queries to easily produce row orders, ranks, etc. - but I promise you, they all are very possible in SQLite without Windowing functions, just not quite as obvious (as this thread has shown). More difficult to do are things like Percentile or Running-Total, though those too can be accomplished with a self-join or correlated sub-query in SQLite. The example in the original mail is from MSSQL, but similar queries work in Postgres, Oracle, etc.

3. You can read more by Googling "RDBMS Windowing functions" - or just click here: http://google.com/?q=RDBMS+windowing+functions


Quick intro to Windowing in SQL (Apologies to all, the following is NOT for SQLite [yet]):

Say I have a list of students in three classrooms and their test scores, and I would like to show the listing ranked by their scores, per class.


Sorting makes part of the problem easy:

SELECT class, student, score
  FROM students
 ORDER BY class, student, score DESC
;  -- This works in SQLite

This might produce a list like:

class, student, score
A5, Ann,     94
A5, Able,    88
A5, Adrian,  67
B5, James,   92
B5, Jenny,   88
B5, John,    87
B5, Joan,    74
C5, Sloan,   98
C5, Sean,    79

Now a common thing is to want to put a rank integer number in front of every classmate so it is easy to see position, like this (spaced for legibility only):

rank, class, student, score
1, A5, Ann,     94
2, A5, Able,    88
3, A5, Adrian,  67

1, B5, James,   92
2, B5, Jenny,   88
3, B5, John,    87
4, B5, Joan,    74

1, C5, Sloan,   98
2, C5, Sean,    79

And for that the query will have to produce the whole list (no overall grouping) but then group these students by "class" to figure out the rank within every class "window" of the  larger query.

You could say in English: We need to show the RANK *over* every window (aka "set of records") that is *partitioned by* the "class" field, where the rank within each "class" is determined by the value of the "score" field in a *descending order*.  That's easy enough to understand (I hope!).

All that remains is to translate that to SQL. Looking at the English phrase above and the SQL below should quickly clarify how it all fits together:

SELECT RANK() OVER (PARTITION BY class ORDER BY score DESC) as rank, class, student, score
  FROM students
 ORDER BY class, student, score DESC
;  -- This example is MSSQL format


If you don't care about the rank, and simply wish to number the rows in some order (student name for example), you could opt for something like:

SELECT ROW_NUMBER() OVER (PARTITION BY class ORDER BY student) as student_no, class, student, score
  FROM students
 ORDER BY class, student
;  -- MSSQL again

which will output this:

student_no, class, student, score
1, A5, Able,    88
2, A5, Adrian,  67
3, A5, Ann,     94
1, B5, James,   92
2, B5, Jenny,   88
3, B5, Joan,    74
4, B5, John,    87
1, C5, Sean,    79
2, C5, Sloan,   98

Of course the aggregate doesn't need to be a rank or row number, it can simply be a count() or Avg() or any aggregate function you fancy.

Lastly, if I now want to only see the top 2 ranked students of every class (which is like your original question in this thread), I can simply have the RANK query above as an inner query and in the outer query add "... WHERE rank <= 2; "[1].

rank, class, student, score
1, A5, Ann,     94
2, A5, Able,    88
1, B5, James,   92
2, B5, Jenny,   88
1, C5, Sloan,   98
2, C5, Sean,    79


Now you know. :)
Cheers,
Ryan


[1] - Rank is only useful as a value if the order-by is of a unique field (or combination of fields). Consider if the top 5 students had scores like 95, 88, 88, 88 and 75. The RANK() function over them would produce ranks 1, 2, 2, 2, 5 - because technically the 3 middle scores are all the same and so all ranked equally 2nd, while the 5th score is 3rd highest and will be ranked either 5th or 3rd based on whether you use RANK() [5th] or DENSE_RANK() [3rd] functions.

To change this to a solid numbered list, it is best to use ROW_NUMBER() which forces unique follow-on integers without "holes" in them to once again produce 1,2,3,4 and 5.


_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to