Re: [sqlite] Efficient ways to maintaining order rank / row_number() in a rather large set ?

Eric Grange Tue, 09 Jan 2018 03:36:07 -0800

> Do you actually have a need to find the 4512nd rank ?

Yes, this is is used to display and check on "neighbors", ie. keys with
similar rank.
The other very common query is to show the rank for a given key.

In both cases, since things are constantly in flux, the absolute rank and
neighbor do not really matter
(outside the top ranks), but changes in rank are being looked at.
i.e. having rank 155k or 154k is not really meaningful in itself, but on
the other hand
gaining 1000 "ranks" by going from 155k to 154k is what users are after.

> 3) In your programming language, number the rows by doing
>
>        SELECT KEY FROM RANKED ORDER BY VALUE
>
>and for each row returned doing an UPDATE for the RANK value.
>It’ll be slower, but it won’t take up the huge chunk of memory needed to
keep that index in memory.

Yes, that's what I tried, but it is indeed much slower than deleting the
whole table an inserting everything all at once,
because as soon as one key moves from top tier to bottom (or vice versa),
you have to touch basically all records,
and doing so with many updates wrecks the performance completely.

> But then, if your range queries are based on a rank derived from value,
why
> not index value directly? You'd still get fast range queries based on
values, no?

You get fast value range queries, but rank range queries become slower and
slower the farther away you get from the top rank.

Also while most end-user queries are for the top 100 / top 1000, those
results are cached and only infrequently
hit the db (so even if finding out the top 1000 was slow and inefficient,
it would not really matter).
In practice, the queries that really hit on the db are for "random" keys
far from the top 1000.

Eric

On Tue, Jan 9, 2018 at 11:44 AM, Dominique Devienne <[email protected]>
wrote:

> On Tue, Jan 9, 2018 at 11:26 AM, Eric Grange <[email protected]> wrote:
>
> > So the order by is used to control the insertion order, so that the RANK
> > autoinc primary key ends up with natural rank order
>
>
> But then, if your range queries are based on a rank derived from value, why
> not index value directly?
> You'd still get fast range queries based on values, no? That probably
> forces you app to maintain a
> mapping from "rank" to values, but you don't need to know all ranks, just
> the few necessary to know
> where to "page" from, no? (assuming I'm guess what you use rank for
> correctly).
>
> SQLite then maintain that index automatically, no? I'm probably missing
> something though. Sounds too simple... --DD
> _______________________________________________
> sqlite-users mailing list
> [email protected]
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Efficient ways to maintaining order rank / row_number() in a rather large set ?

Reply via email to