1) Create data (now I use a script in python to create 10 millions rows
with random data)

test.py:

import random
import sqlite3
db=sqlite3.connect('players.db')
c=db.cursor()
c.execute("BEGIN")
c.execute("CREATE TABLE player(name PRIMARY KEY, score REAL, rank INTEGER)")
for i in range(10000000):
  if i and not i%100000:
    print i
  c.execute("INSERT INTO player VALUES ('Player%d',%f,
NULL)"%(i,random.random()))
c.commit()

$ time python test.py > create.sql
real    5m29.157s
user    5m22.137s
sys    0m5.457s

2) Create index on score as suggested by you
$ time echo "CREATE INDEX playerScore ON PLAYER (score);" | sqlite3
players.db

real    0m44.660s
user    0m39.179s
sys    0m3.435s

3) Also create index on rank (for now containing all nulls)
$ time echo "CREATE INDEX playerRank ON PLAYER (rank);" | sqlite3
players.db

real    0m42.639s
user    0m38.537s
sys    0m2.787s

4) Calculate rank of players
$ time echo "CREATE TABLE playerCalculatedRank AS SELECT(name) FROM PLAYER
ORDER BY score;" | sqlite3 players.db

real    1m22.784s
user    0m44.507s
sys    0m37.144s

5) Index name
$ time echo "CREATE INDEX playerRankName ON playerCalculatedRank (name);" |
sqlite3 players.db

real    0m45.452s
user    0m39.530s
sys    0m4.129s

6) Associate rank back to players
$ time echo "UPDATE player SET RANK=(SELECT rowid FROM playerCalculatedRank
WHERE playerCalculatedRank.name=player.name);" | sqlite3 players.db

real    1m26.919s
user    0m51.850s
sys    0m9.398s

7) Now you can query efficiently scores

SELECT * FROM player ORDER BY SCORE limit 10;
Player2257196|0.0|1
Player3581965|0.0|2
Player5899654|0.0|3
Player6830263|0.0|4
Player7539953|0.0|5
Player813208|1.0e-06|6
Player1197941|1.0e-06|7
Player5692742|1.0e-06|8
Player6457603|1.0e-06|9
Player7434030|1.0e-06|10

$ time echo "SELECT * FROM player WHERE rank>1516116 ORDER BY rank LIMIT
10;" | sqlite3 players.db
Player607250|0.151696|1516117
Player1727261|0.151696|1516118
Player1951394|0.151696|1516119
Player2423122|0.151696|1516120
Player2601434|0.151696|1516121
Player3854947|0.151696|1516122
Player4468451|0.151696|1516123
Player6382469|0.151696|1516124
Player7191942|0.151696|1516125
Player8168866|0.151696|1516126

real    0m0.004s
user    0m0.002s
sys    0m0.003s

For doing 4+5+6 you could/should use temp table


2014-08-23 17:03 GMT+02:00 Tom <t...@meanfox.com>:

>
> Hi,
>
> I have a table of players each with a score. I need to do 2 things:
>
> 1. select top n players sorted on score - this is easy, just select stuff
> order by score desc limit n. For efficiency, use an index on score. The top
> guy is rank 1, the next guy is rank 2, etc. I only need to do this once a
> day, so it doesn't have to be super fast.
>
> 2. determine the rank of an arbitrary player p. This is harder. I need to
> do
> it many times. I presently select count(*) + 1 where score is > score of
> player p.
>
> The problem is that #2 is too slow when I have 1M players in the table -
> around 80 ms (with an index on score). I need something much faster -  a
> few
> ms is OK.
>
> I tried writing rank into the table as part of the leaderboard processing
> by
> doing a select order by score desc with an update for each row, but this is
> too slow (even using begin/commit). Would take around 5 hours for 1M
> players, but I need something that would take just minutes.
>
> Would appreciate any ideas.
>
> Thanks
>
> Tom
>
>
>
>
> --
> View this message in context:
> http://sqlite.1065341.n5.nabble.com/How-to-determine-player-s-leaderboard-rank-efficiently-tp77445.html
> Sent from the SQLite mailing list archive at Nabble.com.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to