1) Create data (now I use a script in python to create 10 millions rows with random data)
test.py: import random import sqlite3 db=sqlite3.connect('players.db') c=db.cursor() c.execute("BEGIN") c.execute("CREATE TABLE player(name PRIMARY KEY, score REAL, rank INTEGER)") for i in range(10000000): if i and not i%100000: print i c.execute("INSERT INTO player VALUES ('Player%d',%f, NULL)"%(i,random.random())) c.commit() $ time python test.py > create.sql real 5m29.157s user 5m22.137s sys 0m5.457s 2) Create index on score as suggested by you $ time echo "CREATE INDEX playerScore ON PLAYER (score);" | sqlite3 players.db real 0m44.660s user 0m39.179s sys 0m3.435s 3) Also create index on rank (for now containing all nulls) $ time echo "CREATE INDEX playerRank ON PLAYER (rank);" | sqlite3 players.db real 0m42.639s user 0m38.537s sys 0m2.787s 4) Calculate rank of players $ time echo "CREATE TABLE playerCalculatedRank AS SELECT(name) FROM PLAYER ORDER BY score;" | sqlite3 players.db real 1m22.784s user 0m44.507s sys 0m37.144s 5) Index name $ time echo "CREATE INDEX playerRankName ON playerCalculatedRank (name);" | sqlite3 players.db real 0m45.452s user 0m39.530s sys 0m4.129s 6) Associate rank back to players $ time echo "UPDATE player SET RANK=(SELECT rowid FROM playerCalculatedRank WHERE playerCalculatedRank.name=player.name);" | sqlite3 players.db real 1m26.919s user 0m51.850s sys 0m9.398s 7) Now you can query efficiently scores SELECT * FROM player ORDER BY SCORE limit 10; Player2257196|0.0|1 Player3581965|0.0|2 Player5899654|0.0|3 Player6830263|0.0|4 Player7539953|0.0|5 Player813208|1.0e-06|6 Player1197941|1.0e-06|7 Player5692742|1.0e-06|8 Player6457603|1.0e-06|9 Player7434030|1.0e-06|10 $ time echo "SELECT * FROM player WHERE rank>1516116 ORDER BY rank LIMIT 10;" | sqlite3 players.db Player607250|0.151696|1516117 Player1727261|0.151696|1516118 Player1951394|0.151696|1516119 Player2423122|0.151696|1516120 Player2601434|0.151696|1516121 Player3854947|0.151696|1516122 Player4468451|0.151696|1516123 Player6382469|0.151696|1516124 Player7191942|0.151696|1516125 Player8168866|0.151696|1516126 real 0m0.004s user 0m0.002s sys 0m0.003s For doing 4+5+6 you could/should use temp table 2014-08-23 17:03 GMT+02:00 Tom <t...@meanfox.com>: > > Hi, > > I have a table of players each with a score. I need to do 2 things: > > 1. select top n players sorted on score - this is easy, just select stuff > order by score desc limit n. For efficiency, use an index on score. The top > guy is rank 1, the next guy is rank 2, etc. I only need to do this once a > day, so it doesn't have to be super fast. > > 2. determine the rank of an arbitrary player p. This is harder. I need to > do > it many times. I presently select count(*) + 1 where score is > score of > player p. > > The problem is that #2 is too slow when I have 1M players in the table - > around 80 ms (with an index on score). I need something much faster - a > few > ms is OK. > > I tried writing rank into the table as part of the leaderboard processing > by > doing a select order by score desc with an update for each row, but this is > too slow (even using begin/commit). Would take around 5 hours for 1M > players, but I need something that would take just minutes. > > Would appreciate any ideas. > > Thanks > > Tom > > > > > -- > View this message in context: > http://sqlite.1065341.n5.nabble.com/How-to-determine-player-s-leaderboard-rank-efficiently-tp77445.html > Sent from the SQLite mailing list archive at Nabble.com. > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users