On Mon, Sep 15, 2008 at 10:57:37PM -0700, Tomas Lee scratched on the wall: > I want to delete the 1000 members with the lowest scores. Assume that > it is extremely unlikely for two members to have identical scores. > Also, the 1000 is arbitrary -- it could be more or less on different > days. But each time I do this deletion, it will be a fixed number. > > What's the fastest way to do this?
Your best bet is to just try things out and see. > Method A: > * find the 1000th lowest score: > SELECT score FROM members ORDER BY score LIMIT 1 OFFSET 999; > * delete the records equal to or lower than that score > DELETE FROM members WHERE score <= $thousandth_lowest_score; Do it as one command using a sub-select: DELETE FROM members WHERE score <= ( SELECT score FROM members ORDER BY score LIMIT 1 OFFSET 999); Personally I don't like this, since it has the potential to delete too many records. Having repeating scores may be rare, but that's not never. On the other hand, getting rid of "ties" at the cut-off point may be desirable. > Method B: > * find the uids for the 1000 lowest scores: > SELECT uid FROM members ORDER BY score LIMIT 1000; > * delete those records > DELETE FROM members WHERE uid IN ([join $uids ,]); Again, do it as one command: DELETE FROM members WHERE uid IN ( SELECT uid FROM members ORDER BY score LIMIT 1000); I'm not sure about speed, but I like this best from from a readability standpoint. It is clean and straight forward, and I wouldn't be surprised to find out it is the fastest. The only odd thing about this is if several records share the cut-off score, there isn't any good way of knowing which will be deleted and which will be left behind. That may or may not matter. > Method C: > * delete the records as you find them: > > sqlite3_prepare_v2(db, "DELETE FROM members WHERE uid = ?", -1, &stmt_d, > NULL); > sqlite3_prepare_v2(db, "SELECT uid FROM members ORDER BY score LIMIT > 1000", -1, &stmt_q, NULL); > while (sqlite3_step(stmt_q) == SQLITE_ROW) { > int uid = sqlite3_column_int(stmt_q, 0); > sqlite3_bind_int(stmt_d, 0, uid); > sqlite3_step(stmt_d); > sqlite3_reset(stmt_d); > } > sqlite3_finalize(stmt_d); > sqlite3_finalize(stmt_q); Essentially a manual version of Method B. Which is fastest may depend on the contents of the table and if the index will actually be used or not. There might also be variations if the size (the 1000) changes significantly. Again, just try it and see. Using the sub-selects you can do this from the command line on a test database to give you a rough idea. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Our opponent is an alien starship packed with atomic bombs. We have a protractor." "I'll go home and see if I can scrounge up a ruler and a piece of string." --from Anathem by Neal Stephenson _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users