On Saturday 30 August 2003 1:08 pm, you wrote: > On Sat, 30 Aug 2003, Richard Jones wrote: > > Hi, > > i have a table of around 3 million rows from which i regularly (twice a > > second at the moment) need to select a random row from > > > > currently i'm doing "order by rand() limit 1" - but i suspect this is > > responsible for the large load on my db server - i guess that PG is doing > > far too much work just to pick one row. > > If you have an int id (aka serial) column then it is simple - just pick a > random number between 1 and currval('id_seq')... > > or offset rand() limit 1 perhaps? > > since you want random ther eis no need to bother with an order and that'll > save a sort.
Yes, the pkey is a SERIAL but the problem is that the sequence is rather sparse for example, it goes something like 1 -> 5000 then 100000->100000 and then 2000000->upwards this is due to chunks being deleted etc.. if i pick a random number for the key it will not be a random enough distribution, because the sequence is sparse.. sometimes it will pick a key that doesnt exist. i'm currently reading all the keys into an array and selecting randoms from there - but this is no good long-term as i need to refresh the array of keys to take into account newly added rows to the table (daily) i was hoping there was some trickery with sequences that would allow me to easily pick a random valid sequence number..? Thanks, Rich. > > -- > Jeff Trout <[EMAIL PROTECTED]> > http://www.jefftrout.com/ > http://www.stuarthamm.net/ > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org