Michael wrote: > I have 16 - 25 simultaneous processes running selecting records (to process) > from a DB that has about 3.3 million lines, quite standard- > SELECT a,b WHERE x, y and z ORDER BY rand() LIMIT 1 > > Works fine up to about 8-10 processes, and then goes exponentially slower > until queries take over a minute to answer at around 20 processes. Note that > not all processes are hitting the DB at the same time, but they all tend to > come to a halt together by the time I am running 20 of them. > > So this is what each process does- > > 1. SELECT a,b where x,y,z and DONE=no. > > 2. UPDATE a field to apply a 'lock' of sorts on that row, so another process > doesn't also do it. > > do yada yada with the data (Can take a few seconds to a few minutes) > > 3. UPDATE again, this time with a 'done' statement > > Run loop again. > > Now with every command I am running order by rand() - this is to greatly > minimise the chance of another process grabbing the same row in the period > between 1 and 2.
Order by RAND() is terrible performance-wise. Each select statement has to generate 3.3 million random numbers, find the lowest, and then throw them away! It's probably much more efficient to generate Sounds like you want to use a transaction here, instead of relying on hacks like this to avoid other processes grabbing your row. -- Tim Oliver Software Engineer 160 Cashel Street (Level 2) PO Box 22 128 Christchurch New Zealand P +64 3 377 0007 F +64 3 377 6582 E [email protected] www.e2media.co.nz --~--~---------~--~----~------------~-------~--~----~ NZ PHP Users Group: http://groups.google.com/group/nzphpug To post, send email to [email protected] To unsubscribe, send email to [email protected] -~----------~----~----~----~------~----~------~--~---
