Michael:
> I have 16-25 simultaneous processes selecting records (to process) from a
> DB that has about 3.3 million rows. Quite standard:
>
>     SELECT a,b WHERE x, y AND z ORDER BY rand() LIMIT 1
>
> It works fine up to about 8-10 processes, and then goes exponentially
> slower until queries take over a minute to answer at around 20 processes.
> Note that not all processes are hitting the DB at the same time, but they
> all tend to come to a halt together by the time I am running 20 of them.
>
> So this is what each process does:
>
> 1. SELECT a,b WHERE x, y, z AND done = 'no'.
> 2. UPDATE a field to apply a 'lock' of sorts on that row, so another
>    process doesn't also take it.
>    Do yada yada with the data (can take a few seconds to a few minutes).
> 3. UPDATE again, this time marking the row 'done'.
>
> Run the loop again.
>
> With every SELECT I am running ORDER BY rand() - this is to greatly
> minimise the chance of another process grabbing the same row in the window
> between steps 1 and 2. However, I also suspect this is the part that is
> most taxing on the DB server.
>
> Is there a better way?
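Spelled out as SQL, the three-step loop above looks roughly like this (table and column names are illustrative - the original post does not give the schema):

```sql
-- Step 1: pick an unclaimed row. The ORDER BY rand() is the expensive
-- part: MySQL must read every matching row and sort the whole result
-- just to return one of them.
SELECT id, a, b FROM jobs
WHERE x AND y AND z AND done = 'no' AND locked = 'no'
ORDER BY rand() LIMIT 1;

-- Step 2: claim the row. Between steps 1 and 2 another process can
-- still SELECT the same row - the randomisation only makes the
-- collision less likely, it does not prevent it.
UPDATE jobs SET locked = 'yes' WHERE id = 42;   -- id from step 1

-- ... do the actual work (seconds to minutes) ...

-- Step 3: mark the row finished.
UPDATE jobs SET done = 'yes' WHERE id = 42;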
If you use InnoDB, then yes, the better way is to use SELECT ... FOR UPDATE:
http://dev.mysql.com/doc/refman/5.1/en/innodb-locking-reads.html

> Can I grab the data and somehow 'lock' that row in one command, so another
> process can't read it (it gets the next row), negating the need for ORDER
> BY rand()? Then at the end of the process, unlock it and update it with
> 'done' all in one go?
>
> I'm not joking when I say that this process is capable of bringing a
> Core 2 Duo with 4 GB of RAM to a crawl... I am running MySQL 5.1.30.
>
> Michael

--~--~---------~--~----~------------~-------~--~----~
NZ PHP Users Group: http://groups.google.com/group/nzphpug
To post, send email to [email protected]
To unsubscribe, send email to [email protected]
-~----------~----~----~----~------~----~------~--~---
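A minimal sketch of the SELECT ... FOR UPDATE approach suggested in the reply, assuming an InnoDB table (the `jobs` table and its columns are illustrative). Because the work itself can take minutes, the row lock is held only for the short claiming transaction, not for the whole job:

```sql
-- Claim a row atomically. FOR UPDATE row-locks the chosen row until
-- COMMIT, so no other worker can claim it in between - the race window
-- between the SELECT and the UPDATE is gone, and ORDER BY rand() is no
-- longer needed.
START TRANSACTION;

SELECT id, a, b FROM jobs
WHERE done = 'no' AND locked = 'no'
LIMIT 1
FOR UPDATE;

UPDATE jobs SET locked = 'yes' WHERE id = 42;   -- id from the SELECT
COMMIT;

-- ... do the actual work, outside any transaction ...

UPDATE jobs SET done = 'yes' WHERE id = 42;
```

A concurrent worker running the same claiming transaction briefly blocks on the locked row, then (reading the newly committed `locked = 'yes'`) moves on to the next candidate. With an index on (done, locked) the SELECT touches only a handful of rows instead of sorting millions, which should address the slowdown past 8-10 processes.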
