Michael:
> I have 16-25 simultaneous processes selecting records (to process) from a 
> DB table that has about 3.3 million rows. The query is quite standard:
> SELECT a, b FROM t WHERE x AND y AND z ORDER BY RAND() LIMIT 1
> 
> This works fine up to about 8-10 processes, but then slows down sharply; by 
> around 20 processes, queries take over a minute to answer. Note that not 
> all processes are hitting the DB at the same time, but they all tend to 
> grind to a halt together by the time I am running 20 of them.
> 
> So this is what each process does-
> 
> 1. SELECT a, b FROM t WHERE x AND y AND z AND done = 'no'
> 
> 2. UPDATE a field to apply a 'lock' of sorts on that row, so another 
> process doesn't also pick it up.
> 
> Do yada yada with the data (can take a few seconds to a few minutes).
> 
> 3. UPDATE again, this time marking the row as 'done'.
> 
> Run loop again.
> 
> Now, with every query I am running ORDER BY RAND() - this is to greatly 
> minimise the chance of another process grabbing the same row in the window 
> between steps 1 and 2.
> 
> However, I also suspect this is the part that is most taxing on the DB 
> server.
> 
> Is there a better way?

If you use InnoDB, then yes, the better way is to use
SELECT ... FOR UPDATE inside a transaction.

http://dev.mysql.com/doc/refman/5.1/en/innodb-locking-reads.html
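
Roughly like this - a minimal sketch, assuming an InnoDB table; the table
name (jobs) and the columns (id, a, b, status) are placeholders standing
in for your schema:

  -- Claim one unprocessed row. FOR UPDATE row-locks whatever the
  -- SELECT touches until the transaction ends, so a second worker
  -- blocks for a moment instead of grabbing the same row.
  START TRANSACTION;

  SELECT id, a, b FROM jobs
   WHERE status = 'new'
   LIMIT 1
   FOR UPDATE;

  -- Mark it in progress and commit straight away, so the row lock is
  -- held only for an instant, not for the whole job.
  UPDATE jobs SET status = 'working' WHERE id = <id from the SELECT>;
  COMMIT;

  -- ... do the slow work outside the transaction ...

  UPDATE jobs SET status = 'done' WHERE id = <id from the SELECT>;

That gets rid of ORDER BY RAND() entirely, which is almost certainly your
bottleneck: RAND() forces MySQL to read every row matching x, y and z and
filesort the whole lot just to hand back one of them. Also make sure the
status column is indexed, or the SELECT will still scan a large chunk of
those 3.3 million rows.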


> 
> Can I grab the data and somehow 'lock' that row in one command, so another 
> process can't read it (it gets the next row instead), negating the need 
> for ORDER BY RAND()? Then at the end of the process unlock it and update 
> it with 'done' all in one go?
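
Pretty much, yes - a single UPDATE statement is atomic, so you can claim
a row in one command and then read back whatever you claimed. A sketch of
that pattern (same placeholder jobs table as above, plus an assumed
worker column for tagging who claimed the row); unlike FOR UPDATE this
doesn't need InnoDB, so it works on MyISAM too:

  -- Claim one free row by tagging it with this connection's id. Two
  -- workers can never tag the same row, because each UPDATE statement
  -- executes atomically.
  UPDATE jobs SET status = 'working', worker = CONNECTION_ID()
   WHERE status = 'new'
   LIMIT 1;

  -- Read back the row this connection just claimed (zero rows means
  -- another worker got in first - just loop and try again).
  SELECT id, a, b FROM jobs
   WHERE status = 'working' AND worker = CONNECTION_ID();

  -- ... process, then mark it finished ...

  UPDATE jobs SET status = 'done' WHERE id = <claimed id>;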
> 
> I'm not joking when I say that this process is capable of bringing a Core 2 
> Duo with 4GB of RAM to a crawl... I am running MySQL 5.1.30.
> 
> Michael
> 


