On Sun, Feb 14, 2010 at 8:49 PM, Anthony Papillion <papill...@gmail.com> wrote: > > Hello Everyone, > > I'm designing a system that will work on a schedule. Users will submit data > for processing into the database and then, every minute, a PHP script will > pass through the db looking for unprocessed rows (marked pending) and process > them. > > The problem is, I may eventually have a few million records to process at a > time. Each record could take anywhere from a few seconds to a few minutes to > perform the required operations on. My concern is making sure that the > script, on the next scheduled pass, doesn't grab the records currently being > processed and start processing them again. > > Right now, I'm thinking of accomplishing this by updating a 'status' field in > the database. So unprocessed records would have a status of 'pending', > records being processed would have a status of 'processing' and completly > processed record will have a status of 'complete'. > > For some reason, I see this as ugly but that's the only way I can think of > making sure that records aren't duplicatly processed. So when I select > records to process, I'm ONLY selecting one's with the status of 'pending' > which means they are new, unprocessed. > > Is there a better, more eleqent way of doing this or is this pretty much it? > > Thanks! > Anthony Papillion
I would generate a unique value for each pass (whether that's an auto increment field in a table, a UUID, or something else is up to you). I would add a field to your table(s), just like you were going to do with status, to store this value. When processing, do something along the lines of UPDATE table SET new_field = unique_value WHERE new_field IS NULL (make sure new_field is indexed). Then you process any records with a matching value in the new field. This also has the benefit that, should a pass fail, you can identify which records were part of that pass. _______________________________________________ New York PHP Users Group Community Talk Mailing List http://lists.nyphp.org/mailman/listinfo/talk http://www.nyphp.org/Show-Participation