On Sun, Feb 14, 2010 at 8:49 PM, Anthony Papillion <papill...@gmail.com> wrote:
>
> Hello Everyone,
>
> I'm designing a system that will work on a schedule. Users will submit data 
> for processing into the database and then, every minute, a PHP script will 
> pass through the db looking for unprocessed rows (marked pending) and process 
> them.
>
> The problem is, I may eventually have a few million records to process at a 
> time. Each record could take anywhere from a few seconds to a few minutes to 
> perform the required operations on. My concern is making sure that the 
> script, on the next scheduled pass, doesn't grab the records currently being 
> processed and start processing them again.
>
> Right now, I'm thinking of accomplishing this by updating a 'status' field in 
> the database. So unprocessed records would have a status of 'pending', 
> records being processed would have a status of 'processing' and completly 
> processed record will have a status of 'complete'.
>
> For some reason, I see this as ugly but that's the only way I can think of 
> making sure that records aren't duplicatly processed. So when I select 
> records to process, I'm ONLY selecting one's with the status of 'pending' 
> which means they are new, unprocessed.
>
> Is there a better, more eleqent way of doing this or is this pretty much it?
>
> Thanks!
> Anthony Papillion

I would generate a unique value for each pass (whether that's an auto
increment field in a table, a UUID, or something else is up to you). I
would add a field to your table(s), just like you were going to do
with status, to store this value.

When processing, do something along the lines of UPDATE table SET
new_field = unique_value WHERE new_field IS NULL (make sure new_field
is indexed). Then you process any records with a matching value in the
new field. This also has the benefit that, should a pass fail, you can
identify which records were part of that pass.
_______________________________________________
New York PHP Users Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk

http://www.nyphp.org/Show-Participation

Reply via email to