Andy,
Thanks for the input. I'm going to look into that too.
Sounds reasonable and takes care of a few problems that popped up in my
head.
Thanks,
Anthony
----- Original Message -----
From: "Andy Dirnberger" <d...@dirnonline.com>
To: "NYPHP Talk" <talk@lists.nyphp.org>
Sent: Sunday, February 14, 2010 7:56 PM
Subject: Re: [nyphp-talk] Best way to accomplish this task
On Sun, Feb 14, 2010 at 8:49 PM, Anthony Papillion <papill...@gmail.com>
wrote:
Hello Everyone,
I'm designing a system that will work on a schedule. Users will submit
data for processing into the database and then, every minute, a PHP
script will pass through the db looking for unprocessed rows (marked
pending) and process them.
The problem is, I may eventually have a few million records to process at
a time. Each record could take anywhere from a few seconds to a few
minutes to perform the required operations on. My concern is making sure
that the script, on the next scheduled pass, doesn't grab the records
currently being processed and start processing them again.
Right now, I'm thinking of accomplishing this by updating a 'status'
field in the database. So unprocessed records would have a status of
'pending', records being processed would have a status of 'processing'
and completly processed record will have a status of 'complete'.
For some reason, I see this as ugly but that's the only way I can think
of making sure that records aren't duplicatly processed. So when I select
records to process, I'm ONLY selecting one's with the status of 'pending'
which means they are new, unprocessed.
Is there a better, more eleqent way of doing this or is this pretty much
it?
Thanks!
Anthony Papillion
I would generate a unique value for each pass (whether that's an auto
increment field in a table, a UUID, or something else is up to you). I
would add a field to your table(s), just like you were going to do
with status, to store this value.
When processing, do something along the lines of UPDATE table SET
new_field = unique_value WHERE new_field IS NULL (make sure new_field
is indexed). Then you process any records with a matching value in the
new field. This also has the benefit that, should a pass fail, you can
identify which records were part of that pass.
_______________________________________________
New York PHP Users Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
http://www.nyphp.org/Show-Participation
_______________________________________________
New York PHP Users Group Community Talk Mailing List
http://lists.nyphp.org/mailman/listinfo/talk
http://www.nyphp.org/Show-Participation