Re: Cron Job Question

2009-01-29 Thread Malcolm Tredinnick

On Thu, 2009-01-29 at 11:34 -0800, Chris wrote:
> I have written a Python/Django script that will run as a cron job.
> It fetches rows from a given database table, performs a specific task
> on each row, and deletes the row once finished. I would like to have
> multiple instances of this script running, checking the table for new
> rows every 15 minutes or so. I am not sure how to divide the work
> among the running instances without running into problems.
> 
> Currently the script does an .objects.all() call and runs through the
> entire list, but as mentioned I would like to find a way to divide
> this up without one instance trying to process a row that another
> instance is already processing.
> 
> The main question is how I can tell the script not to process a given
> row because another instance is currently processing it or has it
> queued for processing.
> 
> It would also be nice to be able to add or remove instances of the
> script when the table holds more or fewer rows to process.

In other words, you need a persistent queue: put things into the queue,
then pop the leading element when you're ready to process the next one.
There are a lot of options around. Here's a reasonable place to start:

http://code.google.com/p/queues/
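
To make the idea concrete, here is a minimal sketch of a persistent
queue backed by a database table. It uses the standard-library sqlite3
module as a stand-in for the real database, and the table name
job_queue and the helpers open_queue, push and pop are hypothetical
names chosen for illustration. It is not the API of the library linked
above.

```python
import sqlite3


def open_queue(path):
    """Open (and create if needed) a queue stored in an SQLite file."""
    # isolation_level=None puts sqlite3 in autocommit mode, so the
    # BEGIN/COMMIT statements below are issued explicitly by us.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn


def push(conn, payload):
    """Append one item to the end of the queue."""
    conn.execute("INSERT INTO job_queue (payload) VALUES (?)", (payload,))


def pop(conn):
    """Remove and return the oldest payload, or None if the queue is empty."""
    # BEGIN IMMEDIATE takes the write lock up front, so two workers
    # cannot both read the same leading row before one of them deletes it.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, payload FROM job_queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("DELETE FROM job_queue WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row[1] if row is not None else None
```

Each worker would call pop() in a loop and stop (or sleep) when it
returns None. Because an item is deleted as soon as it is popped, a
worker that crashes mid-task loses that item; whether that is
acceptable depends on the job.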

Regards,
Malcolm



--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~--~~~~--~~--~--~---



Re: Cron Job Question

2009-01-29 Thread Jeff Anderson

Chris wrote:
> I have written a Python/Django script that will run as a cron job.
> It fetches rows from a given database table, performs a specific task
> on each row, and deletes the row once finished. I would like to have
> multiple instances of this script running, checking the table for new
> rows every 15 minutes or so. I am not sure how to divide the work
> among the running instances without running into problems.
>
I'm curious: why do you need multiple instances? The only reason I can
think of is that you need multiple machines to get the tasks done in a
timely manner. Is this the case?

The model we use for a similar setup is this: the script checks the
database, processes ALL unprocessed rows (with no wait between jobs),
and then, when it's done, waits a few seconds and checks again. This
works very well in production. Most of the time only one or two jobs
are inserted at a time, but occasionally several hundred or several
thousand are inserted at once. One instance of the worker script
handles this very well. We moved away from the cron job and
implemented a simple daemon process.
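
The polling loop described above might look roughly like the sketch
below. The names run_worker, fetch_unprocessed and process are
hypothetical; in a Django project fetch_unprocessed would presumably
wrap a queryset and process would do the per-row task and delete the
row. The max_polls and sleep parameters exist only so the loop can be
stopped and tested; a real daemon would leave them at their defaults.

```python
import time


def run_worker(fetch_unprocessed, process, poll_seconds=5.0,
               max_polls=None, sleep=time.sleep):
    """Poll for jobs until max_polls is reached (forever if None).

    Returns the number of jobs handled.
    """
    handled = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        # Drain everything that is waiting, with no pause between jobs.
        for job in fetch_unprocessed():
            process(job)
            handled += 1
        # Nothing left for now: wait briefly before checking again.
        sleep(poll_seconds)
    return handled
```

With a single instance of this loop there is no contention to manage,
which is the main attraction of the design.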

If you really, really, really need multiple instances, you could have
a worker script that launches the jobs one at a time and spawns only
as many as you need at once. I'd also suggest looking into database
locking or transactions if you want to have multiple scripts with
their fingers in the same table.
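
One way to keep several scripts from grabbing the same row is to have
each one claim a row with a conditional UPDATE and check whether the
update actually took effect. The sketch below shows this with the
standard-library sqlite3 module; the jobs table, its claimed_by column
and the claim_next helper are assumptions made for illustration, not
part of the original setup.

```python
import sqlite3


def claim_next(conn, worker_id):
    """Claim one unclaimed row for worker_id; return its id, or None."""
    while True:
        row = conn.execute(
            "SELECT id FROM jobs WHERE claimed_by IS NULL "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing left to claim
        # The WHERE clause re-checks claimed_by, so if another worker
        # claimed this row after our SELECT, no row is updated.
        cur = conn.execute(
            "UPDATE jobs SET claimed_by = ? "
            "WHERE id = ? AND claimed_by IS NULL",
            (worker_id, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]  # the claim succeeded
        # Lost the race for this row: try the next unclaimed one.
```

A worker processes the row it claimed and deletes it afterwards. In
Django the same conditional update can be written as a filtered
queryset followed by .update(), which returns the number of rows it
matched. Adding or removing workers needs no coordination beyond
giving each one a distinct worker_id.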

Cheers!

Jeff Anderson





Cron Job Question

2009-01-29 Thread Chris

I have written a Python/Django script that will run as a cron job.
It fetches rows from a given database table, performs a specific task
on each row, and deletes the row once finished. I would like to have
multiple instances of this script running, checking the table for new
rows every 15 minutes or so. I am not sure how to divide the work
among the running instances without running into problems.

Currently the script does an .objects.all() call and runs through the
entire list, but as mentioned I would like to find a way to divide
this up without one instance trying to process a row that another
instance is already processing.

The main question is how I can tell the script not to process a given
row because another instance is currently processing it or has it
queued for processing.

It would also be nice to be able to add or remove instances of the
script when the table holds more or fewer rows to process.

Thanks in advance.