Re: [Launchpad-dev] Design pattern for interruptible processing of continuous data

Clint Byrum Tue, 04 Jan 2011 10:25:05 -0800

On Tue, 2011-01-04 at 15:29 +0000, Julian Edwards wrote:
> Dear all
> 
> I've seen this problem pop up in similar ways a few times now, where we're 
> processing a bunch of data in a cron job (whether externally on the API, or 
> internally) and it needs to do a batch of work, remember where it left off 
> (whether reaching a batch limit or the live data is paused), and continue 
> later.
> 
> Typically to solve this, the client processing the data stores some piece of 
> context about the data it's processing, and uses that data to re-start from 
> the right place next time.
> 
> I think it would be a good idea to formalise a design around this in such a 
> way that will also be beneficial to us when we eventually start using a 
> message queuing application.
> 
> In a previous life, the context data that I've used for this is a timestamp, 
> and it worked very well in pretty much all cases I came across.  The client 
> application simply provides the same timestamp to a query/api call from the 
> last item it processed, and the data continues to flow from where it left 
> off.  
> This ticked all the boxes for data integrity and polling or streaming usage.
>
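For reference, the timestamp approach quoted above might look roughly
like the sketch below. Everything in it is an assumption made for
illustration: RECORDS, fetch_since and run_batch are hypothetical names,
and an in-memory list stands in for the real query or API call.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory data source: (timestamp, payload) pairs, oldest first.
BASE = datetime(2011, 1, 4, 15, 0, 0)
RECORDS = [(BASE + timedelta(minutes=i), "item-%d" % i) for i in range(5)]


def fetch_since(cursor, limit):
    """Return up to `limit` records strictly newer than `cursor`."""
    newer = [r for r in RECORDS if cursor is None or r[0] > cursor]
    return newer[:limit]


def run_batch(cursor, limit, handle):
    """Process one batch and return the timestamp to resume from next time."""
    for stamp, payload in fetch_since(cursor, limit):
        handle(payload)
        cursor = stamp  # advance only after the item was handled
    return cursor


seen = []
cursor = run_batch(None, 2, seen.append)    # first cron run hits the batch limit
cursor = run_batch(cursor, 2, seen.append)  # later run resumes where it left off
```

The strict "newer than" comparison would skip items that share a
timestamp with the last one processed, so a real implementation would
probably need a tie-breaker such as a row id.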



I'm curious why we can't just start using message queues now, for the
batch job alone.

Rather than a cron job that does all the work itself, the batch job
could simply push all the work into a queue. Once the message queue is
ready for frontend consumption, the batch jobs go away and the frontend
starts feeding the backend directly.
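
A rough sketch of what I mean, with Python's in-process queue.Queue
standing in for whatever message broker we end up with. The names
batch_job and backend_worker and the item strings are made up for the
example.

```python
import queue

# Stand-in for a real message broker; an in-process queue is an assumption.
work_queue = queue.Queue()


def batch_job(pending_items):
    """Cron-style producer: enqueue each item instead of processing it."""
    for item in pending_items:
        work_queue.put(item)


def backend_worker(handle):
    """Consumer: drain whatever is queued and report how many items ran."""
    handled = 0
    while True:
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            return handled
        handle(item)
        work_queue.task_done()  # mark the item as finished
        handled += 1


results = []
batch_job(["build-1", "build-2", "build-3"])
count = backend_worker(results.append)
```

With this split, the worker never needs to remember where it left off:
anything not yet handled is still sitting in the queue. Later the
frontend can call put() directly and batch_job can be retired.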

Trying to emulate the queue's robustness seems a noble but possibly
unnecessary effort if queues are coming any time soon.


_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp
