Great. Let's start with this:
http://www.amazon.com/Simple-Queue-Service-home-page/b?ie=UTF8&node=13584001

Just the basics. The way SQS works is:

- you define a "queue" that has a name
- you add "tasks" to the queue. These are really just small documents that your
  workers will understand.
- a worker atomically removes an item from the head of the queue. The item is
  not deleted outright, but is instead put in a holding pen for a period of
  time, after which it is returned to the queue.
- if the worker finishes work on the item, it deletes the item from the queue
  or the holding pen, depending on whether the timeout has expired.
- if the worker dies before signaling completion of work on the task, the task
  will eventually be returned to the queue and handed out to another worker.
- the worker is responsible for accessing any specified input resources, saving
  any results, and scheduling any follow-on work.
- there is potential for a race condition when additional work is to be
  scheduled. If the scheduling is done before deleting the item, there is a
  thin possibility that the item could be handed out again. If the scheduling
  is done after the item is deleted, the worker could crash and lose the item.

I think the best way to avoid problems is to have workers check for the
existence of a completion flag before starting work and before saving results.
That makes double processing non-fatal.

On 12/21/07 12:51 PM, "Arun C Murthy" <[EMAIL PROTECTED]> wrote:

> On Fri, Dec 21, 2007 at 12:43:38PM -0800, Ted Dunning wrote:
>>
>> * if you need some kind of work-flow, hadoop won't help (but it won't hurt
>> either)
>>
>
> Let's start a discussion around this, seems to be something lots of folks
> could use...
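For anyone who wants to see the visibility-timeout mechanics concretely, here
is a minimal in-memory sketch of the semantics described above (this is a toy
simulation, not the SQS API; the class and method names are made up, and time
is passed in explicitly so the behavior is easy to trace):

```python
import time

class VisibilityQueue:
    """Sketch of SQS-style semantics: receive() atomically pops the head
    of the queue into a holding pen; unless the item is deleted before
    the visibility timeout expires, it returns to the queue and can be
    handed out to another worker."""

    def __init__(self, visibility_timeout=30.0):
        self.timeout = visibility_timeout
        self.items = {}       # item id -> task document
        self.ready = []       # ids currently available to workers
        self.in_flight = {}   # id -> deadline (the "holding pen")
        self._next_id = 0

    def send(self, task):
        self._next_id += 1
        self.items[self._next_id] = task
        self.ready.append(self._next_id)
        return self._next_id

    def _requeue_expired(self, now):
        # Items whose timeout has lapsed go back to the head of the queue.
        for item_id, deadline in list(self.in_flight.items()):
            if now >= deadline:
                del self.in_flight[item_id]
                self.ready.insert(0, item_id)

    def receive(self, now=None):
        now = time.monotonic() if now is None else now
        self._requeue_expired(now)
        if not self.ready:
            return None
        item_id = self.ready.pop(0)   # atomic removal of the head item
        self.in_flight[item_id] = now + self.timeout
        return item_id, self.items[item_id]

    def delete(self, item_id):
        # Worker signals completion. This works whether the item is still
        # in the holding pen or has already fallen back into the queue,
        # matching the "queue or holding pen" case above.
        self.in_flight.pop(item_id, None)
        if item_id in self.ready:
            self.ready.remove(item_id)
        self.items.pop(item_id, None)

q = VisibilityQueue(visibility_timeout=5.0)
q.send("resize image 1")
first = q.receive(now=0.0)        # worker A takes the item, then "dies"
assert q.receive(now=1.0) is None # item is invisible inside the timeout
second = q.receive(now=6.0)       # timeout lapsed: worker B gets the same item
assert second[0] == first[0]
q.delete(second[0])               # worker B finishes and deletes it
assert q.receive(now=7.0) is None
```

Note that delete() is deliberately tolerant of the item having already
returned to the queue; combined with the completion-flag check suggested
above, that is what makes occasional double processing harmless rather
than fatal.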