On Fri, Dec 21, 2007 at 01:05:47PM -0800, Ted Dunning wrote:
>
>Great.
>
>Let's start with this:
>
>http://www.amazon.com/Simple-Queue-Service-home-page/b?ie=UTF8&node=13584001
>

Thanks Ted. I've opened http://issues.apache.org/jira/browse/HADOOP-2484 to
capture this thread, do you mind adding a comment there about SQS?

Btw, another important piece I forgot to mention in my previous post:
http://lucene.apache.org/hadoop/docs/r0.15.1/mapred_tutorial.html#JobControl

Arun

>Just the basics.
>
>The way SQS works is:
>
>- you define a "queue" that has a name
>
>- you add "tasks" to the queue. These are really just small documents that
>your workers will understand.
>
>- a worker atomically removes an item from the head of the queue. This item
>will not be completely deleted, but rather will be put in a holding pen for
>a period of time after which it will be returned to the queue.
>
>- if the worker finishes work on the item, it deletes the item from the
>queue or the holding pen depending on whether the timeout expired.
>
>- if the worker dies before signaling completion of work on the task, the
>task will eventually be returned to the queue and handed out to another
>worker.
>
>- the worker is responsible for accessing any specified input resources,
>saving any results and scheduling any follow-on work.
>
>- there is potential for a race condition when additional work is to be
>scheduled. If the scheduling is done before deleting the item, then there
>is a thin possibility that the item would have been passed out again. If
>the scheduling is done after the item is deleted, the worker could crash and
>lose the item. I think the best way to avoid problems is to have workers
>check for the existence of a completion flag before starting work and before
>saving results. This makes double processing non-fatal.
>
>
>
>On 12/21/07 12:51 PM, "Arun C Murthy" <[EMAIL PROTECTED]> wrote:
>
>> On Fri, Dec 21, 2007 at 12:43:38PM -0800, Ted Dunning wrote:
>>>
>>> * if you need some kind of work-flow, hadoop won't help (but it won't hurt
>>> either)
>>>
>>
>> Lets start a discussion around this, seems to be something lots of folks
>> could use...
>
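
For reference, the queue behaviour Ted describes above (holding pen with a
timeout, delete on completion, completion flag to make double processing
harmless) can be sketched roughly as follows. This is only an in-memory
illustration, not the SQS API: the names VisibilityQueue, work, done and
results are all made up for the example, the "work" is a placeholder, and a
real setup would keep the completion flags in shared storage rather than in
a local set.

```python
import time
from collections import deque


class VisibilityQueue:
    """In-memory stand-in for an SQS-style queue (hypothetical, not the SQS API)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout      # seconds an item stays in the holding pen
        self.clock = clock          # injectable so the timeout can be simulated
        self.ready = deque()        # (item_id, body) pairs waiting to be handed out
        self.holding = {}           # item_id -> (deadline, body): the "holding pen"
        self.next_id = 0

    def add(self, body):
        self.ready.append((self.next_id, body))
        self.next_id += 1

    def _requeue_expired(self):
        # Items whose worker never signalled completion go back on the queue.
        now = self.clock()
        expired = [i for i, (deadline, _) in self.holding.items() if deadline <= now]
        for item_id in expired:
            _, body = self.holding.pop(item_id)
            self.ready.append((item_id, body))

    def receive(self):
        # Take the head item and park it in the holding pen instead of deleting it.
        self._requeue_expired()
        if not self.ready:
            return None
        item_id, body = self.ready.popleft()
        self.holding[item_id] = (self.clock() + self.timeout, body)
        return item_id, body

    def delete(self, item_id):
        # Remove the item from the holding pen, or from the queue if it timed out.
        self.holding.pop(item_id, None)
        self.ready = deque((i, b) for i, b in self.ready if i != item_id)


def work(queue, done, results):
    """One worker step; `done` plays the role of the completion flags."""
    item = queue.receive()
    if item is None:
        return False
    item_id, body = item
    if item_id not in done:          # check the flag before starting work
        result = body.upper()        # placeholder for the real task
        if item_id not in done:      # check again before saving results
            results[item_id] = result
            done.add(item_id)
    queue.delete(item_id)
    return True
```

With this shape, an item that gets handed out twice because the timeout
expired is simply skipped by the second worker, which is the "double
processing is non-fatal" property mentioned above. Scheduling follow-on work
is left out of the sketch.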