[jira] [Commented] (MAPREDUCE-205) Add ability to send "signals" to jobs and tasks

dixitsv (JIRA) Sat, 22 Oct 2016 15:21:07 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15598591#comment-15598591
 ]


dixitsv commented on MAPREDUCE-205:
-----------------------------------

I could use this feature; I have a use case similar to [~ab]'s. Without it, I 
am thinking of using an external KV store that tasks can poll for new 
instructions. Polling DFS at a higher frequency may not scale in a very large 
cluster with thousands of tasks. <sidenote> I haven't tried a DFS-based 
solution, but I have run into namenode issues when thousands of mappers try to 
access the same sufficiently replicated Avro schema file. </sidenote>
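
A minimal sketch of the polling workaround, assuming a task calls poll() from 
its map loop. Everything here is hypothetical: the KvStore interface stands in 
for whatever external store is used, and the key name and interval are 
placeholders. It only shows how to rate-limit reads and report an instruction 
once per change, so that thousands of tasks do not hit the store on every 
record.

```java
import java.util.Optional;

/** Rate-limited instruction poller for a task. All names are hypothetical. */
class InstructionPoller {

    /** Assumed minimal read-only view of an external KV store; not a real client API. */
    interface KvStore {
        Optional<String> get(String key);
    }

    private final KvStore store;
    private final String key;
    private final long minIntervalMillis;
    private boolean polledOnce = false;
    private long lastPollMillis;
    private String lastSeen;

    InstructionPoller(KvStore store, String key, long minIntervalMillis) {
        this.store = store;
        this.key = key;
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Returns a new instruction if one appeared since the last successful read.
     * Returns empty when called too soon, when the key is absent, or when the
     * value is unchanged.
     */
    Optional<String> poll(long nowMillis) {
        if (polledOnce && nowMillis - lastPollMillis < minIntervalMillis) {
            return Optional.empty(); // too soon: skip the remote read entirely
        }
        polledOnce = true;
        lastPollMillis = nowMillis;
        Optional<String> current = store.get(key);
        if (!current.isPresent() || current.get().equals(lastSeen)) {
            return Optional.empty(); // nothing new to act on
        }
        lastSeen = current.get();
        return current;
    }
}
```

A mapper would call poll(System.currentTimeMillis()) every so often and act 
on any value returned, for example by throttling or by skipping the rest of 
its input.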





> Add ability to send "signals" to jobs and tasks
> -----------------------------------------------
>
>                 Key: MAPREDUCE-205
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-205
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Andrzej Bialecki 
>
> In some cases it would be useful to be able to "signal" a job and its tasks 
> about some external condition, or to broadcast a specific message to all 
> tasks in a job. Currently we can only send a single pseudo-signal, that is 
> to kill a job.
> Example 1: some jobs may be gracefully terminated even if they didn't 
> complete all their work, e.g. Fetcher in Nutch may be running for a very long 
> time if it blocks on relatively few sites left over from the fetchlist. In 
> such case it would be very useful to send it a message requesting that it 
> discards the rest of its input and gracefully completes its map tasks.
> Example 2: available bandwidth for fetching may be different at different 
> times of day, e.g. daytime vs. nighttime, or total external link usage by 
> other applications. Fetcher jobs often run for several hours. It would be 
> good to be able to send a "signal" to the Fetcher to throttle or un-throttle 
> its bandwidth usage depending on external conditions.
> Job implementations could react to these messages either by implementing a 
> method, or by registering a listener, whichever seems more natural.
> I'm not quite sure how to go about implementing it. I guess this would have 
> to be a part of TaskUmbilicalProtocol, but my knowledge here is a bit fuzzy 
> ... ;) Comments are welcome.
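
The "registering a listener" option from the description could look roughly 
like the sketch below. This is only an illustration: SignalListener and 
SignalDispatcher are hypothetical names, not part of Hadoop or of 
TaskUmbilicalProtocol, and how a signal would actually reach the task is left 
open.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Task-side fan-out of received signals to registered listeners. Hypothetical API. */
class SignalDispatcher {

    /** Callback a job implementation would register; name and shape are assumptions. */
    interface SignalListener {
        void onSignal(String name, String payload);
    }

    // Copy-on-write so listeners can be registered while a dispatch is in progress.
    private final List<SignalListener> listeners = new CopyOnWriteArrayList<>();

    void register(SignalListener listener) {
        listeners.add(listener);
    }

    /** Delivers one signal to every registered listener, in registration order. */
    void dispatch(String name, String payload) {
        for (SignalListener listener : listeners) {
            listener.onSignal(name, payload);
        }
    }
}
```

A Fetcher-like job could register a listener that lowers its bandwidth limit 
on a "throttle" signal and stops reading input on a "finish" signal.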



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
