[jira] Commented: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters

Todd Lipcon (JIRA) Tue, 09 Feb 2010 12:17:52 -0800

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831644#action_12831644 ]


Todd Lipcon commented on MAPREDUCE-1266:
----------------------------------------

bq. if you are using jvm reuse, then that 1s disappears, right? 

Not really, since JVM reuse doesn't reuse between maps and reduces.

The time sequence of a small job looks like:

Client:
  Submit job
JT:
  Create tasks ("initialize job") on JT
  wait for a TT to heartbeat
TT:
  start JVM
child:
  process map task
TT:
  send accelerated heartbeat once map task is complete (I forget whether this
  is in 0.20 or came later)
  receive reduce task, start reduce JVM (regardless of JVM reuse)
child:
  process reduce task
TT:
  send completion heartbeat

I guess there are also some setup/cleanup tasks going on in there as well. 
Since we're talking about a hypothetical job with one map and one reduce, a 
shorter heartbeat interval only cuts down the time between initializing the 
job and getting the first JVM started on a TT.

In a multi-mapper or multi-reducer job, the cost shows up in how long it takes 
for all of the tasks to get scheduled, because some schedulers assign only one 
task per heartbeat. The fair scheduler after MAPREDUCE-706 can assign multiple 
tasks in a single heartbeat, which should help substantially.
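
To put rough numbers on that scheduling cost, here is a minimal sketch in 
Java. It is a back-of-the-envelope model, not JobTracker code: the class and 
method names are made up for illustration, it ignores per-TT slot limits and 
setup/cleanup tasks, and it assumes every TT heartbeats exactly once per 
interval.

```java
/** Hypothetical model: upper bound on the time to hand out every task of a job. */
final class HeartbeatSchedulingSketch {

    /**
     * Number of heartbeat rounds needed to assign all tasks, where one round
     * means every tracker has sent one heartbeat.
     */
    static int roundsToAssignAll(int tasks, int trackers, int tasksPerHeartbeat) {
        if (tasks < 0 || trackers <= 0 || tasksPerHeartbeat <= 0) {
            throw new IllegalArgumentException("need tasks >= 0, trackers > 0, tasksPerHeartbeat > 0");
        }
        int assignedPerRound = trackers * tasksPerHeartbeat;
        // Ceiling division: a partially filled round still costs a full heartbeat.
        return (tasks + assignedPerRound - 1) / assignedPerRound;
    }

    /** Upper bound on the delay until the last task is assigned, in milliseconds. */
    static long worstCaseSchedulingMillis(int tasks, int trackers,
                                          int tasksPerHeartbeat, long heartbeatMillis) {
        return (long) roundsToAssignAll(tasks, trackers, tasksPerHeartbeat) * heartbeatMillis;
    }

    public static void main(String[] args) {
        // Pseudo-distributed cluster (one TT), job with 8 tasks.
        System.out.println(worstCaseSchedulingMillis(8, 1, 1, 3000)); // 24000
        System.out.println(worstCaseSchedulingMillis(8, 1, 4, 3000)); // 6000
        System.out.println(worstCaseSchedulingMillis(8, 1, 1, 500));  // 4000
    }
}
```

Under these assumptions, either assigning several tasks per heartbeat or 
shortening the interval brings the bound down from 24 seconds to a few 
seconds, which is why the two changes are complementary.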

> Allow heartbeat interval smaller than 3 seconds for tiny clusters
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1266
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1266
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker, task, tasktracker
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Priority: Minor
>
> For small clusters, the heartbeat interval has a large effect on job latency. 
> This is especially true on pseudo-distributed or other "tiny" (<5 nodes) 
> clusters. It's not a big deal for production, but new users would have a 
> happier first experience if Hadoop seemed snappier.
> I'd like to change the minimum heartbeat interval from 3.0 seconds to perhaps 
> 0.5 seconds (but have it governed by an undocumented config parameter in case 
> people don't like this change). The cluster size-based ramp up of interval 
> will maintain the current scalable behavior for large clusters with no 
> negative effect.
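
The interaction between a lower minimum and a size-based ramp up can be 
sketched as below. This is an illustrative model only, not the JobTracker's 
actual formula: the class name, the method, and the idea of a target number 
of heartbeats per second are assumptions made for the example.

```java
/** Hypothetical model of a cluster-size-based heartbeat interval with a floor. */
final class HeartbeatIntervalSketch {

    /**
     * Picks an interval so that the whole cluster sends roughly
     * targetHeartbeatsPerSecond heartbeats per second, but never goes below
     * minIntervalMillis.
     */
    static long intervalMillis(int clusterSize, long minIntervalMillis,
                               int targetHeartbeatsPerSecond) {
        if (clusterSize < 0 || minIntervalMillis <= 0 || targetHeartbeatsPerSecond <= 0) {
            throw new IllegalArgumentException("invalid arguments");
        }
        // Ramp up linearly with cluster size to keep the JT load bounded.
        long scaled = clusterSize * 1000L / targetHeartbeatsPerSecond;
        // The floor only matters while the scaled value is below it.
        return Math.max(scaled, minIntervalMillis);
    }

    public static void main(String[] args) {
        int target = 100; // assumed JT budget: 100 heartbeats per second
        for (int size : new int[] {1, 5, 100, 1000}) {
            System.out.println(size + " nodes: floor 3000 ms -> "
                + intervalMillis(size, 3000, target) + " ms, floor 500 ms -> "
                + intervalMillis(size, 500, target) + " ms");
        }
    }
}
```

In this model a 1000-node cluster gets the same 10 second interval with 
either floor, while a tiny cluster drops from 3 seconds to 0.5 seconds, which 
is the behavior the description above is asking for.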

