Thanks to both of you for the replies.

I understand that the primary focus is long-running jobs (at least for
now), but that was not my question.  However, if someone wanted to run
map-reduce on shorter jobs (I would not go as far as calling them "real
time" -- that's a different story), I've found that Hadoop works pretty
well, _except_ for those hard-coded sleeps.  So, since I have very limited
experience with Hadoop internals, I was asking whether removing them seems
like a reasonable starting point to accommodate short jobs as well -- or
whether there might be something more important that I'm missing.

More specifically, we have some jobs that can be expressed as multiple
map-reduce passes over a not-so-big (i.e., O(10GB), not O(TB++)) dataset.
We still scan TBs in total (so it's not really "real time" response that
we expect), but that means scanning the same 10GB of data 10-100 times,
with 10-100 map-reduce jobs -- for example, think k-means with each
iteration == one map-reduce job.  With a moderate number of nodes (say
O(50)), each pass takes under a minute.  However, if we have to pay a
penalty of 5+ sec on each pass, scalability suffers.  In some sense, this
is no different from the JVM startup penalty per TIP: I was merely
pointing out that, in the setting I'm considering, the sleep penalty is
no longer paid just once, but on every pass...
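
To make this concrete, our driver is essentially a loop like the one
below -- a simplified, hypothetical sketch (the class name and job setup
are made up; JobClient.runJob() is the actual call we make once per pass):

  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;

  // Hypothetical iterative driver: one map-reduce job per k-means pass
  // over the same ~10GB input.
  public class KMeansDriver {
    public static void main(String[] args) throws Exception {
      int passes = Integer.parseInt(args[0]);  // e.g., 10-100 iterations
      for (int i = 0; i < passes; i++) {
        JobConf conf = new JobConf(KMeansDriver.class);
        conf.setJobName("kmeans-pass-" + i);
        // ... set mapper/reducer and input/output paths for this pass ...
        JobClient.runJob(conf);  // blocks until the pass completes; any
                                 // fixed sleep in the framework is paid
                                 // again on every pass
      }
    }
  }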

Yes -- you are correct; the primary problem was this sleep:
  if (numInFlight == 0 && numScheduled == 0) {
    // we should indicate progress, as we don't want the TT to think
    // we're stuck and kill us
    reporter.progress();
    Thread.sleep(5000);
  }
in ReduceTask.java.  BTW, I tried reducing this to 0.5 sec and then found
other sleeps that were affecting scalability.  I suspect
JobClient#waitForCompletion() is next, but I thought I'd better ask first
about removing those band-aids, rather than keep changing them (to re-use
the metaphor :-).  If you grep for hard-coded sleeps, there are about 33
of them.
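
Just to make the "adaptive delays" idea below concrete: here is an
untested sketch of what might replace that fixed 5-second sleep (the
variable names are borrowed from the snippet above, but the surrounding
loop structure is my assumption, not the actual ReduceTask code):

  // Untested sketch: exponential back-off instead of a fixed 5 sec sleep.
  // Wait briefly at first, doubling the wait while there is nothing to
  // fetch, and cap it at the old hard-coded value.
  long backoffMs = 50;             // initial wait -- a guess
  final long maxBackoffMs = 5000;  // cap at the current fixed value
  while (numInFlight == 0 && numScheduled == 0) {
    reporter.progress();           // keep the TT from killing us
    Thread.sleep(backoffMs);
    backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
  }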

I hope this makes some sense...

Thanks!
Spiros

On Mon, Mar 3, 2008 at 12:43 PM, Amar Kamat <[EMAIL PROTECTED]> wrote:

> I guess the 5 sec you are talking about is when the shuffle phase has
> nothing to fetch.  Seems like a heuristic to me; I guess it's still
> there because no one has raised any issue against it.  Also, on average
> there are a lot of maps, so the shuffle-phase delay is governed by the
> network delay -- hence it doesn't seem like a big issue to me.  The
> other sleeps are on the order of milliseconds (busy waiting).  My
> comment was more about < 1 min jobs.  But in general I think adaptive
> delays will help.  Feel free to raise an issue or comment on JIRA.
> Amar
>   On Mon, 3 Mar 2008, Ted Dunning wrote:
>
> >
> > Hard-coded delays in order to make a protocol work are almost never
> > correct in the long run.  This isn't a function of real-time or
> > batch; it is simply a matter of the fact that hard-coded delays don't
> > scale correctly as problem sizes/durations change.  *Adaptive* delays
> > such as progressive back-off can work correctly under scale changes,
> > but *fixed* delays are almost never correct.
> >
> > Delays may work as a band-aid in the short run, but eventually you
> > have to take the band-aid off.
> >
> >
> > On 3/3/08 8:46 AM, "Amar Kamat" <[EMAIL PROTECTED]> wrote:
> >
> >> Hadoop is not meant for real-time applications.  It's more or less
> >> designed for long-running applications like crawlers/indexers.
> >> Amar
> >> On Mon, 3 Mar 2008, Spiros Papadimitriou wrote:
> >>
> >>> Hi
> >>>
> >>> I'd be interested to know if you've tried to use Hadoop for a large
> >>> number of short jobs.  Perhaps I am missing something, but I've
> >>> found that the hard-coded Thread.sleep() calls, esp. those for 5
> >>> seconds in mapred.ReduceTask (primarily) and mapred.JobClient, cause
> >>> more of a problem than the 0.3 sec or so that it takes to fire up a
> >>> JVM.
> >>>
> >>> Agreed that for long-running jobs that is not a concern, but *if*
> >>> speeding things up for shorter-running jobs (say < 1 min) is a goal,
> >>> then JVM reuse would seem to be a lower priority?  Would doing
> >>> something about those sleep()s seem worthwhile?
> >
> >
>
