> On May 6, 2013, 8:40 p.m., Ben Mahler wrote:
> > Awesome, thanks Brenden! There are currently cases in mesos where tasks are
> > lost and the updates don't make it to the scheduler, is that what you were
> > seeing?
> >
> > Are you able to do this with a java.util.Timer instead? We can schedule the
> > kill operation on each launched task (you'll want to pass in the Driver as
> > well).
>
> Brenden Matthews wrote:
>     I don't remember exactly what was happening with this one, but it sounds
>     like you summed it up correctly. There are a lot of edge cases, many of
>     which are beyond the control of Mesos, and this helps to catch some of those.
>
> Ben Mahler wrote:
>     This definitely makes the Hadoop Scheduler more robust! Let me know when
>     you've updated to use java.util.Timer to asynchronously kill the task. I can
>     provide more pointers if needed :)

Here's the new patch:

http://ompldr.org/vaWNhaw/0015-Kill-tasks-that-never-properly-launch.patch

All patches:

http://ompldr.org/vaWNhag/patches.tar.bz2


- Brenden


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10931/#review20227
-----------------------------------------------------------


On May 7, 2013, 7:23 p.m., Brenden Matthews wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10931/
> -----------------------------------------------------------
>
> (Updated May 7, 2013, 7:23 p.m.)
>
>
> Review request for mesos.
>
>
> Description
> -------
>
> From 5edb18565e298f1225d46ea4c630567db7238a77 Mon Sep 17 00:00:00 2001
> From: Brenden Matthews <[email protected]>
> Date: Thu, 2 May 2013 16:50:53 -0700
> Subject: [PATCH 15/19] Kill tasks that never properly launch.
>
> After trying to launch a task tracker, we'll wait up to 5 minutes before
> giving up and killing the task.
> ---
>  .../org/apache/hadoop/mapred/MesosScheduler.java   |   33 ++++++++++++++++++--
>  1 file changed, 31 insertions(+), 2 deletions(-)
>
>
> Diffs
> -----
>
>   hadoop/mesos/src/java/org/apache/hadoop/mapred/MesosScheduler.java afe401f
>
> Diff: https://reviews.apache.org/r/10931/diff/
>
>
> Testing
> -------
>
> Used in production at airbnb.
>
>
> Thanks,
>
> Brenden Matthews
>
>
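
For readers following the thread, the java.util.Timer approach suggested above (schedule a kill for each launched task, give up after a timeout) might look roughly like the sketch below. This is not the contents of the patch. LaunchTimeoutSketch, TaskKiller, onTaskLaunched and onTaskRunning are hypothetical names made up for illustration, and TaskKiller is only a stand-in for whatever kill call the real scheduler makes on the Driver. The timeout is a constructor parameter, so the 5 minutes from the patch description would be passed in as 5 * 60 * 1000.

```java
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch only: kills tasks that never report running within a timeout. */
final class LaunchTimeoutSketch {

    /** Hypothetical stand-in for the driver's kill call. */
    interface TaskKiller {
        void killTask(String taskId);
    }

    // One daemon timer thread services the kill timers for all tasks.
    private final Timer timer = new Timer("launch-timeout", true);
    // Tasks that were launched but have not yet reported running.
    private final Map<String, TimerTask> pending = new ConcurrentHashMap<>();
    private final TaskKiller killer;
    private final long timeoutMs;

    LaunchTimeoutSketch(TaskKiller killer, long timeoutMs) {
        this.killer = killer;
        this.timeoutMs = timeoutMs;
    }

    /** Call right after launching a task: schedules the asynchronous kill. */
    void onTaskLaunched(final String taskId) {
        TimerTask kill = new TimerTask() {
            @Override
            public void run() {
                // Kill only if this timer is still the pending one for the
                // task, i.e. the task never reported running in time.
                if (pending.remove(taskId, this)) {
                    killer.killTask(taskId);
                }
            }
        };
        pending.put(taskId, kill);
        timer.schedule(kill, timeoutMs);
    }

    /** Call when a status update says the task is running: cancels the kill. */
    void onTaskRunning(String taskId) {
        TimerTask kill = pending.remove(taskId);
        if (kill != null) {
            kill.cancel();
        }
    }
}
```

The kill runs on the timer thread rather than on the scheduler callback thread, which is why the pending map is a ConcurrentHashMap and why the timer re-checks the map before killing anything.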
