-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11124/#review21381
-----------------------------------------------------------



hadoop/mesos/src/java/org/apache/hadoop/mapred/MesosScheduler.java
<https://reviews.apache.org/r/11124/#comment44317>

    There are races with your cancel() calls:
    
    1. jobUpdated has scheduler locked.
    2. The timer event is fired, and is now blocked on acquiring the 
synchronized lock in killTracker.
    3. jobUpdated cancels the timer.
    4. killTracker is called once jobUpdated releases the lock, and removes an 
already removed tracker.
    
    It seems you should be defensive against that case ignoring already removed 
trackers, or trackers that are considered active.



hadoop/mesos/src/java/org/apache/hadoop/mapred/MesosScheduler.java
<https://reviews.apache.org/r/11124/#comment44314>

    Seems like you have a race here:
    
    if (!active) {
      // But now the scheduler class sets it to active!
      // This calls into killTracker regardless, killing an active TaskTracker.
      scheduler.killTracker(tracker);
    }
    
    Seems like this whole block can be synchronized on the scheduler?
    
    Also, any reason to not cancel the timer when trackers register?



hadoop/mesos/src/java/org/apache/hadoop/mapred/MesosScheduler.java
<https://reviews.apache.org/r/11124/#comment44315>

    Any reason behind not using an Anonymous class here instead of declaring 
TrackerTimer?


- Ben Mahler


On June 3, 2013, 5:59 p.m., Brenden Matthews wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11124/
> -----------------------------------------------------------
> 
> (Updated June 3, 2013, 5:59 p.m.)
> 
> 
> Review request for mesos.
> 
> 
> Description
> -------
> 
> Kill tasks that never properly launch.
> 
> After trying to launch a task tracker, we'll wait up to 5 minutes before
> giving up and killing the task.
> 
> Review: https://reviews.apache.org/r/11124
> 
> 
> Diffs
> -----
> 
>   hadoop/mesos/src/java/org/apache/hadoop/mapred/MesosScheduler.java 
> afe401f5265e3d9494af7eace42eec45943184a3 
> 
> Diff: https://reviews.apache.org/r/11124/diff/
> 
> 
> Testing
> -------
> 
> Used in production at airbnb.
> 
> 
> Thanks,
> 
> Brenden Matthews
> 
>

Reply via email to