Github user kayousterhout commented on the pull request:

    https://github.com/apache/spark/pull/2746#issuecomment-59280725
  
    @andrewor14 this looks cool!  Two high level comments:
    
    (1) Right now, there are a bunch of different kinds of timers that can be 
set and then triggered later.  Is it possible to do something much simpler, 
where you just have a single timer, and when the timer goes off, you check all 
of the state to see what should happen next? For example, for executor removal, 
you could have a map of executors to when they were last used (or sorted list 
based on when they were last used?) and remove any that haven't been used for a 
threshold amount of time.  As I read the code, I was wondering if this would 
significantly simplify a bunch of the code, with negligible performance impact 
(?).
    
    (2) I'm a little concerned about the changes to the scheduler code.  In 
general, the scheduler has evolved to this point where the interface between 
the various scheduler components is very messy, and it's really hard to 
describe (does anyone even know?!) what happens in which part of the scheduler. 
 The way you've done this right now adds some logic in the task set manager (to 
determine when something has changed on the level of the task set, in which 
case the task scheduler needs to be notified) and to the task scheduler 
implementation, to determine when something has changed across all task sets, 
in which case it needs to notify the Allocation Manager.
    
    Based on a cursory look, it looks like potentially *all* of the logic added 
to the task scheduler impl could instead be done in the Allocation Manager (by 
moving all of the data structures you added in lines 118 - 132 to the 
Allocation Manager).  I'd prefer this because then the Allocation Manager can 
have a very narrow interface that shields the scheduler from a lot of 
complexity.
    
    In this case, you could pass a reference to Allocation Manager to each TSM, 
which would call a "newPendingTask" method on the Allocation Manager (for 
example).  Similar to with the TaskSchedulerImpl, I think most of the 
functionality in the TSM related to the allocation manager code could be pushed 
into the allocation manager, so that the TSM doesn't have to handle any of the 
complexity of tracking new information itself -- instead, it would just call a 
method on the Allocation Manager when tasks get started / finish.
    
    One potential issue I see with pushing all of the functionality from the 
TSM into the allocation manager is that right now, the TSM can drop some 
messages from getting sent back through the scheduler / allocation manager -- 
like when a task finishes on an executor, but there are other tasks for the 
task set still running there, so the allocation manager doesn't need to know 
about it.  I'd guess this isn't a huge performance hit -- the complexity of all 
of this logic is still O(number of tasks being scheduled) and is merited by the 
simplicity benefit of having the AllocationManager do everything.
    
    At the end of the day, it seems like the AllocationManager needs to know 
when a task starts and when one finishes.  So, can it just implement the 
listener interface, and the scheduler doesn't directly interact with it at all? 
 It looks like you mentioned this idea in your design doc, so I'm wondering if 
you stumbled upon some problem when you tried doing this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to