Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/7276#issuecomment-123487912
  
    I took a medium-level pass, not too detailed, and the overall approach of 
moving things into the event loop looks good. However, I have some high-level 
comments.
    
    1. In the code refactoring, why did you move all of `registerReceiver`, 
`deregisterReceiver`, etc. into the Endpoint class? That moves a lot of code 
around, making the change hard to review. It also makes the overall structure 
more complex: some functionality lives in the parent ReceiverTracker class and 
some in the Endpoint class, so it is hard to reason about what goes where in 
the future. The earlier structure had the goal that all the functionality lives 
in the ReceiverTracker, and the Endpoint class is just the event loop that 
accesses that functionality. I think it is more structured if that is 
maintained.
    
    2. I think the scheduling policy needs to be thought through. We may need 
two steps: first, a bulk scheduling pass at the beginning that tries to 
distribute the receivers properly (n receivers on n executors) by setting 
preferred locations on all of them; then, every receiver that needs to be 
restarted is scheduled one at a time. The schedulingPolicy can be changed to 
take an array of receiverIds to accommodate this: initially all the 
receiverIds are sent and the locations of all of them are returned, and after 
that they come one at a time. How does that sound?
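
    To make point 2 concrete, here is a rough sketch of what such a two-step 
policy interface could look like. This is purely illustrative: the trait and 
method names (`ReceiverSchedulingPolicy`, `scheduleReceivers`, 
`rescheduleReceiver`) and the round-robin strategy are assumptions for the 
sake of discussion, not the actual implementation in this PR.

    ```scala
    // Hypothetical two-step scheduling API: a bulk pass over all receivers
    // at startup, plus a single-receiver path for restarts.
    trait ReceiverSchedulingPolicy {
      /** Bulk step: assign preferred locations for all receivers at once. */
      def scheduleReceivers(
          receiverIds: Seq[Int],
          executors: Seq[String]): Map[Int, Seq[String]]

      /** Restart step: pick preferred locations for one receiver. */
      def rescheduleReceiver(
          receiverId: Int,
          executors: Seq[String]): Seq[String]
    }

    /** Illustrative round-robin policy: receiver i goes to executor i % n. */
    class RoundRobinPolicy extends ReceiverSchedulingPolicy {
      override def scheduleReceivers(
          receiverIds: Seq[Int],
          executors: Seq[String]): Map[Int, Seq[String]] =
        receiverIds.zipWithIndex.map { case (id, i) =>
          id -> Seq(executors(i % executors.size))
        }.toMap

      override def rescheduleReceiver(
          receiverId: Int,
          executors: Seq[String]): Seq[String] =
        Seq(executors(receiverId % executors.size))
    }
    ```

    With this shape, the ReceiverTracker would call `scheduleReceivers` once 
at startup with every receiverId, and `rescheduleReceiver` each time a single 
receiver has to be restarted.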


