Github user mateiz commented on the pull request:

    https://github.com/apache/spark/pull/1313#issuecomment-50288916
  
    I see, my point was that we shouldn't try to make optimizations or major 
refactorings here. That makes it much harder to tell whether the code is 
correct, and this code has already been tricky to debug because it's split 
across two methods.
    
    I think you could make it much simpler as follows:
    - Don't add any extra parameters to resourceOffer
    - Call it with NO_PREFS after NODE_LOCAL
    - In resourceOffer, if the locality level passed in is NO_PREFS, don't 
adjust the locality. This is basically what your Boolean is doing, except that 
instead of adding a Boolean, you just check whether the target level is 
NO_PREFS. (In fact, if you'd like, you can even give NO_PREFS a separate 
top-level path in resourceOffer so that the code dealing with it isn't 
scattered throughout. That is, just make the NO_PREFS call return *only* 
no-prefs tasks, not process-local or node-local ones.)
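    To make the suggestion concrete, here is a minimal sketch of the proposed 
control flow. This is not the actual TaskSetManager code; the names (`Task`, 
`resourceOffer`, `offerOrder`, the simplified matching) are all hypothetical 
stand-ins, assuming that the scheduler calls resourceOffer once per locality 
level and that NO_PREFS gets its own top-level path:

    ```scala
    object LocalitySketch {
      sealed trait Locality
      case object PROCESS_LOCAL extends Locality
      case object NODE_LOCAL extends Locality
      case object NO_PREFS extends Locality
      case object RACK_LOCAL extends Locality
      case object ANY extends Locality

      // Hypothetical task: an id plus its preferred locality (None = no preference).
      final case class Task(id: Int, pref: Option[Locality])

      // Sketch of resourceOffer: when the level passed in is NO_PREFS, take a
      // dedicated path that hands out *only* tasks with no locality preference;
      // otherwise fall through to (much-simplified) per-level matching. No extra
      // Boolean parameter is needed -- the level itself carries the information.
      def resourceOffer(pending: List[Task], maxLocality: Locality): Option[Task] =
        maxLocality match {
          case NO_PREFS => pending.find(_.pref.isEmpty)          // only no-prefs tasks
          case level    => pending.find(_.pref.contains(level))  // simplified matching
        }

      // The caller's order of offers, with NO_PREFS right after NODE_LOCAL.
      val offerOrder: List[Locality] =
        List(PROCESS_LOCAL, NODE_LOCAL, NO_PREFS, RACK_LOCAL, ANY)
    }
    ```

    With this shape, a reader can see from the call site alone which levels are 
offered and in what order, rather than having to trace a Boolean flag through 
two methods.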
    
    The reason this is simpler is that every time you introduce parameters, 
people can pass in any combination of those parameters, and in fact in this 
code you only expect one particular combination. It's very hard for any new 
reader of the code to figure that out. The optimization with bottomLocality 
might be useful but it shouldn't be part of this bug fix PR -- this PR should 
just fix the bug. As I said before, I have my doubts that it will make a huge 
impact because I haven't seen this code be a bottleneck in profiles.

