Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4055#discussion_r33967596
  
    --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
    @@ -305,7 +305,7 @@ private[spark] class MapOutputTrackerMaster(conf: 
SparkConf)
     
         if (mapStatuses.contains(shuffleId)) {
           val statuses = mapStatuses(shuffleId)
    -      if (statuses.nonEmpty) {
    +      if (statuses.nonEmpty && statuses.exists(_ != null)) {
    --- End diff --
    
    hmm, I think I see what you mean, but this seems totally separate to the 
main issue.  I have run into something similar when working on other 
`DAGScheduler` fixes, but it seems like mostly the preconditions of this method 
are violated by the `DAGScheduler` which is why we end up with this exception.
    
    Maybe it should be fixed in any case, just to have better error messages.  
@shivaram do you want to take a look since you added this method?  As 
@suyanNone notes, the original `if (statuses.nonEmpty)` is a pointless test, 
its guaranteed to be non-empty the way data is inserted into `mapStatuses`.
    
    I also don't think adding `statuses.exists(_ != null)` is the right fix -- 
that just checks that there is *some* non-null status, but if one status is 
null you'll still get an NPE on `status.getSizeForBlock(reducerId)`.  We could 
change to `statuses.forall(_ != null)` ... but I'm still not sure what the 
right behavior should be if there is a `null` status.  Should this method just 
return `None`?  Or does it make more sense for it throw an exception, that a 
precondition has been violated?
    
    overall I do think this is not the main point of this issue, and if we do 
want to make a change, it should probably be in a separate issue (just b/c 
`DAGScheduler` changes are so hard to deal w/ its best to keep the PRs as 
limited in scope as possible.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to