Github user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/4055#discussion_r33967596
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -305,7 +305,7 @@ private[spark] class MapOutputTrackerMaster(conf:
SparkConf)
if (mapStatuses.contains(shuffleId)) {
val statuses = mapStatuses(shuffleId)
- if (statuses.nonEmpty) {
+ if (statuses.nonEmpty && statuses.exists(_ != null)) {
--- End diff --
hmm, I think I see what you mean, but this seems totally separate to the
main issue. I have run into something similar when working on other
`DAGScheduler` fixes, but it seems like mostly the preconditions of this method
are violated by the `DAGScheduler` which is why we end up with this exception.
Maybe it should be fixed in any case, just to have better error messages.
@shivaram do you want to take a look since you added this method? As
@suyanNone notes, the original `if (statuses.nonEmpty)` is a pointless test,
its guaranteed to be non-empty the way data is inserted into `mapStatuses`.
I also don't think adding `statuses.exists(_ != null)` is the right fix --
that just checks that there is *some* non-null status, but if one status is
null you'll still get an NPE on `status.getSizeForBlock(reducerId)`. We could
change to `statuses.forall(_ != null)` ... but I'm still not sure what the
right behavior should be if there is a `null` status. Should this method just
return `None`? Or does it make more sense for it throw an exception, that a
precondition has been violated?
overall I do think this is not the main point of this issue, and if we do
want to make a change, it should probably be in a separate issue (just b/c
`DAGScheduler` changes are so hard to deal w/ its best to keep the PRs as
limited in scope as possible.)
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]