Github user mccheah commented on the pull request:
https://github.com/apache/spark/pull/9164#issuecomment-149304490
Thanks for picking up some things we missed!
"In YarnAllocator.scala, line 447: what does "exited normally" mean here?
I'm hoping to improve the comment to be clearer to those less familiar with
YARN." - "exited normally" covers the case where an executor exits not because of
the user's code or the job itself, but because of events that occur during the
YARN application lifecycle. In other words, the executor does not exit because
of a problem, per se, but as part of "normal" operation. The definition of
"normal" is, of course, still vague.
"In SparkDeploySchedulerBackend, the original commit seems to have changed
the behavior..." - this might unfortunately be a bug in how I refactored
things. How would this manifest? @kayousterhout, would it be possible for you
to reproduce it and fix it if necessary?
Changing JsonProtocol seems fine.
"For the case where the commit failed, we never count that towards the max
failures. Is there any legitimate reason the commit could fail? I'm wondering if
it's possible to have a stage that fails infinitely many times due to a real
problem with committing the task." - the TaskCommitDenied failure event is
explicitly coordinated by the driver, and is fired only if the driver saw that
a previous copy of this task had already started committing its output before
this one. This is relevant in speculative execution mode. However,
TaskCommitDenied isn't fired if the commit itself fails. See
SparkHadoopMapRedUtil.commitTask.
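
To make the distinction concrete, here is a rough sketch of the idea in Python. It is not the Spark code: CommitCoordinator, CommitDenied, and commit_task are hypothetical names for illustration, and the sketch assumes a simple "first attempt to ask wins" policy. It also ignores what happens if the authorized attempt later fails.

```python
class CommitDenied(Exception):
    """Hypothetical stand-in for a driver-coordinated commit denial."""


class CommitCoordinator:
    """Hypothetical driver-side arbiter: the first attempt to ask for a
    given (stage, partition) is authorized, every other attempt is denied."""

    def __init__(self):
        # (stage, partition) -> attempt number that was authorized
        self._authorized = {}

    def can_commit(self, stage, partition, attempt):
        winner = self._authorized.setdefault((stage, partition), attempt)
        return winner == attempt


def commit_task(coordinator, stage, partition, attempt, do_commit):
    """Ask the coordinator first, then perform the actual commit."""
    if not coordinator.can_commit(stage, partition, attempt):
        # Denial is a deliberate decision by the driver, not an error in
        # the commit itself.
        raise CommitDenied(f"stage={stage} partition={partition} attempt={attempt}")
    # If the commit itself fails, its own exception propagates instead of
    # CommitDenied, so it is reported as an ordinary task failure.
    do_commit()
```

With this shape, a speculative duplicate that loses the race gets CommitDenied, while an authorized attempt whose commit throws surfaces the underlying error instead.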