Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/9946
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user yhuai commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-175092963
Can you make a comment to give a summary of the discussion? I am not sure
that the commit message has the enough info.
---
If your project is set up for it, you can
Github user zhuoliu commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-175121155
Sure.
A brief summary is that: (picked a few conclusive points from above):
"At the point we call System.exit here all user code is done and we are
Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173644365
so I haven't seen any use cases this would break.
I would argue is they are relying on this behavior its a bug in the user
code. They should be using shut
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173645684
I don't think this is quite addressing my point, but at best I'm asserting
there's a choice between which "bad" behavior you want to deal with, not that
either behavior
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173664570
Sure, but it becomes much more likely to bite if you always kill the
threads immediately, if they're still running, rather than have them killed a
little bit later by
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173666801
> Are you saying YARN immediately kills the container here?
I'm saying that it's the user's fault if his application depends on that
non-predictable behavior.
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173660845
Sorry Sean, I don't see how we're not addressing your comment. Without the
change, the behavior you're concerned about *already exists*, because YARN
kills containers
Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173671561
@zhuoliu please re-open this.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173691624
I'm on board with this if it's true that YARN virtually immediately kills a
JVM like this if it's not done by the time the NM thinks it's done. Then indeed
regardless of
Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173702746
+1. I'll give this a bit for others to look at and make sure we are done
discussing.
---
If your project is set up for it, you can reply to this email and have your
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173703463
I still would like to see a better commit message (the link to JIRA is
implied by the title so is redundant in a commit message).
---
If your project is set up for it,
Github user zhuoliu commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173706813
Hi @vanzin , do we want to amend the commit message to something like this?
"Call system.exit explicitly to make sure non-daemon user threads
terminate.
Without
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173733049
Huh, no. You have to edit the very first comment on *this page*, not fix
the commit message on your github branch.
---
If your project is set up for it, you can reply
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173737599
Great, thanks! LGTM.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173737937
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173737934
Merged build finished. Test FAILed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user zhuoliu commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173732459
Thanks @vanzin , commit message updated.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user zhuoliu commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173737073
Sorry for that. Just updated the first comment and changed the commit
message back to original.
---
If your project is set up for it, you can reply to this email and
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173740609
**[Test build #49891 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49891/consoleFull)**
for PR 9946 at commit
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173707549
@zhuoliu that sounds great. Just edit your very first comment (that's the
commit message).
---
If your project is set up for it, you can reply to this email and have
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173761688
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173761686
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173761532
**[Test build #49891 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49891/consoleFull)**
for PR 9946 at commit
Github user zhuoliu commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173686628
Now reopen.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
GitHub user zhuoliu reopened a pull request:
https://github.com/apache/spark/pull/9946
[SPARK-10911] Executors should System.exit on clean shutdown.
https://issues.apache.org/jira/browse/SPARK-10911
You can merge this pull request into a Git repository by running:
$ git pull
Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173299141
Also note System.exit still goes through the normal JVM shutdown hooks,
etc.
---
If your project is set up for it, you can reply to this email and have your
reply
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173273526
@zhuoliu OK though you will have to close it
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173298943
Just to point out here I may re-open this. I would still rather fix this,
a known bug that I can reproduce then worry about a theoretical it might break
something.
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173300190
I'm not against this change; in fact, this is already the current behavior
in healthy YARN installs (since YARN will kill your container). I just wanted
the change
Github user zhuoliu closed the pull request at:
https://github.com/apache/spark/pull/9946
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user zhuoliu commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173274389
Thanks, closed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173347224
@andrewor14 @srowen thoughts or concerns?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173399004
> It should ideally be so, but, what happens when it isn't?
What happens is that the user code will die anyway because YARN will kill
the container. The explicit
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173402547
Actually, what's the difference between letting `main` complete normally
(in which case `java` already exits with status 0 right?) and exiting
explicitly at the end of
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173397974
I disagree with: "At the point we call System.exit here all user code is
done and we are terminating" It should ideally be so, but, what happens when it
isn't? A user
Github user zhuoliu commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173268512
@vanzin @srowen @andrewor14 , I think your concerns make sense. Also,
adding timeout might have another issue that resources cannot be released
immediately even if
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173404935
> You still get shutdown hooks but System.exit kills the non-daemon threads?
You get shutdown hooks when the JVM exits (nor matter how, except `kill -9`
of
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173414518
I'm not suggesting that behavior is correct, but equally you could say an
app that never terminates is doing something wrong or a YARN that doesn't stop
it is doing
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-173415101
> a YARN that doesn't stop it is doing something wrong
But that's the case. It's a bug in YARN. See
https://issues.apache.org/jira/browse/HADOOP-12441.
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-164602831
I agree with @srowen. Just ending the executor can pose regressions in
behavior that are difficult to debug. If the concern is that executor processes
are undying,
Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-161754892
@vanzin So the bug is one thing it can actually happen in other ways too.
For instance if someone messes up the node while upgrading the NM during
rolling upgrade.
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-161758624
Sure, it's more for compleness; explaining why you need to explicitly exit,
since I'd expect the cluster manager to kill those containers when the app
finally finishes,
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-161753715
@zhuoliu given the discussion in SPARK-10911, could you write a proper PR
description and reference HADOOP-12441, which is why this is needed on the YARN
side?
---
If
Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-161308748
Well the users shouldn't have been using non-daemon threads in the first
place, they spawned them while a task was running and then never cleaned them
up. We asked
Github user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-161106779
@srowen do you know of any actual use cases this will break? We've
finished running the user code and are exiting anyway so things should just be
shutting down. At
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-16303
But isn't the scenario here that the user app isn't done, because it
spawned non-daemon threads that are doing something? I agree it's not good
practice, but if apps
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-159542573
See the JIRA. I still don't believe this is safe. At least, causes more
problems than it seems to solve.
---
If your project is set up for it, you can reply to this
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-159443182
**[Test build #46627 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46627/consoleFull)**
for PR 9946 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-159443343
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-159443345
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-159416698
Jenkins, this is ok to test.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not
GitHub user zhuoliu opened a pull request:
https://github.com/apache/spark/pull/9946
[SPARK-10911] Executors should System.exit on clean shutdown.
https://issues.apache.org/jira/browse/SPARK-10911
You can merge this pull request into a Git repository by running:
$ git pull
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-159409786
Can one of the admins verify this patch?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/9946#issuecomment-159417588
**[Test build #46627 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46627/consoleFull)**
for PR 9946 at commit
55 matches
Mail list logo