[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203473#comment-17203473
]
Eric Payne commented on YARN-9809:
--
I have committed this to branch-3.3 and branch-3.2. It looks like
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203346#comment-17203346
]
Eric Payne commented on YARN-9809:
--
The latest branch-3.2 precommit build looks fine. The unit test
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202448#comment-17202448
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202390#comment-17202390
]
Eric Payne commented on YARN-9809:
--
Version 009 LGTM. +1
> NMs should supply a health status when
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202388#comment-17202388
]
Eric Payne commented on YARN-9809:
--
Thanks a lot, [~ebadger] for the backport, and thank you
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202352#comment-17202352
]
Jim Brennan commented on YARN-9809:
---
Thanks for fixing the test [~ebadger]!
+1 for
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202314#comment-17202314
]
Eric Badger commented on YARN-9809:
---
So close. Those pesky unit tests. Patch 009 fixes the unit test
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202289#comment-17202289
]
Jim Brennan commented on YARN-9809:
---
Thanks [~ebadger] for the updated patch! I am +1 on
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201900#comment-17201900
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201814#comment-17201814
]
Eric Badger commented on YARN-9809:
---
I've attached branch-3.2 patch 008 to address your comments,
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201782#comment-17201782
]
Eric Badger commented on YARN-9809:
---
{noformat}
RMNodeImpl#AddNodeTransition#transition
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201136#comment-17201136
]
Jim Brennan commented on YARN-9809:
---
I finished a first pass. Here are my comments:
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201116#comment-17201116
]
Eric Badger commented on YARN-9809:
---
Thanks for the initial reviews, [~epayne] and [~Jim_Brennan]! I
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201098#comment-17201098
]
Jim Brennan commented on YARN-9809:
---
Thanks [~ebadger] for putting up a branch-3.2 patch! I am still
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201061#comment-17201061
]
Eric Payne commented on YARN-9809:
--
Thanks a lot [~ebadger] for putting up the 3.2 backport patch. I'm
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199792#comment-17199792
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199707#comment-17199707
]
Eric Badger commented on YARN-9809:
---
[~epayne], [~Jim_Brennan], sorry for the delay. I have put up a
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190724#comment-17190724
]
Jim Brennan commented on YARN-9809:
---
No objection to backporting.
> NMs should supply a health status
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190414#comment-17190414
]
Eric Payne commented on YARN-9809:
--
[~ebadger], this doesn't backport cleanly to 3.2. Would you mind
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190392#comment-17190392
]
Eric Payne commented on YARN-9809:
--
Unless there are objections, I would like to backport this to 3.1.
>
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148914#comment-17148914
]
Hudson commented on YARN-9809:
--
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18394 (See
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148906#comment-17148906
]
Eric Badger commented on YARN-9809:
---
Thanks, [~eyang] for the review and commit and [~Jim_Brennan] for
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148829#comment-17148829
]
Eric Badger commented on YARN-9809:
---
Thanks for the review, [~eyang]! Are you planning on committing
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17148066#comment-17148066
]
Eric Yang commented on YARN-9809:
-
+1 for patch 007. Tested both healthy and unhealthy health check
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146522#comment-17146522
]
Eric Badger commented on YARN-9809:
---
Thanks, [~Jim_Brennan]! [~eyang], would you take another look?
>
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17146402#comment-17146402
]
Jim Brennan commented on YARN-9809:
---
Thanks for the update [~ebadger]!
I am +1 (non-binding) on patch
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17145981#comment-17145981
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17145904#comment-17145904
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17145893#comment-17145893
]
Eric Badger commented on YARN-9809:
---
Good catch, [~Jim_Brennan]. {{updateMetricsForRejoinedNode()}} is
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17145869#comment-17145869
]
Jim Brennan commented on YARN-9809:
---
Thanks for the updates [~ebadger]! I have one comment on the new
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17145717#comment-17145717
]
Eric Badger commented on YARN-9809:
---
The TestFairScheduler and TestFairSchedulerPreemption test failures
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17145711#comment-17145711
]
Eric Badger commented on YARN-9809:
---
Patch 006 moves
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144632#comment-17144632
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17144544#comment-17144544
]
Eric Badger commented on YARN-9809:
---
Thanks for the review, [~Jim_Brennan]! I've uploaded patch 005 to
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140841#comment-17140841
]
Jim Brennan commented on YARN-9809:
---
Thanks for the patch [~ebadger]! Overall I think the design and
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139516#comment-17139516
]
Jim Brennan commented on YARN-9809:
---
I have started reviewing the patch, but I will need more time. I
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139511#comment-17139511
]
Eric Yang commented on YARN-9809:
-
[~ebadger] [~Jim_Brennan] I agree that health check script handling is
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139451#comment-17139451
]
Jim Brennan commented on YARN-9809:
---
[~eyang], [~ebadger] changing the behavior of health-check scripts
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138899#comment-17138899
]
Eric Badger commented on YARN-9809:
---
I can see pros and cons to both approaches. On the one hand, if the
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138894#comment-17138894
]
Eric Yang commented on YARN-9809:
-
[~ebadger] Sorry, my statement was not clear. If the script name is
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138892#comment-17138892
]
Eric Badger commented on YARN-9809:
---
{noformat:title=NodeHealthScriptRunner.newInstance()}
if
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138853#comment-17138853
]
Eric Yang commented on YARN-9809:
-
[~Jim_Brennan] Thank you for the instruction. I updated my check
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138833#comment-17138833
]
Jim Brennan commented on YARN-9809:
---
[~eyang] I believe the health check script output must contain a
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17138685#comment-17138685
]
Eric Yang commented on YARN-9809:
-
[~ebadger] Thank you for the patch. The patch looks very close to
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17137934#comment-17137934
]
Eric Badger commented on YARN-9809:
---
{noformat}
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136194#comment-17136194
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136095#comment-17136095
]
Eric Badger commented on YARN-9809:
---
Patch 004 fixes checkstyle. There is still the javac error with
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134646#comment-17134646
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128817#comment-17128817
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128686#comment-17128686
]
Eric Badger commented on YARN-9809:
---
Attaching patch 002 to address unit test failures
> NMs should
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127215#comment-17127215
]
Hadoop QA commented on YARN-9809:
-
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem ||
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127025#comment-17127025
]
Eric Badger commented on YARN-9809:
---
Patch 001 adds the feature but makes it opt-in via the config
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110361#comment-17110361
]
Jim Brennan commented on YARN-9809:
---
[~ccondit] I agree that a config to allow one to opt-in to this
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110356#comment-17110356
]
Craig Condit commented on YARN-9809:
Since health check scripts are by nature different for every
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108617#comment-17108617
]
Eric Yang commented on YARN-9809:
-
[~Jim_Brennan] This feature is a great addition to make admin task
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108576#comment-17108576
]
Jim Brennan commented on YARN-9809:
---
I would like to revive this discussion. We have this implemented
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922542#comment-16922542
]
Eric Badger commented on YARN-9809:
---
bq. Although it is good to have a way to prevent scheduling
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921761#comment-16921761
]
Eric Yang commented on YARN-9809:
-
[~ebadger] LocalDirsHandlerService checkDir is a timer task. It is low
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921754#comment-16921754
]
Eric Badger commented on YARN-9809:
---
bq. It is unlikely to determine unhealthy status until at least one
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921751#comment-16921751
]
Eric Yang commented on YARN-9809:
-
[~ebadger] It is unlikely to determine unhealthy status until at least
[
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921741#comment-16921741
]
Eric Badger commented on YARN-9809:
---
I propose adding a health status field to the NM-RM registration