[jira] [Updated] (YARN-10751) Add document for yarn log aggregation policies.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10751: -- Description: As discussed in 

[jira] [Commented] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331141#comment-17331141 ] Qi Zhu commented on YARN-10743: --- Thanks [~Jim_Brennan] for commit. I created a following Jira YARN-10751

[jira] [Created] (YARN-10751) Add document for yarn log aggregation policies.

2021-04-23 Thread Qi Zhu (Jira)
Qi Zhu created YARN-10751: - Summary: Add document for yarn log aggregation policies. Key: YARN-10751 URL: https://issues.apache.org/jira/browse/YARN-10751 Project: Hadoop YARN Issue Type:

[jira] [Updated] (YARN-10749) Can't remove all node labels after add node label without nodemanager port, broken by YARN-10647

2021-04-23 Thread Eric Badger (Jira)
[ https://issues.apache.org/jira/browse/YARN-10749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-10749: --- Fix Version/s: 3.2.3 2.10.2 3.1.5 3.3.1

[jira] [Commented] (YARN-10123) Error message around yarn app -stop/start can be improved to highlight that an implementation at framework level is needed for the stop/start functionality to work

2021-04-23 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331028#comment-17331028 ] Szilard Nemeth commented on YARN-10123: --- Thanks [~sahuja] for working on this. Latest patch LGTM,

[jira] [Comment Edited] (YARN-7769) FS QueueManager should not create default queue at init

2021-04-23 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-7769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331020#comment-17331020 ] Szilard Nemeth edited comment on YARN-7769 at 4/23/21, 8:37 PM: Hi

[jira] [Commented] (YARN-7769) FS QueueManager should not create default queue at init

2021-04-23 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-7769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331020#comment-17331020 ] Szilard Nemeth commented on YARN-7769: -- Hi [~bteke], I also checked the code, the latest patch looks

[jira] [Comment Edited] (YARN-10555) missing access check before getAppAttempts

2021-04-23 Thread lujie (Jira)
[ https://issues.apache.org/jira/browse/YARN-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17264001#comment-17264001 ] lujie edited comment on YARN-10555 at 4/23/21, 7:43 PM: again ping->

[jira] [Commented] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330998#comment-17330998 ] Jim Brennan commented on YARN-10743: [~zhuqi] thanks for updating the patch. I agree it would be

[jira] [Commented] (YARN-10691) DominantResourceCalculator isInvalidDivisor should consider only countable resource types

2021-04-23 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330990#comment-17330990 ] Hadoop QA commented on YARN-10691: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330973#comment-17330973 ] Hadoop QA commented on YARN-10743: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem

[jira] [Commented] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330972#comment-17330972 ] Hadoop QA commented on YARN-10732: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-10123) Error message around yarn app -stop/start can be improved to highlight that an implementation at framework level is needed for the stop/start functionality to work

2021-04-23 Thread Benjamin Teke (Jira)
[ https://issues.apache.org/jira/browse/YARN-10123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330947#comment-17330947 ] Benjamin Teke commented on YARN-10123: -- [~sahuja] thanks for working on this. The change looks good

[jira] [Commented] (YARN-9594) Fix missing break statement in ContainerScheduler#handle

2021-04-23 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-9594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330918#comment-17330918 ] Jim Brennan commented on YARN-9594: --- [~xiaoheipangzi] thanks for fixing this. I took the liberty of

[jira] [Updated] (YARN-9594) Fix missing break statement in ContainerScheduler#handle

2021-04-23 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-9594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-9594: -- Fix Version/s: 2.10.2 > Fix missing break statement in ContainerScheduler#handle >

[jira] [Commented] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330917#comment-17330917 ] Peter Bacsko commented on YARN-10732: - [~BilwaST] thanks for your comment - I think this is a

[jira] [Comment Edited] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Bilwa S T (Jira)
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330908#comment-17330908 ] Bilwa S T edited comment on YARN-10732 at 4/23/21, 4:51 PM: [~gandras]

[jira] [Commented] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Bilwa S T (Jira)
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330908#comment-17330908 ] Bilwa S T commented on YARN-10732: -- [~gandras] [~pbacsko] As part of YARN-10260 transitioning from

[jira] [Updated] (YARN-10691) DominantResourceCalculator isInvalidDivisor should consider only countable resource types

2021-04-23 Thread Bilwa S T (Jira)
[ https://issues.apache.org/jira/browse/YARN-10691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated YARN-10691: - Attachment: YARN-10691.001.patch > DominantResourceCalculator isInvalidDivisor should consider only

[jira] [Commented] (YARN-10705) Misleading DEBUG log for container assignment needs to be removed when the container is actually reserved, not assigned in FairScheduler

2021-04-23 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330872#comment-17330872 ] Peter Bacsko commented on YARN-10705: - Thanks for the patch [~sahuja], committed to trunk. >

[jira] [Commented] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330871#comment-17330871 ] Qi Zhu commented on YARN-10743: --- Thanks [~Jim_Brennan] for review. Fixed checkstyle and add document in 

[jira] [Updated] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10743: -- Attachment: YARN-10743.003.patch > Add a policy for not aggregating for containers which are killed because >

[jira] [Commented] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330865#comment-17330865 ] Peter Bacsko commented on YARN-10732: - [~gandras] the old queue state comes from a {{CSQueueStore}}

[jira] [Commented] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330859#comment-17330859 ] Peter Bacsko commented on YARN-10732: - I manually triggered a build and set the status to "Patch

[jira] [Assigned] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko reassigned YARN-10732: --- Assignee: Andras Gyori (was: Peter Bacsko) > Disallow restarting a queue while it is in

[jira] [Assigned] (YARN-10732) Disallow restarting a queue while it is in DRAINING state on CS reinitialization

2021-04-23 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko reassigned YARN-10732: --- Assignee: Peter Bacsko (was: Andras Gyori) > Disallow restarting a queue while it is in

[jira] [Commented] (YARN-10705) Misleading DEBUG log for container assignment needs to be removed when the container is actually reserved, not assigned in FairScheduler

2021-04-23 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330857#comment-17330857 ] Peter Bacsko commented on YARN-10705: - +1 LGTM. > Misleading DEBUG log for container assignment

[jira] [Commented] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330834#comment-17330834 ] Jim Brennan commented on YARN-10743: Thanks for the patch [~zhuqi]! The code looks good to me. Can

[jira] [Commented] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524

2021-04-23 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330811#comment-17330811 ] Szilard Nemeth commented on YARN-10750: --- Thanks [~shuzirra] for reporting this, good catch! As per

[jira] [Assigned] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524

2021-04-23 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth reassigned YARN-10750: - Assignee: Gergely Pollak > TestMetricsInvariantChecker.testManyRuns is broken since

[jira] [Commented] (YARN-10746) RmWebApp add default-node-label-expression to the queue info

2021-04-23 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330806#comment-17330806 ] Szilard Nemeth commented on YARN-10746: --- Thanks [~shuzirra] for working on this, Latest patch LGTM,

[jira] [Commented] (YARN-10654) Dots '.' in CSMappingRule path variables should be replaced

2021-04-23 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330800#comment-17330800 ] Szilard Nemeth commented on YARN-10654: --- Thanks [~pbacsko] for working on this, Patch LGTM as well,

[jira] [Commented] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524

2021-04-23 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330789#comment-17330789 ] Hadoop QA commented on YARN-10750: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem

[jira] [Commented] (YARN-10749) Can't remove all node labels after add node label without nodemanager port, broken by YARN-10647

2021-04-23 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330468#comment-17330468 ] Hadoop QA commented on YARN-10749: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Updated] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524

2021-04-23 Thread Gergely Pollak (Jira)
[ https://issues.apache.org/jira/browse/YARN-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10750: -- Description: HADOOP-17524 removed the metrics: LogFatal LogError LogWarn LogInfo

[jira] [Updated] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524

2021-04-23 Thread Gergely Pollak (Jira)
[ https://issues.apache.org/jira/browse/YARN-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10750: -- Description: HADOOP-17524 removed the metrics: > TestMetricsInvariantChecker.testManyRuns is

[jira] [Updated] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524

2021-04-23 Thread Gergely Pollak (Jira)
[ https://issues.apache.org/jira/browse/YARN-10750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergely Pollak updated YARN-10750: -- Attachment: YARN-10750.001.patch > TestMetricsInvariantChecker.testManyRuns is broken since

[jira] [Created] (YARN-10750) TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524

2021-04-23 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10750: - Summary: TestMetricsInvariantChecker.testManyRuns is broken since HADOOP-17524 Key: YARN-10750 URL: https://issues.apache.org/jira/browse/YARN-10750 Project:

[jira] [Commented] (YARN-10746) RmWebApp add default-node-label-expression to the queue info

2021-04-23 Thread Gergely Pollak (Jira)
[ https://issues.apache.org/jira/browse/YARN-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330380#comment-17330380 ] Gergely Pollak commented on YARN-10746: --- Test failure is unrelated, it fails without my patch as

[jira] [Commented] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330372#comment-17330372 ] Hadoop QA commented on YARN-10743: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem

[jira] [Commented] (YARN-10749) Can't remove all node labels after add node label without nodemanager port, broken by YARN-10647

2021-04-23 Thread D M Murali Krishna Reddy (Jira)
[ https://issues.apache.org/jira/browse/YARN-10749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330348#comment-17330348 ] D M Murali Krishna Reddy commented on YARN-10749: - Thanks [~zhuqi] for the prompt review.

[jira] [Updated] (YARN-10749) Can't remove all node labels after add node label without nodemanager port, broken by YARN-10647

2021-04-23 Thread D M Murali Krishna Reddy (Jira)
[ https://issues.apache.org/jira/browse/YARN-10749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] D M Murali Krishna Reddy updated YARN-10749: Attachment: YARN-10749.002.patch > Can't remove all node labels after add

[jira] [Commented] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17330225#comment-17330225 ] Qi Zhu commented on YARN-10743: --- Thanks [~ebadger] [~Jim_Brennan] for confirm. Updated the patch for

[jira] [Updated] (YARN-10743) Add a policy for not aggregating for containers which are killed because exceeding container log size limit.

2021-04-23 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10743: -- Attachment: YARN-10743.002.patch > Add a policy for not aggregating for containers which are killed because >