[jira] [Updated] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prabhu Joseph updated YARN-10160:
---------------------------------
    Attachment: YARN-10160-006.patch

> Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo
> ----------------------------------------------------------------------------------
>
>                 Key: YARN-10160
>                 URL: https://issues.apache.org/jira/browse/YARN-10160
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 3.3.0
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>         Attachments: Screen Shot 2020-02-25 at 9.06.52 PM.png, YARN-10160-001.patch, YARN-10160-002.patch, YARN-10160-003.patch, YARN-10160-004.patch, YARN-10160-005.patch, YARN-10160-006.patch
>
> Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo.
> {code}
> yarn.scheduler.capacity.<queue-path>.auto-create-child-queue.enabled
> yarn.scheduler.capacity.<queue-path>.leaf-queue-template.<property>
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
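For context, the two config families above live in capacity-scheduler.xml. A minimal, hedged sketch of what they look like in practice, assuming a hypothetical parent queue `root.parent` (the queue path and the template property chosen here are illustrative, not from the Jira):

```xml
<!-- Sketch only: enable auto-created leaf queues under a hypothetical
     parent queue "root.parent". -->
<property>
  <name>yarn.scheduler.capacity.root.parent.auto-create-child-queue.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- leaf-queue-template.* properties are applied to each auto-created
       child queue; "capacity" is one example of a templated property. -->
  <name>yarn.scheduler.capacity.root.parent.leaf-queue-template.capacity</name>
  <value>50</value>
</property>
```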
[jira] [Comment Edited] (YARN-10208) Add metric in CapacityScheduler for evaluating the time difference between node heartbeats
[ https://issues.apache.org/jira/browse/YARN-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066611#comment-17066611 ]

Pranjal Protim Borah edited comment on YARN-10208 at 3/26/20, 5:34 AM:
-----------------------------------------------------------------------

Additional metric to measure the time difference between node heartbeats.

was (Author: lapjarn): [~bibinchundatt] Jira for metric schedulerHeartBeatIntervalAverage

> Add metric in CapacityScheduler for evaluating the time difference between node heartbeats
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-10208
>                 URL: https://issues.apache.org/jira/browse/YARN-10208
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Pranjal Protim Borah
>            Priority: Trivial
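The proposed metric (an average interval between consecutive node heartbeats) can be sketched as below. This is an illustrative stand-alone class, not the actual YARN-10208 implementation; all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a heartbeat-interval metric: remember the last heartbeat
 * timestamp per node and accumulate the deltas between consecutive
 * heartbeats so an average interval can be reported.
 */
public class HeartbeatIntervalMetric {
    private final Map<String, Long> lastHeartbeatMs = new HashMap<>();
    private long totalIntervalMs = 0;
    private long intervalCount = 0;

    /** Record a heartbeat from a node at the given timestamp (ms). */
    public synchronized void recordHeartbeat(String nodeId, long nowMs) {
        Long prev = lastHeartbeatMs.put(nodeId, nowMs);
        if (prev != null) {            // first heartbeat has no interval
            totalIntervalMs += nowMs - prev;
            intervalCount++;
        }
    }

    /** Average interval across all recorded heartbeats; 0 if none yet. */
    public synchronized long averageIntervalMs() {
        return intervalCount == 0 ? 0 : totalIntervalMs / intervalCount;
    }

    public static void main(String[] args) {
        HeartbeatIntervalMetric m = new HeartbeatIntervalMetric();
        m.recordHeartbeat("node1", 0L);
        m.recordHeartbeat("node1", 1000L);
        m.recordHeartbeat("node1", 3000L);
        // intervals are 1000 ms and 2000 ms, so the average is 1500 ms
        System.out.println(m.averageIntervalMs());
    }
}
```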
[jira] [Updated] (YARN-10194) YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
[ https://issues.apache.org/jira/browse/YARN-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prabhu Joseph updated YARN-10194:
---------------------------------
    Attachment: YARN-10194-004.patch

> YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
>
>                 Key: YARN-10194
>                 URL: https://issues.apache.org/jira/browse/YARN-10194
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.3.0
>            Reporter: Akhil PB
>            Assignee: Prabhu Joseph
>            Priority: Critical
>         Attachments: YARN-10194-001.patch, YARN-10194-002.patch, YARN-10194-003.patch, YARN-10194-004.patch
>
> YARN RMWebServices /scheduler-conf/validate leaks ZK connections. The validation API creates a new CapacityScheduler and fails to close it after validation. Every CapacityScheduler#init opens a MutableCSConfigurationProvider, which opens a ZKConfigurationStore and creates a ZK connection.
>
> *ZK LOGS*
> {code}
> -03-12 16:45:51,881 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:52,449 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:52,710 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:52,876 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [4 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:53,068 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:53,391 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:54,008 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:54,287 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:54,483 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [4 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> {code}
>
> There is another bug in ZKConfigurationStore as well: it does not close its ZKCuratorManager.
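The leak pattern described above (a throwaway scheduler created for validation but never closed) has a standard remedy: release the resource on every path, success or failure. A minimal, hedged sketch under assumed names (ThrowawayScheduler stands in for the scheduler whose init opens the ZK connection; this is not the actual YARN-10194 patch):

```java
import java.io.Closeable;

/**
 * Sketch of the fix pattern: a validation endpoint that constructs a
 * short-lived scheduler must close it (releasing its ZK connection)
 * whether or not validation succeeds.
 */
public class ValidationSketch {

    /** Stand-in for a scheduler whose init would open a ZK connection. */
    static class ThrowawayScheduler implements Closeable {
        void validate(String conf) {
            if (conf == null) {
                throw new IllegalArgumentException("null configuration");
            }
        }
        @Override
        public void close() {
            // In the real scheduler this would release the ZK connection.
        }
    }

    /** try-with-resources guarantees close() even when validation throws. */
    static boolean validateConfig(String conf) {
        try (ThrowawayScheduler scheduler = new ThrowawayScheduler()) {
            scheduler.validate(conf);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

Without the try-with-resources (or an equivalent finally block), every failed validation strands one connection, which matches the "Too many connections ... max is 60" pattern in the ZK logs.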
[jira] [Commented] (YARN-10194) YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
[ https://issues.apache.org/jira/browse/YARN-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067364#comment-17067364 ]

Prabhu Joseph commented on YARN-10194:
--------------------------------------

[~sunilg] Have attached [^YARN-10194-004.patch] after rebasing. Thanks.

> YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
>
>                 Key: YARN-10194
>                 URL: https://issues.apache.org/jira/browse/YARN-10194
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.3.0
>            Reporter: Akhil PB
>            Assignee: Prabhu Joseph
>            Priority: Critical
>         Attachments: YARN-10194-001.patch, YARN-10194-002.patch, YARN-10194-003.patch, YARN-10194-004.patch
[jira] [Commented] (YARN-10194) YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
[ https://issues.apache.org/jira/browse/YARN-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067333#comment-17067333 ]

Sunil G commented on YARN-10194:
--------------------------------

[~prabhujoseph] pls rebase to trunk

> YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
>
>                 Key: YARN-10194
>                 URL: https://issues.apache.org/jira/browse/YARN-10194
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.3.0
>            Reporter: Akhil PB
>            Assignee: Prabhu Joseph
>            Priority: Critical
>         Attachments: YARN-10194-001.patch, YARN-10194-002.patch, YARN-10194-003.patch
[jira] [Commented] (YARN-10194) YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
[ https://issues.apache.org/jira/browse/YARN-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067318#comment-17067318 ]

Hadoop QA commented on YARN-10194:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 9s | YARN-10194 does not apply to trunk. Rebase required? Wrong branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-10194 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12997423/YARN-10194-003.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25750/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
>
>                 Key: YARN-10194
>                 URL: https://issues.apache.org/jira/browse/YARN-10194
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.3.0
>            Reporter: Akhil PB
>            Assignee: Prabhu Joseph
>            Priority: Critical
>         Attachments: YARN-10194-001.patch, YARN-10194-002.patch, YARN-10194-003.patch
[jira] [Created] (YARN-10211) [YARN UI2] Queue selection is not highlighted on first time in queues page
Akhil PB created YARN-10211:
-------------------------------

             Summary: [YARN UI2] Queue selection is not highlighted on first time in queues page
                 Key: YARN-10211
                 URL: https://issues.apache.org/jira/browse/YARN-10211
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Akhil PB
            Assignee: Akhil PB
[jira] [Commented] (YARN-10200) Add number of containers to RMAppManager summary
[ https://issues.apache.org/jira/browse/YARN-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066941#comment-17066941 ]

Hudson commented on YARN-10200:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18089 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18089/])
YARN-10200. Add number of containers to RMAppManager summary (jhung: rev 6ce189c62132706d9aaee5abf020ae4dc783ba26)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestCombinedSystemMetricsPublisher.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestAppPage.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppMetrics.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisherForV2.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptMetrics.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/MockAsm.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/proto/yarn_server_resourcemanager_recovery.proto
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebAppFairScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/impl/pb/ApplicationAttemptStateDataPBImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/ApplicationAttemptStateData.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestContainerResourceUsage.java

> Add number of containers to RMAppManager summary
>
>                 Key: YARN-10200
>                 URL: https://issues.apache.org/jira/browse/YARN-10200
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>            Priority: Major
>             Fix For: 3.3.0, 3.2.2, 3.1.4, 2.10.1
>
>         Attachments: YARN-10200.001.patch, YARN-10200.002.patch, YARN-10200.003.patch
>
> It would be useful to persist this so we can track containers processed by the RM.
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066894#comment-17066894 ]

Szilard Nemeth commented on YARN-10043:
---------------------------------------

Hi [~maniraj...@gmail.com]! Sorry for the late response, I was busy with other things in the last couple of weeks. I can take a look at this tomorrow. Next time you have anything important like this, please reach out to other committers as well to get feedback more quickly :)

> FairOrderingPolicy Improvements
>
>                 Key: YARN-10043
>                 URL: https://issues.apache.org/jira/browse/YARN-10043
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Manikandan R
>            Assignee: Manikandan R
>            Priority: Major
>         Attachments: YARN-10043.001.patch, YARN-10043.002.patch, YARN-10043.003.patch, YARN-10043.004.patch
>
> FairOrderingPolicy can be improved by using some of the (relevant) approaches implemented in FairSharePolicy of FS. This improvement is significant in the FS to CS migration context.
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066864#comment-17066864 ]

Manikandan R commented on YARN-10043:
-------------------------------------

[~snemeth] I am waiting on this. Can we please take it forward?

> FairOrderingPolicy Improvements
>
>                 Key: YARN-10043
>                 URL: https://issues.apache.org/jira/browse/YARN-10043
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Manikandan R
>            Assignee: Manikandan R
>            Priority: Major
>         Attachments: YARN-10043.001.patch, YARN-10043.002.patch, YARN-10043.003.patch, YARN-10043.004.patch
[jira] [Commented] (YARN-10154) CS Dynamic Queues cannot be configured with absolute resources
[ https://issues.apache.org/jira/browse/YARN-10154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066859#comment-17066859 ]

Manikandan R commented on YARN-10154:
-------------------------------------

[~sunilg] Have you had a chance to review the patch? Thank you.

> CS Dynamic Queues cannot be configured with absolute resources
>
>                 Key: YARN-10154
>                 URL: https://issues.apache.org/jira/browse/YARN-10154
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.3
>            Reporter: Sunil G
>            Assignee: Manikandan R
>            Priority: Major
>         Attachments: YARN-10154.001.patch, YARN-10154.002.patch
>
> In CS, a ManagedParent queue and its template cannot take an absolute resource value like [memory=8192,vcores=8].
> This Jira is to track and improve the configuration reading module of DynamicQueue to support absolute resource values.
[jira] [Commented] (YARN-10003) YarnConfigurationStore#checkVersion throws exception that belongs to RMStateStore
[ https://issues.apache.org/jira/browse/YARN-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066762#comment-17066762 ]

Hadoop QA commented on YARN-10003:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 9m 3s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || branch-3.2 Compile Tests ||
| +1 | mvninstall | 24m 55s | branch-3.2 passed |
| +1 | compile | 0m 38s | branch-3.2 passed |
| +1 | checkstyle | 0m 32s | branch-3.2 passed |
| +1 | mvnsite | 0m 43s | branch-3.2 passed |
| +1 | shadedclient | 13m 40s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 8s | branch-3.2 passed |
| +1 | javadoc | 0m 31s | branch-3.2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 44s | the patch passed |
| +1 | compile | 0m 35s | the patch passed |
| +1 | javac | 0m 35s | the patch passed |
| +1 | checkstyle | 0m 27s | the patch passed |
| +1 | mvnsite | 0m 37s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 35s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 14s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 308m 39s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 32s | The patch does not generate ASF License warnings. |
| | | 377m 56s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestNodeBlacklistingOnAMFailures |
| | hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
| | hadoop.yarn.server.resourcemanager.TestApplicationACLs |
| | hadoop.yarn.server.resourcemanager.TestWorkPreservingUnmanagedAM |
| | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector |
| | hadoop.yarn.server.resourcemanager.placement.TestPlacementManager |
| | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:0f25cbbb251 |
| JIRA Issue | YARN-10003 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12997642/YARN-10003.branch-3.2.POC003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux a696b1f8944b 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (YARN-10210) Add a RMFailoverProxyProvider that does DNS resolution on failover
[ https://issues.apache.org/jira/browse/YARN-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066746#comment-17066746 ]

Íñigo Goiri commented on YARN-10210:
------------------------------------

HADOOP-16938 is already merged. I also moved this to YARN since the change is isolated there. [~roliu] do you mind rebasing the PR?

> Add a RMFailoverProxyProvider that does DNS resolution on failover
>
>                 Key: YARN-10210
>                 URL: https://issues.apache.org/jira/browse/YARN-10210
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 3.1.2
>            Reporter: Roger Liu
>            Assignee: Roger Liu
>            Priority: Major
>
> In Kubernetes, a node may go down and then come back later with a different IP address. YARN clients which are already running will be unable to rediscover the node after it comes back up because they cached the original IP address. This is problematic for cases such as Spark HA on Kubernetes: the node containing the resource manager may go down and come back up, meaning existing node managers must then also be restarted.
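The core idea behind a DNS-resolving failover provider is that a cached `InetSocketAddress` keeps its originally resolved IP forever, while constructing a fresh one performs a new DNS lookup. A minimal sketch of that behavior, with hypothetical class name, hostname, and port (this is not the actual YARN-10210 proxy provider):

```java
import java.net.InetSocketAddress;

/**
 * Sketch: keep only the hostname and port, and build a fresh
 * InetSocketAddress on every failover so that a host whose IP changed
 * (e.g. an RM pod rescheduled in Kubernetes) is re-resolved.
 */
public class ReResolvingAddress {
    private final String host;
    private final int port;

    public ReResolvingAddress(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Constructing a new InetSocketAddress triggers a fresh DNS lookup. */
    public InetSocketAddress resolve() {
        InetSocketAddress addr = new InetSocketAddress(host, port);
        if (addr.isUnresolved()) {
            throw new IllegalStateException("Cannot resolve host: " + host);
        }
        return addr;
    }

    public static void main(String[] args) {
        // "localhost" resolves without network access; 8032 is the
        // conventional RM port, used here purely as an example.
        InetSocketAddress a = new ReResolvingAddress("localhost", 8032).resolve();
        System.out.println(a.getAddress().getHostAddress() + ":" + a.getPort());
    }
}
```

A failover proxy provider built on this pattern would call `resolve()` each time it switches RMs, instead of reusing the address cached at client start-up.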
[jira] [Updated] (YARN-10210) Add a RMFailoverProxyProvider that does DNS resolution on failover
[ https://issues.apache.org/jira/browse/YARN-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Íñigo Goiri updated YARN-10210:
---
    Summary: Add a RMFailoverProxyProvider that does DNS resolution on failover (was: Cached DNS name resolution error)

> Add a RMFailoverProxyProvider that does DNS resolution on failover
>
>                 Key: YARN-10210
>                 URL: https://issues.apache.org/jira/browse/YARN-10210
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 3.1.2
>            Reporter: Roger Liu
>            Assignee: Roger Liu
>            Priority: Major
>
> In Kubernetes, a node may go down and then come back later with a different IP address. YARN clients which are already running will be unable to rediscover the node after it comes back up because they cached the original IP address. This is problematic for cases such as Spark HA on Kubernetes: the node containing the resource manager may go down and come back up, meaning existing node managers must then also be restarted.
[jira] [Updated] (YARN-10210) Cached DNS name resolution error
[ https://issues.apache.org/jira/browse/YARN-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Íñigo Goiri updated YARN-10210:
---
    Description: In Kubernetes, a node may go down and then come back later with a different IP address. YARN clients which are already running will be unable to rediscover the node after it comes back up due to caching the original IP address. This is problematic for cases such as Spark HA on Kubernetes, as the node containing the resource manager may go down and come back up, meaning existing node managers must then also be restarted.
(was: In Kubernetes, the a node may go down and then come back later with a different IP address. Yarn clients which are already running will be unable to rediscover the node after it comes back up due to caching the original IP address. This is problematic for cases such as Spark HA on Kubernetes, as the node containing the resource manager may go down and come back up, meaning existing node managers must then also be restarted.)

> Cached DNS name resolution error
>
>                 Key: YARN-10210
>                 URL: https://issues.apache.org/jira/browse/YARN-10210
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 3.1.2
>            Reporter: Roger Liu
>            Assignee: Roger Liu
>            Priority: Major
>
> In Kubernetes, a node may go down and then come back later with a different IP address. YARN clients which are already running will be unable to rediscover the node after it comes back up because they cached the original IP address. This is problematic for cases such as Spark HA on Kubernetes, as the node containing the resource manager may go down and come back up, meaning existing node managers must then also be restarted.
[jira] [Assigned] (YARN-10210) Cached DNS name resolution error
[ https://issues.apache.org/jira/browse/YARN-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri reassigned YARN-10210: -- Assignee: Roger Liu > Cached DNS name resolution error > > > Key: YARN-10210 > URL: https://issues.apache.org/jira/browse/YARN-10210 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.1.2 >Reporter: Roger Liu >Assignee: Roger Liu >Priority: Major > > In Kubernetes, a node may go down and then come back later with a > different IP address. YARN clients which are already running will be unable > to rediscover the node after it comes back up due to caching the original IP > address. This is problematic for cases such as Spark HA on Kubernetes, as the > node containing the resource manager may go down and come back up, meaning > existing node managers must then also be restarted. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-10210) Cached DNS name resolution error
[ https://issues.apache.org/jira/browse/YARN-10210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri reassigned YARN-10210: -- Key: YARN-10210 (was: HADOOP-16543) Affects Version/s: (was: 3.1.2) 3.1.2 Assignee: (was: Roger Liu) Issue Type: Improvement (was: Bug) Project: Hadoop YARN (was: Hadoop Common) > Cached DNS name resolution error > > > Key: YARN-10210 > URL: https://issues.apache.org/jira/browse/YARN-10210 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.1.2 >Reporter: Roger Liu >Priority: Major > > In Kubernetes, a node may go down and then come back later with a > different IP address. YARN clients which are already running will be unable > to rediscover the node after it comes back up due to caching the original IP > address. This is problematic for cases such as Spark HA on Kubernetes, as the > node containing the resource manager may go down and come back up, meaning > existing node managers must then also be restarted. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
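The failure mode behind YARN-10210 comes down to caching the result of a single DNS lookup. The class below is a minimal, hypothetical sketch of the proposed direction, not the actual RMFailoverProxyProvider API: re-resolve the stable hostname on every failover so a ResourceManager pod that restarts with a new IP address is rediscovered.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch of the idea behind YARN-10210 (not the real
// RMFailoverProxyProvider API): rather than holding on to the InetAddress
// resolved at startup, look the RM hostname up again on every failover so a
// pod that restarts with a new IP address is rediscovered.
public class ReResolvingEndpoint {
    private final String host;   // stable DNS name of the ResourceManager

    public ReResolvingEndpoint(String host) {
        this.host = host;
    }

    // Called from the failover path: a fresh lookup, never a cached address.
    public InetAddress resolveForFailover() throws UnknownHostException {
        return InetAddress.getByName(host);
    }

    // Convenience helper: true when the freshly resolved address is loopback.
    public static boolean resolvesToLoopback(String host) {
        try {
            return new ReResolvingEndpoint(host).resolveForFailover().isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "localhost" always maps to a loopback address
        System.out.println(resolvesToLoopback("localhost")); // true
    }
}
```

Note that the JVM itself caches successful lookups according to the `networkaddress.cache.ttl` security property, so in practice a re-resolving proxy provider also depends on that TTL being finite.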
[jira] [Created] (YARN-10209) DistributedShell should initialize TimelineClient conditionally
Benjamin Teke created YARN-10209: Summary: DistributedShell should initialize TimelineClient conditionally Key: YARN-10209 URL: https://issues.apache.org/jira/browse/YARN-10209 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Benjamin Teke YarnConfiguration was changed along with the introduction of newer Timeline Service versions to include configuration describing the version in use. In Hadoop 2.6.0 the distributed shell instantiates the Timeline Client whether or not it is enabled in the configuration. Running this distributed shell on newer Hadoop versions (where the new Timeline Service is available) causes an exception, because the bundled YarnConfiguration doesn't have the necessary version configuration property. Making the Timeline Client initialization conditional would let the distributed shell run at least with the Timeline Service disabled. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
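The guard YARN-10209 asks for can be sketched as follows. `yarn.timeline-service.enabled` is the standard flag for the Timeline Service; everything else here is illustrative, with a plain map standing in for YarnConfiguration and a string standing in for the client, so the class and method names are assumptions rather than the actual DistributedShell code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the fix direction for YARN-10209 (illustrative names, not the
// actual DistributedShell code): only create a timeline client when the
// service is enabled, so a configuration lacking the newer version
// properties is never consulted.
public class ConditionalTimelineInit {
    static final String TIMELINE_ENABLED = "yarn.timeline-service.enabled";

    // Stand-in for YarnConfiguration: a plain key/value map, disabled by default.
    static boolean timelineEnabled(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(TIMELINE_ENABLED, "false"));
    }

    // Returns a client description, or null when the service is disabled.
    static String maybeCreateTimelineClient(Map<String, String> conf) {
        if (!timelineEnabled(conf)) {
            return null; // skip initialization entirely: no version lookup happens
        }
        return "TimelineClient(version=" + conf.get("yarn.timeline-service.version") + ")";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(maybeCreateTimelineClient(conf)); // null: disabled by default
        conf.put(TIMELINE_ENABLED, "true");
        conf.put("yarn.timeline-service.version", "2.0");
        System.out.println(maybeCreateTimelineClient(conf)); // TimelineClient(version=2.0)
    }
}
```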
[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066628#comment-17066628 ] Hudson commented on YARN-9879: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18085 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/18085/]) YARN-9879. Allow multiple leaf queues with the same name in (sunilg: rev cdb2107066a2d8557270888c0a9a75f29a6853bf) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesSchedulerActivities.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesForCSWithPartitions.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueStore.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerQueueMappingFactory.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/QueueMappingEntity.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestAbsoluteResourceConfiguration.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerPerf.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCSQueueStore.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/FifoIntraQueuePreemptionPlugin.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/ActivitiesManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/QueuePlacementRuleUtils.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestReservationSystem.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/WorkflowPriorityMappingsManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestQueueParsing.java * (edit)
[jira] [Updated] (YARN-9879) Allow multiple leaf queues with the same name in CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-9879: -- Summary: Allow multiple leaf queues with the same name in CapacityScheduler (was: Allow multiple leaf queues with the same name in CS) > Allow multiple leaf queues with the same name in CapacityScheduler > -- > > Key: YARN-9879 > URL: https://issues.apache.org/jira/browse/YARN-9879 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: fs2cs > Attachments: CSQueue.getQueueUsage.txt, DesignDoc_v1.pdf, > YARN-9879.014.patch, YARN-9879.015.patch, YARN-9879.015.patch, > YARN-9879.POC001.patch, YARN-9879.POC002.patch, YARN-9879.POC003.patch, > YARN-9879.POC004.patch, YARN-9879.POC005.patch, YARN-9879.POC006.patch, > YARN-9879.POC007.patch, YARN-9879.POC008.patch, YARN-9879.POC009.patch, > YARN-9879.POC010.patch, YARN-9879.POC011.patch, YARN-9879.POC012.patch, > YARN-9879.POC013.patch > > > Currently the leaf queue's name must be unique regardless of its position in > the queue hierarchy. > A design doc and first proposal are being made; I'll attach them as soon as > they're done. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
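The constraint YARN-9879 lifts is that a leaf queue's short name had to be globally unique. One way to allow duplicates, sketched below with hypothetical names rather than the actual CSQueueStore API, is to key queues by full path and let a short name resolve only while it is unambiguous.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch (names hypothetical, not the actual CSQueueStore API)
// of how leaf queues with the same short name can coexist: queues are keyed
// by full path, and a short name resolves only while it is unambiguous.
public class QueueStoreSketch {
    private final Set<String> fullPaths = new HashSet<>();
    private final Map<String, String> byShortName = new HashMap<>();
    private static final String AMBIGUOUS = "\u0000AMBIGUOUS";

    public void add(String fullPath) {
        fullPaths.add(fullPath);
        String shortName = fullPath.substring(fullPath.lastIndexOf('.') + 1);
        // A second queue with the same short name makes short-name lookups ambiguous.
        byShortName.merge(shortName, fullPath, (a, b) -> AMBIGUOUS);
    }

    // Full-path lookups always work; short-name lookups only when unique.
    public String get(String name) {
        if (fullPaths.contains(name)) {
            return name;
        }
        String hit = byShortName.get(name);
        return AMBIGUOUS.equals(hit) ? null : hit;
    }
}
```

With this shape, `get("alpha")` returns the single queue named alpha until a second one is added, after which callers must use the full path such as `root.dev.alpha`.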
[jira] [Commented] (YARN-10200) Add number of containers to RMAppManager summary
[ https://issues.apache.org/jira/browse/YARN-10200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066615#comment-17066615 ] Adam Antal commented on YARN-10200: --- Reviewed the patch, LGTM (non-binding). > Add number of containers to RMAppManager summary > > > Key: YARN-10200 > URL: https://issues.apache.org/jira/browse/YARN-10200 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Jonathan Hung >Assignee: Jonathan Hung >Priority: Major > Attachments: YARN-10200.001.patch, YARN-10200.002.patch, > YARN-10200.003.patch > > > It would be useful to persist this so we can track containers processed by RM. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10208) Add metric in CapacityScheduler for evaluating the time difference between node heartbeats
[ https://issues.apache.org/jira/browse/YARN-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066611#comment-17066611 ] Pranjal Protim Borah commented on YARN-10208: - [~bibinchundatt] Jira for metric schedulerHeartBeatIntervalAverage > Add metric in CapacityScheduler for evaluating the time difference between > node heartbeats > -- > > Key: YARN-10208 > URL: https://issues.apache.org/jira/browse/YARN-10208 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Pranjal Protim Borah >Priority: Trivial > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10208) Add metric in CapacityScheduler for evaluating the time difference between node heartbeats
[ https://issues.apache.org/jira/browse/YARN-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pranjal Protim Borah updated YARN-10208: Summary: Add metric in CapacityScheduler for evaluating the time difference between node heartbeats (was: Add CapacityScheduler metrics for evaluating the time difference between node heartbeats) > Add metric in CapacityScheduler for evaluating the time difference between > node heartbeats > -- > > Key: YARN-10208 > URL: https://issues.apache.org/jira/browse/YARN-10208 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Pranjal Protim Borah >Priority: Trivial > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10208) Add CapacityScheduler metrics for evaluating the time difference between node heartbeats
Pranjal Protim Borah created YARN-10208: --- Summary: Add CapacityScheduler metrics for evaluating the time difference between node heartbeats Key: YARN-10208 URL: https://issues.apache.org/jira/browse/YARN-10208 Project: Hadoop YARN Issue Type: Improvement Reporter: Pranjal Protim Borah -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
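The metric proposed in YARN-10208 (referred to as schedulerHeartBeatIntervalAverage in the comments) boils down to averaging the time deltas between successive node heartbeats. The class below is a self-contained sketch of that bookkeeping; the names are hypothetical, not the patch's API.

```java
// Illustrative sketch of the proposed YARN-10208 metric (names are
// hypothetical, not the patch's API): track the time between successive
// heartbeats and expose a running average a scheduler could publish.
public class HeartbeatIntervalTracker {
    private long lastHeartbeatMs = -1;
    private long totalIntervalMs = 0;
    private long intervalCount = 0;

    // Record a heartbeat arriving at the given timestamp (milliseconds).
    public void onHeartbeat(long nowMs) {
        if (lastHeartbeatMs >= 0) {
            totalIntervalMs += nowMs - lastHeartbeatMs;
            intervalCount++;
        }
        lastHeartbeatMs = nowMs;
    }

    // Average interval between heartbeats seen so far; 0 if fewer than two.
    public long averageIntervalMs() {
        return intervalCount == 0 ? 0 : totalIntervalMs / intervalCount;
    }

    public static void main(String[] args) {
        HeartbeatIntervalTracker t = new HeartbeatIntervalTracker();
        t.onHeartbeat(1000);
        t.onHeartbeat(2000);
        t.onHeartbeat(4000);
        System.out.println(t.averageIntervalMs()); // (1000 + 2000) / 2 = 1500
    }
}
```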
[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS
[ https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066607#comment-17066607 ] Sunil G commented on YARN-9879: --- Thanks [~shuzirra]. Let's get this in now. +1 to the latest patch. > Allow multiple leaf queues with the same name in CS > --- > > Key: YARN-9879 > URL: https://issues.apache.org/jira/browse/YARN-9879 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: fs2cs > Attachments: CSQueue.getQueueUsage.txt, DesignDoc_v1.pdf, > YARN-9879.014.patch, YARN-9879.015.patch, YARN-9879.015.patch, > YARN-9879.POC001.patch, YARN-9879.POC002.patch, YARN-9879.POC003.patch, > YARN-9879.POC004.patch, YARN-9879.POC005.patch, YARN-9879.POC006.patch, > YARN-9879.POC007.patch, YARN-9879.POC008.patch, YARN-9879.POC009.patch, > YARN-9879.POC010.patch, YARN-9879.POC011.patch, YARN-9879.POC012.patch, > YARN-9879.POC013.patch > > > Currently the leaf queue's name must be unique regardless of its position in > the queue hierarchy. > A design doc and first proposal are being made; I'll attach them as soon as > they're done. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS
[ https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066561#comment-17066561 ] Gergely Pollak commented on YARN-9879: -- [~sunilg] yes, that is correct, it is unrelated; SLS tests fail quite often for no real reason. To be sure, I executed the test case a few times manually and it passed each time, so this one seems to be flaky. > Allow multiple leaf queues with the same name in CS > --- > > Key: YARN-9879 > URL: https://issues.apache.org/jira/browse/YARN-9879 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: fs2cs > Attachments: CSQueue.getQueueUsage.txt, DesignDoc_v1.pdf, > YARN-9879.014.patch, YARN-9879.015.patch, YARN-9879.015.patch, > YARN-9879.POC001.patch, YARN-9879.POC002.patch, YARN-9879.POC003.patch, > YARN-9879.POC004.patch, YARN-9879.POC005.patch, YARN-9879.POC006.patch, > YARN-9879.POC007.patch, YARN-9879.POC008.patch, YARN-9879.POC009.patch, > YARN-9879.POC010.patch, YARN-9879.POC011.patch, YARN-9879.POC012.patch, > YARN-9879.POC013.patch > > > Currently the leaf queue's name must be unique regardless of its position in > the queue hierarchy. > A design doc and first proposal are being made; I'll attach them as soon as > they're done. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10003) YarnConfigurationStore#checkVersion throws exception that belongs to RMStateStore
[ https://issues.apache.org/jira/browse/YARN-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Teke updated YARN-10003: - Attachment: YARN-10003.branch-3.2.POC003.patch > YarnConfigurationStore#checkVersion throws exception that belongs to > RMStateStore > - > > Key: YARN-10003 > URL: https://issues.apache.org/jira/browse/YARN-10003 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Benjamin Teke >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-10003.001.patch, YARN-10003.002.patch, > YARN-10003.003.patch, YARN-10003.004.patch, YARN-10003.005.patch, > YARN-10003.branch-3.2.001.patch, YARN-10003.branch-3.2.POC001.patch, > YARN-10003.branch-3.2.POC002.patch, YARN-10003.branch-3.2.POC003.patch > > > RMStateVersionIncompatibleException is thrown from method "checkVersion". > Moreover, there's a TODO here saying this method is copied from RMStateStore. > We should revise this method a bit. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9997) Code cleanup in ZKConfigurationStore
[ https://issues.apache.org/jira/browse/YARN-9997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066471#comment-17066471 ] Andras Gyori commented on YARN-9997: The backport is ready to be merged; however, I am waiting for an update on [YARN-10002|https://issues.apache.org/jira/browse/YARN-10002] to see whether it can be backported as well. If so, YARN-10002 should be merged first to avoid conflicts. > Code cleanup in ZKConfigurationStore > > > Key: YARN-9997 > URL: https://issues.apache.org/jira/browse/YARN-9997 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Andras Gyori >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-9997.001.patch, YARN-9997.002.patch, > YARN-9997.003.patch, YARN-9997.004.patch, YARN-9997.005.patch, > YARN-9997.006.patch > > > Many things can be improved: > * znodeParentPath could be a local variable > * zkManager could be private, VisibleForTesting annotation is not needed > anymore > * Do something with unchecked casts > * zkManager.safeSetData calls almost all take the same set of parameters: > simplify this > * Extract zkManager calls to their own methods: They are repeated > * Remove TODOs -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9354) Resources should be created with ResourceTypesTestHelper instead of TestUtils
[ https://issues.apache.org/jira/browse/YARN-9354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066468#comment-17066468 ] Andras Gyori commented on YARN-9354: A new patch has been submitted for the branch-3.2 backport; the failing unit tests are unrelated. > Resources should be created with ResourceTypesTestHelper instead of TestUtils > - > > Key: YARN-9354 > URL: https://issues.apache.org/jira/browse/YARN-9354 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: Andras Gyori >Priority: Trivial > Labels: newbie, newbie++ > Fix For: 3.3.0 > > Attachments: YARN-9354.001.patch, YARN-9354.002.patch, > YARN-9354.003.patch, YARN-9354.004.patch, YARN-9354.branch-3.2.001.patch, > YARN-9354.branch-3.2.002.patch, YARN-9354.branch-3.2.003.patch > > > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestUtils#createResource > has a very similar, though not identical, implementation to > org.apache.hadoop.yarn.resourcetypes.ResourceTypesTestHelper#newResource. > Since these two methods essentially do the same thing and > ResourceTypesTestHelper is newer and more widely used, TestUtils#createResource > should be replaced with ResourceTypesTestHelper#newResource in all > occurrences. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10207) CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated logs on the JobHistoryServer Web UI
[ https://issues.apache.org/jira/browse/YARN-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated YARN-10207: --- Description: File descriptor leaks are observed coming from the JobHistoryServer process while it tries to render a "corrupted" aggregated log on the JHS Web UI. Issue reproduced using the following steps: # Ran a sample Hadoop MR Pi job, it had the id - application_1582676649923_0026. # Copied an aggregated log file from HDFS to local FS: {code} hdfs dfs -get /tmp/logs/systest/logs/application_1582676649923_0026/_8041 {code} # Updated the TFile metadata at the bottom of this file with some junk to corrupt the file : *Before:* {code} ^@^GVERSION*(^@_1582676649923_0026_01_03^F^Dnone^A^Pª5²ª5²^C^Qdata:BCFile.index^Dnoneª5þ^M^M^Pdata:TFile.index^Dnoneª5È66^Odata:TFile.meta^Dnoneª5Â^F^F^@^@^@^@^@^B6^K^@^A^@^@Ñ^QÓh<91>µ×¶9ßA@<92>ºáP {code} *After:* {code} ^@^GVERSION*(^@_1582676649923_0026_01_03^F^Dnone^A^Pª5²ª5²^C^Qdata:BCFile.index^Dnoneª5þ^M^M^Pdata:TFile.index^Dnoneª5È66^Odata:TFile.meta^Dnoneª5Â^F^F^@^@^@^@^@^B6^K^@^A^@^@Ñ^QÓh<91>µ×¶9ßA@<92>ºáPblah {code} Notice "blah" (junk) added at the very end. # Remove the existing aggregated log file that will need to be replaced by our modified copy from step 3 (as otherwise HDFS will prevent it from placing the file with the same name as it already exists): {code} hdfs dfs -rm -r -f /tmp/logs/systest/logs/application_1582676649923_0026/_8041 {code} # Upload the corrupted aggregated file back to HDFS: {code} hdfs dfs -put _8041 /tmp/logs/systest/logs/application_1582676649923_0026 {code} # Visit HistoryServer Web UI # Click on job_1582676649923_0026 # Click on "logs" link against the AM (assuming the AM ran on nm_hostname) # Review the JHS logs, following exception will be seen: {code} 2020-03-24 20:03:48,484 ERROR org.apache.hadoop.yarn.webapp.View: Error getting logs for job_1582676649923_0026 java.io.IOException: Not a valid BCFile. 
at org.apache.hadoop.io.file.tfile.BCFile$Magic.readAndVerify(BCFile.java:927) at org.apache.hadoop.io.file.tfile.BCFile$Reader.(BCFile.java:628) at org.apache.hadoop.io.file.tfile.TFile$Reader.(TFile.java:804) at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.(AggregatedLogFormat.java:588) at org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.TFileAggregatedLogsBlock.render(TFileAggregatedLogsBlock.java:111) at org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.LogAggregationTFileController.renderAggregatedLogsBlock(LogAggregationTFileController.java:341) at org.apache.hadoop.yarn.webapp.log.AggregatedLogsBlock.render(AggregatedLogsBlock.java:117) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) at org.apache.hadoop.yarn.webapp.View.render(View.java:235) at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) at org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117) at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$TD.__(Hamlet.java:848) at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212) at org.apache.hadoop.mapreduce.v2.hs.webapp.HsController.logs(HsController.java:202) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:162) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287) at 
com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) at
[jira] [Updated] (YARN-10207) CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated logs on the JobHistoryServer Web UI
[ https://issues.apache.org/jira/browse/YARN-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated YARN-10207: --- Description: Issue reproduced using the following steps: # Ran a sample Hadoop MR Pi job, it had the id - application_1582676649923_0026. # Copied an aggregated log file from HDFS to local FS: {code} hdfs dfs -get /tmp/logs/systest/logs/application_1582676649923_0026/_8041 {code} # Updated the TFile metadata at the bottom of this file with some junk to corrupt the file : *Before:* {code} ^@^GVERSION*(^@_1582676649923_0026_01_03^F^Dnone^A^Pª5²ª5²^C^Qdata:BCFile.index^Dnoneª5þ^M^M^Pdata:TFile.index^Dnoneª5È66^Odata:TFile.meta^Dnoneª5Â^F^F^@^@^@^@^@^B6^K^@^A^@^@Ñ^QÓh<91>µ×¶9ßA@<92>ºáP {code} *After:* {code} ^@^GVERSION*(^@_1582676649923_0026_01_03^F^Dnone^A^Pª5²ª5²^C^Qdata:BCFile.index^Dnoneª5þ^M^M^Pdata:TFile.index^Dnoneª5È66^Odata:TFile.meta^Dnoneª5Â^F^F^@^@^@^@^@^B6^K^@^A^@^@Ñ^QÓh<91>µ×¶9ßA@<92>ºáPblah {code} Notice "blah" (junk) added at the very end. # Remove the existing aggregated log file that will need to be replaced by our modified copy from step 3 (as otherwise HDFS will prevent it from placing the file with the same name as it already exists): {code} hdfs dfs -rm -r -f /tmp/logs/systest/logs/application_1582676649923_0026/_8041 {code} # Upload the corrupted aggregated file back to HDFS: {code} hdfs dfs -put _8041 /tmp/logs/systest/logs/application_1582676649923_0026 {code} # Visit HistoryServer Web UI # Click on job_1582676649923_0026 # Click on "logs" link against the AM (assuming the AM ran on nm_hostname) # Review the JHS logs, following exception will be seen: {code} 2020-03-24 20:03:48,484 ERROR org.apache.hadoop.yarn.webapp.View: Error getting logs for job_1582676649923_0026 java.io.IOException: Not a valid BCFile. 
at org.apache.hadoop.io.file.tfile.BCFile$Magic.readAndVerify(BCFile.java:927) at org.apache.hadoop.io.file.tfile.BCFile$Reader.(BCFile.java:628) at org.apache.hadoop.io.file.tfile.TFile$Reader.(TFile.java:804) at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.(AggregatedLogFormat.java:588) at org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.TFileAggregatedLogsBlock.render(TFileAggregatedLogsBlock.java:111) at org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.LogAggregationTFileController.renderAggregatedLogsBlock(LogAggregationTFileController.java:341) at org.apache.hadoop.yarn.webapp.log.AggregatedLogsBlock.render(AggregatedLogsBlock.java:117) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) at org.apache.hadoop.yarn.webapp.View.render(View.java:235) at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) at org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117) at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$TD.__(Hamlet.java:848) at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212) at org.apache.hadoop.mapreduce.v2.hs.webapp.HsController.logs(HsController.java:202) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:162) at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287) at 
com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:941) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:875) at
[jira] [Updated] (YARN-10207) CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated logs on the JobHistoryServer Web UI
[ https://issues.apache.org/jira/browse/YARN-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated YARN-10207:
---
Description:

Issue reproduced using the following steps:
# Ran a sample Hadoop MR Pi job; it had the id application_1582676649923_0026.
# Copied an aggregated log file from HDFS to the local FS:
{code}
hdfs dfs -get /tmp/logs/systest/logs/application_1582676649923_0026/_8041
{code}
# Updated the TFile metadata at the bottom of this file with some junk to corrupt the file:
*Before:*
{code}
^@^GVERSION*(^@_1582676649923_0026_01_03^F^Dnone^A^Pª5²ª5²^C^Qdata:BCFile.index^Dnoneª5þ^M^M^Pdata:TFile.index^Dnoneª5È66^Odata:TFile.meta^Dnoneª5Â^F^F^@^@^@^@^@^B6^K^@^A^@^@Ñ^QÓh<91>µ×¶9ßA@<92>ºáP
{code}
*After:*
{code}
^@^GVERSION*(^@_1582676649923_0026_01_03^F^Dnone^A^Pª5²ª5²^C^Qdata:BCFile.index^Dnoneª5þ^M^M^Pdata:TFile.index^Dnoneª5È66^Odata:TFile.meta^Dnoneª5Â^F^F^@^@^@^@^@^B6^K^@^A^@^@Ñ^QÓh<91>µ×¶9ßA@<92>ºáPblah
{code}
Notice "blah" (junk) added at the very end.
# Remove the existing aggregated log file from HDFS so it can be replaced by the modified copy from step 3 (otherwise HDFS will refuse to place a file with the same name as one that already exists):
{code}
hdfs dfs -rm -r -f /tmp/logs/systest/logs/application_1582676649923_0026/_8041
{code}
# Upload the corrupted aggregated file back to HDFS:
{code}
hdfs dfs -put _8041 /tmp/logs/systest/logs/application_1582676649923_0026
{code}
# Visit the HistoryServer Web UI.
# Click on job_1582676649923_0026.
# Click on the "logs" link against the AM (assuming the AM ran on nm_hostname).
# Review the JHS logs; the following exception will be seen:
{code}
2020-03-24 20:03:48,484 ERROR org.apache.hadoop.yarn.webapp.View: Error getting logs for job_1582676649923_0026
java.io.IOException: Not a valid BCFile.
	at org.apache.hadoop.io.file.tfile.BCFile$Magic.readAndVerify(BCFile.java:927)
	at org.apache.hadoop.io.file.tfile.BCFile$Reader.<init>(BCFile.java:628)
	at org.apache.hadoop.io.file.tfile.TFile$Reader.<init>(TFile.java:804)
	at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.<init>(AggregatedLogFormat.java:588)
	at org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.TFileAggregatedLogsBlock.render(TFileAggregatedLogsBlock.java:111)
	at org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.LogAggregationTFileController.renderAggregatedLogsBlock(LogAggregationTFileController.java:341)
	at org.apache.hadoop.yarn.webapp.log.AggregatedLogsBlock.render(AggregatedLogsBlock.java:117)
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
	at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
	at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
	at org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117)
	at org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$TD.__(Hamlet.java:848)
	at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
	at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
	at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
	at org.apache.hadoop.mapreduce.v2.hs.webapp.HsController.logs(HsController.java:202)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:162)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:287)
	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:277)
	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:182)
	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
{code}
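The corruption in step 3 can be reproduced without a hex editor by simply appending junk bytes to the downloaded file. A minimal local sketch, assuming only that any trailing bytes break the TFile trailer verification (the "blah" suffix matches the report; the sample payload and temp file are illustrative stand-ins, not the real aggregated log):

```shell
# Sketch of step 3: append junk so the TFile trailer no longer verifies.
corrupt_log() {
  printf 'blah' >> "$1"   # same 4-byte junk suffix used in the report
}

tmp=$(mktemp)
printf 'VERSION-payload' > "$tmp"   # stand-in for the real aggregated log bytes
before=$(wc -c < "$tmp")
corrupt_log "$tmp"
after=$(wc -c < "$tmp")
echo "appended $((after - before)) junk bytes"   # prints: appended 4 junk bytes
rm -f "$tmp"
```

After the corrupted copy is put back into HDFS (steps 4-5), the TFile magic check fails on read, which is what surfaces as "Not a valid BCFile." above.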
[jira] [Updated] (YARN-10207) CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated logs on the JobHistoryServer Web UI
[ https://issues.apache.org/jira/browse/YARN-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated YARN-10207:
---
Description: Issue reproduced using the steps above (verbatim repeat of the previous update's description).
[jira] [Assigned] (YARN-10207) CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated logs on the JobHistoryServer Web UI
[ https://issues.apache.org/jira/browse/YARN-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja reassigned YARN-10207:
--
Assignee: Siddharth Ahuja

> CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated
> logs on the JobHistoryServer Web UI
> -
>
> Key: YARN-10207
> URL: https://issues.apache.org/jira/browse/YARN-10207
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Siddharth Ahuja
> Assignee: Siddharth Ahuja
> Priority: Major
[jira] [Created] (YARN-10207) CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated logs on the JobHistoryServer Web UI
Siddharth Ahuja created YARN-10207:
--
Summary: CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated logs on the JobHistoryServer Web UI
Key: YARN-10207
URL: https://issues.apache.org/jira/browse/YARN-10207
Project: Hadoop YARN
Issue Type: Bug
Components: yarn
Reporter: Siddharth Ahuja

Issue reproduced using the steps above (the description is identical to the one in the updates above).
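The CLOSE_WAIT build-up named in the summary can be watched on the JHS host while the corrupted-log page is reloaded. A rough Linux-only sketch; parsing `/proc/net/tcp` (where state code `08` is CLOSE_WAIT) is an assumption about the environment, not something taken from the report:

```shell
# Count sockets currently in CLOSE_WAIT (state code 08, column 4 of
# /proc/net/tcp and /proc/net/tcp6). Run repeatedly while hitting the
# corrupted-log page; a count that only grows indicates the leak.
close_wait_count() {
  awk 'FNR > 1 && $4 == "08"' /proc/net/tcp /proc/net/tcp6 2>/dev/null | wc -l
}
close_wait_count
```

Filtering by the JobHistoryServer's ports (e.g. with `ss` or `lsof` against the JHS pid) would narrow this to the leaking process; the raw count above is the coarsest possible signal.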
[jira] [Commented] (YARN-10160) Add auto queue creation related configs to RMWebService#CapacitySchedulerQueueInfo
[ https://issues.apache.org/jira/browse/YARN-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066406#comment-17066406 ] Hadoop QA commented on YARN-10160:
--

| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 57s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 22s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 18s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 9 new + 78 unchanged - 0 fixed = 87 total (was 78) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 3s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 36s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 45s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m 36s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:4454c6d14b7 |
| JIRA Issue | YARN-10160 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12997422/YARN-10160-005.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 33dacd869cba 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d353b30 |
| maven | version: Apache Maven