[jira] [Commented] (YARN-10029) Add option to UIv2 to get container logs from the new JHS API

2020-01-20 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019898#comment-17019898
 ] 

Adam Antal commented on YARN-10029:
---

Uploaded patch v2, which is still a work in progress.

During testing it turned out that we need CORS support for the JHS web UI - 
filed YARN-10097 for this improvement. Until that is resolved, this issue is 
blocked.

> Add option to UIv2 to get container logs from the new JHS API
> -
>
> Key: YARN-10029
> URL: https://issues.apache.org/jira/browse/YARN-10029
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-10029.001.patch
>
>
> Provided the new API is ready to use (also integrated into JHS in 
> YARN-10028), we can add a new config option to UIv2 that would make UIv2 
> request logs from the JHS API, similarly to ATSv2.
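
As a purely illustrative sketch of what such an option could look like (the 
first key below is hypothetical - it stands in for the new option this issue 
proposes - while the second key exists today):

{code:java}
Configuration conf = new Configuration();
// Hypothetical UIv2 switch, assumed name only:
conf.setBoolean("yarn.webapp.ui2.jhs-logs.enabled", true);
// Existing key from which UIv2 could learn the JHS web endpoint:
conf.set("mapreduce.jobhistory.webapp.address", "jhs-host:19888");
{code}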



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10097) Cross origin request support for Job history server web UI

2020-01-20 Thread Adam Antal (Jira)
Adam Antal created YARN-10097:
-

 Summary: Cross origin request support for Job history server web UI
 Key: YARN-10097
 URL: https://issues.apache.org/jira/browse/YARN-10097
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn
Affects Versions: 3.3.0
Reporter: Adam Antal
Assignee: Adam Antal


The major web UIs in YARN support configuring the CORS headers, but somehow 
there has not been a use case for the JHS web UI so far. We need it in 
YARN-10029, so let's implement it.
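
For context, hadoop-common already ships a cross-origin filter driven by the 
hadoop.http.cross-origin.* keys; a minimal sketch of the kind of configuration 
involved is below. The JHS-specific enable key is an assumption for 
illustration, not taken from any patch:

{code:java}
Configuration conf = new Configuration();
// Existing mechanism in hadoop-common: the cross-origin filter and its
// hadoop.http.cross-origin.* keys.
conf.set("hadoop.http.filter.initializers",
    "org.apache.hadoop.security.HttpCrossOriginFilterInitializer");
conf.set("hadoop.http.cross-origin.allowed-origins", "*");
conf.set("hadoop.http.cross-origin.allowed-methods", "GET,POST,HEAD");
conf.set("hadoop.http.cross-origin.allowed-headers",
    "X-Requested-With,Content-Type,Accept,Origin");
// Assumed JHS-specific switch (named by analogy with the RM/NM keys,
// not confirmed from any patch):
conf.setBoolean("mapreduce.jobhistory.webapp.cross-origin.enabled", true);
{code}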



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10095) Fix help message for yarn rmadmin

2020-01-20 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated YARN-10095:
--
Description: 
(This issue was identified by [~aajisaka] in 
https://issues.apache.org/jira/browse/HADOOP-16753)

The help message of yarn rmadmin seems broken.

Current:  
{code:java}
$ yarn rmadmin -Dyarn.resourcemanager.ha.enabled=true -help 
-refreshNodesResources
Usage: yarn rmadmin [-refreshNodesResources]

Generic options supported are:
-conf <configuration file>          specify an application configuration file
-D <property=value>                 define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, 
overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <file1,...>                  specify a comma-separated list of files to be 
copied to the map reduce cluster
-libjars <jar1,...>                 specify a comma-separated list of jar files 
to be included in the classpath
-archives <archive1,...>            specify a comma-separated list of archives to 
be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
{code}
 

Expected: 
{code:java}
$ yarn rmadmin -Dyarn.resourcemanager.ha.enabled=true -help 
-refreshNodesResources 
Usage: yarn rmadmin [-refreshNodesResources]

   -refreshNodesResources: Refresh resources of NodeManagers at the 
ResourceManager.< HERE

Generic options supported are:
-conf <configuration file>          specify an application configuration file
-D <property=value>                 define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, 
overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <file1,...>                  specify a comma-separated list of files to be 
copied to the map reduce cluster
-libjars <jar1,...>                 specify a comma-separated list of jar files 
to be included in the classpath
-archives <archive1,...>            specify a comma-separated list of archives to 
be unarchived on the compute machines

The general command line syntax is: 
command [genericOptions] [commandOptions]
{code}
 

 

  was:
(This issue was identified by [~aajisaka] in 
https://issues.apache.org/jira/browse/HADOOP-16753)

The help message of yarn rmadmin seems broken.

Current:  
{code:java}
$ yarn rmadmin -Dyarn.resourcemanager.ha.enabled=true -help 
-refreshNodesResources
Usage: yarn rmadmin [-refreshNodesResources]

Generic options supported are:
-conf <configuration file>          specify an application configuration file
-D <property=value>                 define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, 
overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <file1,...>                  specify a comma-separated list of files to be 
copied to the map reduce cluster
-libjars <jar1,...>                 specify a comma-separated list of jar files 
to be included in the classpath
-archives <archive1,...>            specify a comma-separated list of archives to 
be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
{code}
 

Expected: 
{code:java}
$ yarn rmadmin -Dyarn.resourcemanager.ha.enabled=true -help 
-refreshNodesResources Usage: yarn rmadmin [-refreshNodesResources]

Generic options supported are: -conf <configuration file> specify an 
application configuration file -D <property=value> define a value for a given 
property -fs <file:///|hdfs://namenode:port> specify default filesystem URL to 
use, overrides 'fs.defaultFS' property from configurations. -jt 
<local|resourcemanager:port> specify a ResourceManager -files <file1,...> 
specify a comma-separated list of files to be copied to the map reduce cluster 
-libjars <jar1,...> specify a comma-separated list of jar files to be included 
in the classpath -archives <archive1,...> specify a comma-separated list of 
archives to be unarchived on the compute machines

The general command line syntax is: command [genericOptions] [commandOptions]
{code}
 

 


> Fix help message for yarn rmadmin
> -
>
> Key: YARN-10095
> URL: https://issues.apache.org/jira/browse/YARN-10095
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Minor
>
> (This issue was identified by [~aajisaka] in 
> https://issues.apache.org/jira/browse/HADOOP-16753)
> The help message of yarn rmadmin seems broken.
> Current:  
> {code:java}
> $ yarn rmadmin -Dyarn.resourcemanager.ha.enabled=true -help 
> -refreshNodesResources
> Usage: yarn rmadmin [-refreshNodesResources]
> Generic options supported are:
> -conf <configuration file>          specify an application configuration file
> -D <property=value>                 define a value for a given property
> -fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, 
> overrides 'fs.defaultFS' property from configurations.
> -jt <local|resourcemanager:port>    specify a ResourceManager
> -files <file1,...>                  specify a comma-separated list of files to 
> be copied to the map reduce cluster
> -libjars <jar1,...>                 specify a comma-separated list of jar files 
> to be included 

[jira] [Updated] (YARN-10095) Fix help message for yarn rmadmin

2020-01-20 Thread Xieming Li (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xieming Li updated YARN-10095:
--
Description: 
(This issue was identified by [~aajisaka] in 
https://issues.apache.org/jira/browse/HADOOP-16753)

The help message of yarn rmadmin seems broken.

Current:  
{code:java}
$ yarn rmadmin -Dyarn.resourcemanager.ha.enabled=true -help 
-refreshNodesResources
Usage: yarn rmadmin [-refreshNodesResources]

Generic options supported are:
-conf <configuration file>          specify an application configuration file
-D <property=value>                 define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, 
overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <file1,...>                  specify a comma-separated list of files to be 
copied to the map reduce cluster
-libjars <jar1,...>                 specify a comma-separated list of jar files 
to be included in the classpath
-archives <archive1,...>            specify a comma-separated list of archives to 
be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
{code}
 

Expected: 
{code:java}
$ yarn rmadmin -Dyarn.resourcemanager.ha.enabled=true -help 
-refreshNodesResources Usage: yarn rmadmin [-refreshNodesResources]

Generic options supported are: -conf <configuration file> specify an 
application configuration file -D <property=value> define a value for a given 
property -fs <file:///|hdfs://namenode:port> specify default filesystem URL to 
use, overrides 'fs.defaultFS' property from configurations. -jt 
<local|resourcemanager:port> specify a ResourceManager -files <file1,...> 
specify a comma-separated list of files to be copied to the map reduce cluster 
-libjars <jar1,...> specify a comma-separated list of jar files to be included 
in the classpath -archives <archive1,...> specify a comma-separated list of 
archives to be unarchived on the compute machines

The general command line syntax is: command [genericOptions] [commandOptions]
{code}
 

 

  was:
(This issue was identified by [~aajisaka] in 
https://issues.apache.org/jira/browse/HADOOP-16753)

The help message of yarn rmadmin seems broken.

Current:  
{code:java}
$ yarn rmadmin -help refreshNodes 2>/dev/null
$
$ yarn rmadmin -help refreshNodes
Usage: yarn rmadmin
   -refreshQueues
   -refreshNodes [-g|graceful [timeout in seconds] -client|server]
   -refreshNodesResources
   -refreshSuperUserGroupsConfiguration
   -refreshUserToGroupsMappings
   -refreshAdminAcls
   -refreshServiceAcl
   -getGroups [username]
   -addToClusterNodeLabels 
<"label1(exclusive=true),label2(exclusive=false),label3">
   -removeFromClusterNodeLabels <label1,label2,label3> (label splitted by ",")
   -replaceLabelsOnNode <"node1[:port]=label1,label2 
node2[:port]=label1,label2"> [-failOnUnknownNodes]
   -directlyAccessNodeLabelStore
   -refreshClusterMaxPriority
   -updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
or
[NodeID] [resourcetypes] ([OvercommitTimeout]).
   -help [cmd]Generic options supported are:
-conf <configuration file>          specify an application configuration file
-D <property=value>                 define a value for a given property
-fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, 
overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <file1,...>                  specify a comma-separated list of files to be 
copied to the map reduce cluster
-libjars <jar1,...>                 specify a comma-separated list of jar files 
to be included in the classpath
-archives <archive1,...>            specify a comma-separated list of archives to 
be unarchived on the compute machinesThe general command line syntax is:
command [genericOptions] [commandOptions]
{code}
 

 

Expected: 
{code:java}
$ yarn rmadmin -help refreshNodes 2>/dev/null
 -refreshNodes [-g|graceful [timeout in seconds] -client|server]

$ yarn rmadmin -help refreshNodes
 -refreshNodes [-g|graceful [timeout in seconds] -client|server]
{code}
 

 


> Fix help message for yarn rmadmin
> -
>
> Key: YARN-10095
> URL: https://issues.apache.org/jira/browse/YARN-10095
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Minor
>
> (This issue was identified by [~aajisaka] in 
> https://issues.apache.org/jira/browse/HADOOP-16753)
> The help message of yarn rmadmin seems broken.
> Current:  
> {code:java}
> $ yarn rmadmin -Dyarn.resourcemanager.ha.enabled=true -help 
> -refreshNodesResources
> Usage: yarn rmadmin [-refreshNodesResources]
> Generic options supported are:
> -conf <configuration file>          specify an application configuration file
> -D <property=value>                 define a value for a given property
> -fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, 
> overrides 'fs.defaultFS' property from configurations.
> -jt <local|resourcemanager:port>    specify a ResourceManager
> -files <file1,...>                  specify a comma-separated list of files to 
> be copied to the map reduce cluster
> -libjars <jar1,...>                 specify a comma-separated list of jar 

[jira] [Comment Edited] (YARN-10095) Fix help message for yarn rmadmin

2020-01-20 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019791#comment-17019791
 ] 

Xieming Li edited comment on YARN-10095 at 1/21/20 1:29 AM:


[~yehuanhuan]

Thank you for pointing this out.

My original description was not correct, and I have fixed it.

 

 


was (Author: risyomei):
[~yehuanhuan]

Thank you for pointing this out.

I have checked "hdfs haadmin -help", and it takes commands without a leading '-'.

Now, I am not sure whether I should fix "yarn rmadmin" here, fix "hdfs 
haadmin", or not fix either of them at all.

> Fix help message for yarn rmadmin
> -
>
> Key: YARN-10095
> URL: https://issues.apache.org/jira/browse/YARN-10095
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Minor
>
> (This issue was identified by [~aajisaka] in 
> https://issues.apache.org/jira/browse/HADOOP-16753)
> The help message of yarn rmadmin seems broken.
> Current:  
> {code:java}
> $ yarn rmadmin -help refreshNodes 2>/dev/null
> $
> $ yarn rmadmin -help refreshNodes
> Usage: yarn rmadmin
>-refreshQueues
>-refreshNodes [-g|graceful [timeout in seconds] -client|server]
>-refreshNodesResources
>-refreshSuperUserGroupsConfiguration
>-refreshUserToGroupsMappings
>-refreshAdminAcls
>-refreshServiceAcl
>-getGroups [username]
>-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">
>-removeFromClusterNodeLabels <label1,label2,label3> (label splitted by ",")
>-replaceLabelsOnNode <"node1[:port]=label1,label2 
> node2[:port]=label1,label2"> [-failOnUnknownNodes]
>-directlyAccessNodeLabelStore
>-refreshClusterMaxPriority
>-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
> or
> [NodeID] [resourcetypes] ([OvercommitTimeout]).
>-help [cmd]Generic options supported are:
> -conf <configuration file>          specify an application configuration file
> -D <property=value>                 define a value for a given property
> -fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, 
> overrides 'fs.defaultFS' property from configurations.
> -jt <local|resourcemanager:port>    specify a ResourceManager
> -files <file1,...>                  specify a comma-separated list of files to 
> be copied to the map reduce cluster
> -libjars <jar1,...>                 specify a comma-separated list of jar files 
> to be included in the classpath
> -archives <archive1,...>            specify a comma-separated list of archives 
> to be unarchived on the compute machinesThe general command line syntax is:
> command [genericOptions] [commandOptions]
> {code}
>  
>  
> Expected: 
> {code:java}
> $ yarn rmadmin -help refreshNodes 2>/dev/null
>  -refreshNodes [-g|graceful [timeout in seconds] -client|server]
> $ yarn rmadmin -help refreshNodes
>  -refreshNodes [-g|graceful [timeout in seconds] -client|server]
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10095) Fix help message for yarn rmadmin

2020-01-20 Thread Xieming Li (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019791#comment-17019791
 ] 

Xieming Li commented on YARN-10095:
---

[~yehuanhuan]

Thank you for pointing this out.

I have checked "hdfs haadmin -help", and it takes commands without a leading '-'.

Now, I am not sure whether I should fix "yarn rmadmin" here, fix "hdfs 
haadmin", or not fix either of them at all.
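
As a purely editorial sketch (not taken from any attached patch), one way to 
make both CLIs forgiving would be to normalize the command name before 
matching, so "-help refreshNodes" and "-help -refreshNodes" resolve to the 
same usage entry:

{code:java}
// Hypothetical helper, assumed names: strip an optional leading '-'
// before looking the command up in the usage map.
private static String normalizeCommand(String cmd) {
  return (cmd != null && cmd.startsWith("-")) ? cmd.substring(1) : cmd;
}

// usage: printUsage(normalizeCommand(args[i]));
{code}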

> Fix help message for yarn rmadmin
> -
>
> Key: YARN-10095
> URL: https://issues.apache.org/jira/browse/YARN-10095
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Minor
>
> (This issue was identified by [~aajisaka] in 
> https://issues.apache.org/jira/browse/HADOOP-16753)
> The help message of yarn rmadmin seems broken.
> Current:  
> {code:java}
> $ yarn rmadmin -help refreshNodes 2>/dev/null
> $
> $ yarn rmadmin -help refreshNodes
> Usage: yarn rmadmin
>-refreshQueues
>-refreshNodes [-g|graceful [timeout in seconds] -client|server]
>-refreshNodesResources
>-refreshSuperUserGroupsConfiguration
>-refreshUserToGroupsMappings
>-refreshAdminAcls
>-refreshServiceAcl
>-getGroups [username]
>-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">
>-removeFromClusterNodeLabels <label1,label2,label3> (label splitted by ",")
>-replaceLabelsOnNode <"node1[:port]=label1,label2 
> node2[:port]=label1,label2"> [-failOnUnknownNodes]
>-directlyAccessNodeLabelStore
>-refreshClusterMaxPriority
>-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
> or
> [NodeID] [resourcetypes] ([OvercommitTimeout]).
>-help [cmd]Generic options supported are:
> -conf <configuration file>          specify an application configuration file
> -D <property=value>                 define a value for a given property
> -fs <file:///|hdfs://namenode:port> specify default filesystem URL to use, 
> overrides 'fs.defaultFS' property from configurations.
> -jt <local|resourcemanager:port>    specify a ResourceManager
> -files <file1,...>                  specify a comma-separated list of files to 
> be copied to the map reduce cluster
> -libjars <jar1,...>                 specify a comma-separated list of jar files 
> to be included in the classpath
> -archives <archive1,...>            specify a comma-separated list of archives 
> to be unarchived on the compute machinesThe general command line syntax is:
> command [genericOptions] [commandOptions]
> {code}
>  
>  
> Expected: 
> {code:java}
> $ yarn rmadmin -help refreshNodes 2>/dev/null
>  -refreshNodes [-g|graceful [timeout in seconds] -client|server]
> $ yarn rmadmin -help refreshNodes
>  -refreshNodes [-g|graceful [timeout in seconds] -client|server]
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10083) Provide utility to ask whether an application is in final status

2020-01-20 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019727#comment-17019727
 ] 

Hadoop QA commented on YARN-10083:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  8m 
40s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  6m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
21m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
13s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
28s{color} | {color:green} root: The patch generated 0 new + 259 unchanged - 3 
fixed = 259 total (was 262) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 58s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  8m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
57s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
39s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
41s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 82m 
36s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 26m 
11s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
5s{color} | {color:green} hadoop-yarn-server-timeline-pluginstorage in the 
patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}122m 23s{color} 
| {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
44s{color} | {color:green} hadoop-dynamometer-infra in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  1m  
1s{color} | {color:red} 

[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry

2020-01-20 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019628#comment-17019628
 ] 

Manikandan R commented on YARN-9768:


[~inigoiri] Can you please review his comment and commit the code?

> RM Renew Delegation token thread should timeout and retry
> -
>
> Key: YARN-9768
> URL: https://issues.apache.org/jira/browse/YARN-9768
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9768.001.patch, YARN-9768.002.patch, 
> YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, 
> YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch
>
>
> The delegation token renewer thread in the RM (DelegationTokenRenewer.java) 
> renews the received HDFS tokens to check their validity and expiration time.
> This call is made to an underlying HDFS NN or Router node (which has the same 
> APIs as the HDFS NN). If one of the nodes is bad and the renew call gets 
> stuck, the thread remains stuck indefinitely. The thread should ideally time 
> out the renewToken call and retry from the client's perspective.
>  
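
A minimal editorial sketch of the timeout-and-retry idea (this is not the 
attached patches; the class and method names are assumed for illustration):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.token.Token;

final class BoundedRenewer {
  private final ExecutorService pool = Executors.newSingleThreadExecutor();

  // Bound Token#renew(Configuration) with a timeout so a hung NN/Router
  // cannot block the renewer thread forever; the caller can retry later.
  long renewWithTimeout(Token<?> token, Configuration conf, long timeoutSec)
      throws Exception {
    Future<Long> future = pool.submit(() -> token.renew(conf));
    try {
      return future.get(timeoutSec, TimeUnit.SECONDS); // new expiration time
    } catch (TimeoutException e) {
      future.cancel(true); // interrupt the stuck renew attempt
      throw e;
    }
  }
}
{code}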



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9462) TestResourceTrackerService.testNodeRemovalGracefully fails sporadically

2020-01-20 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019563#comment-17019563
 ] 

Hadoop QA commented on YARN-9462:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 
53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
37s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 17s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 48s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
|   | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:0f25cbbb251 |
| JIRA Issue | YARN-9462 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12991367/YARN-9462-branch-3.2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 2db390d964b6 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / c411a14 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/25409/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25409/testReport/ |
| Max. process+thread count | 795 (vs. ulimit of 5500) |
| 

[jira] [Commented] (YARN-10085) FS-CS converter: remove mixed ordering policy check

2020-01-20 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019548#comment-17019548
 ] 

Peter Bacsko commented on YARN-10085:
-

[~snemeth] / [~adam.antal] could you please review this patch?

> FS-CS converter: remove mixed ordering policy check
> ---
>
> Key: YARN-10085
> URL: https://issues.apache.org/jira/browse/YARN-10085
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: YARN-10085-001.patch, YARN-10085-002.patch
>
>
> In the converter, this part is very strict and probably unnecessary:
> {noformat}
> // Validate ordering policy
> if (queueConverter.isDrfPolicyUsedOnQueueLevel()) {
>   if (queueConverter.isFifoOrFairSharePolicyUsed()) {
> throw new ConversionException(
> "DRF ordering policy cannot be used together with fifo/fair");
>   } else {
> capacitySchedulerConfig.set(
> CapacitySchedulerConfiguration.RESOURCE_CALCULATOR_CLASS,
> DominantResourceCalculator.class.getCanonicalName());
>   }
> }
> {noformat}
> It's also misleading, because the Fair policy can be used under DRF, so the 
> error message is incorrect.
> Let's remove these checks and rewrite the converter so that it generates a 
> valid config even if fair/drf is somehow mixed.
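
An editorial sketch of one possible replacement, reusing the identifiers from 
the quoted snippet (a direction, not the attached patch):

{code:java}
// If DRF appears anywhere on the queue level, switch the CS resource
// calculator to DominantResourceCalculator instead of throwing. A
// multi-resource calculator is also a valid choice for fifo/fair queues,
// so a mixed setup still yields a working config.
if (queueConverter.isDrfPolicyUsedOnQueueLevel()) {
  capacitySchedulerConfig.set(
      CapacitySchedulerConfiguration.RESOURCE_CALCULATOR_CLASS,
      DominantResourceCalculator.class.getCanonicalName());
}
{code}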



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10085) FS-CS converter: remove mixed ordering policy check

2020-01-20 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019516#comment-17019516
 ] 

Hadoop QA commented on YARN-10085:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 82m 
20s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | YARN-10085 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12991372/YARN-10085-002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux e6e4cdad30a9 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6d52bbb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_232 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25408/testReport/ |
| Max. process+thread count | 885 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25408/console |
| 

[jira] [Commented] (YARN-9742) [JDK11] TestTimelineWebServicesWithSSL.testPutEntities fails

2020-01-20 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019484#comment-17019484
 ] 

Adam Antal commented on YARN-9742:
--

Thanks for checking this [~weichiu], [~aajisaka].

> [JDK11] TestTimelineWebServicesWithSSL.testPutEntities fails
> 
>
> Key: YARN-9742
> URL: https://issues.apache.org/jira/browse/YARN-9742
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineservice
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Major
>
> Tested on openjdk-11.0.2 on a Mac.
> Stack trace:
> {noformat}
> [ERROR] Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 8.206 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL
> [ERROR] 
> testPutEntities(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL)
>   Time elapsed: 0.366 s  <<< ERROR!
> com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: HTTPS 
> hostname wrong:  should be <0.0.0.0>
>   at 
> com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineJerseyRetryFilter$1.run(TimelineConnector.java:392)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientConnectionRetry.retryOn(TimelineConnector.java:335)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineJerseyRetryFilter.handle(TimelineConnector.java:405)
>   at com.sun.jersey.api.client.Client.handle(Client.java:652)
>   at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
>   at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
>   at 
> com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:570)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPostingObject(TimelineWriter.java:152)
>   at 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL$TestTimelineClient$1.doPostingObject(TestTimelineWebServicesWithSSL.java:139)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:115)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:112)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPosting(TimelineWriter.java:112)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineWriter.putEntities(TimelineWriter.java:92)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:178)
>   at 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServicesWithSSL.testPutEntities(TestTimelineWebServicesWithSSL.java:110)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> 

[jira] [Commented] (YARN-10083) Provide utility to ask whether an application is in final status

2020-01-20 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019477#comment-17019477
 ] 

Adam Antal commented on YARN-10083:
---

Uploaded patch v3 with the modifications. Thanks for the review [~snemeth]!

> Provide utility to ask whether an application is in final status
> 
>
> Key: YARN-10083
> URL: https://issues.apache.org/jira/browse/YARN-10083
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Minor
> Attachments: YARN-10083.001.patch, YARN-10083.002.patch, 
> YARN-10083.002.patch, YARN-10083.003.patch, YARN-10083.branch-3.2.001.patch
>
>
> This piece of code is duplicated in many places across the Hadoop repo:
> {code:java}
>   public static boolean isApplicationFinalState(YarnApplicationState 
> appState) {
> return appState == YarnApplicationState.FINISHED
> || appState == YarnApplicationState.FAILED
> || appState == YarnApplicationState.KILLED;
>   }
> {code}
> This functionality is also used heavily by log aggregation, so we could do 
> some cleanup here.
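
A hedged sketch of what call sites could look like once the helper is shared 
(the utility class name is assumed for illustration, not taken from the patch):

{code:java}
// Hypothetical consolidated call site, e.g. in log aggregation code:
YarnApplicationState state = report.getYarnApplicationState();
if (ApplicationStateUtils.isApplicationFinalState(state)) {
  // FINISHED / FAILED / KILLED: the app cannot change state any more,
  // so its aggregated logs can be treated as complete.
}
{code}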



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10083) Provide utility to ask whether an application is in final status

2020-01-20 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-10083:
--
Attachment: YARN-10083.003.patch

> Provide utility to ask whether an application is in final status
> 
>
> Key: YARN-10083
> URL: https://issues.apache.org/jira/browse/YARN-10083
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Minor
> Attachments: YARN-10083.001.patch, YARN-10083.002.patch, 
> YARN-10083.002.patch, YARN-10083.003.patch, YARN-10083.branch-3.2.001.patch
>
>
> This piece of code is duplicated in many places across the Hadoop repo:
> {code:java}
>   public static boolean isApplicationFinalState(YarnApplicationState 
> appState) {
> return appState == YarnApplicationState.FINISHED
> || appState == YarnApplicationState.FAILED
> || appState == YarnApplicationState.KILLED;
>   }
> {code}
> This functionality is also used heavily by log aggregation, so we could do 
> some cleanup here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-20 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019444#comment-17019444
 ] 

Hudson commented on YARN-7913:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17878 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17878/])
YARN-7913. Improve error handling when application recovery fails with 
(snemeth: rev 581072a8f04f7568d3560f105fd1988d3acc9e54)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java


> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch, YARN-7913.003.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when recovery of the applicationAttempt is 
> attempted 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.
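
A condensed sketch of the direction the fix takes, based on the FairScheduler 
snippet quoted later in this digest (simplified, not the full patch):

{code:java}
// Inside FairScheduler#addApplication: only reject synchronously for new
// submissions; a recovering application must not abort the recovery thread.
if (!isAppRecovering) {
  rejectApplicationWithMessage(applicationId,
      queueName + " is not a leaf queue");
  return;
}
// For a recovering application, fall through and handle the invalid queue
// without throwing, so one bad app cannot fail the whole passive -> active
// transition.
{code}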



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-20 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019440#comment-17019440
 ] 

Szilard Nemeth commented on YARN-7913:
--

Hi [~wilfreds],
Do you want to add branch-3.2 / branch-3.1 patches as well?

> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch, YARN-7913.003.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when recovery of the applicationAttempt is 
> attempted 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-20 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019438#comment-17019438
 ] 

Szilard Nemeth commented on YARN-7913:
--

Committed to trunk.
Thanks [~wilfreds] for the contribution and [~grepas] for the initial testing 
and the PoC.

> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch, YARN-7913.003.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when recovery of the applicationAttempt is 
> attempted 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-20 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-7913:
-
Fix Version/s: 3.3.0

> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch, YARN-7913.003.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when recovery of the applicationAttempt is 
> attempted 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7913) Improve error handling when application recovery fails with exception

2020-01-20 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019436#comment-17019436
 ] 

Szilard Nemeth commented on YARN-7913:
--

Hi [~wilfreds],

1. I see.
2. Ack'd
3. Most probably I was tired when checking the code, as I had not noticed the 
return statement in this block:

{code:java}
if (!isAppRecovering) {
  rejectApplicationWithMessage(applicationId,
  queueName + " is not a leaf queue");
  return;
}
{code}
So all good with this.

4. Thanks
5. Okay, I see the point now, got it.
6. Sure, just wanted to ask you about this.

So overall, the patch LGTM, committing to trunk.


> Improve error handling when application recovery fails with exception
> -
>
> Key: YARN-7913
> URL: https://issues.apache.org/jira/browse/YARN-7913
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0
>Reporter: Gergo Repas
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-7913.000.poc.patch, YARN-7913.001.patch, 
> YARN-7913.002.patch, YARN-7913.003.patch
>
>
> There are edge cases when the application recovery fails with an exception.
> Example failure scenario:
>  * setup: a queue is a leaf queue in the primary RM's config and the same 
> queue is a parent queue in the secondary RM's config.
>  * When failover happens with this setup, the recovery will fail for 
> applications on this queue, and an APP_REJECTED event will be dispatched to 
> the async dispatcher. On the same thread (that handles the recovery), a 
> NullPointerException is thrown when recovery of the applicationAttempt is 
> attempted 
> (https://github.com/apache/hadoop/blob/55066cc53dc22b68f9ca55a0029741d6c846be0a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java#L494).
>  I don't see a good way to avoid the NPE in this scenario, because when the 
> NPE occurs the APP_REJECTED has not been processed yet, and we don't know 
> that the application recovery failed.
> Currently the first exception will abort the recovery, and if there are X 
> applications, there will be ~X passive -> active RM transition attempts - the 
> passive -> active RM transition will only succeed when the last APP_REJECTED 
> event is processed on the async dispatcher thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9525) IFile format is not working against s3a remote folder

2020-01-20 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019433#comment-17019433
 ] 

Adam Antal commented on YARN-9525:
--

Thanks for the commit [~snemeth]!

HADOOP-15691 has Fix Version 3.3.0 (it is not backported to 3.2/3.1) and this 
commit depends on that one, therefore cherry-picking is not needed.

> IFile format is not working against s3a remote folder
> -
>
> Key: YARN-9525
> URL: https://issues.apache.org/jira/browse/YARN-9525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 3.1.2
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: IFile-S3A-POC01.patch, YARN-9525-001.patch, 
> YARN-9525.002.patch, YARN-9525.003.patch, YARN-9525.004.patch, 
> YARN-9525.005.patch, YARN-9525.006.patch, YARN-9525.006.patch, 
> YARN-9525.007.patch
>
>
> Using the IndexedFileFormat with {{yarn.nodemanager.remote-app-log-dir}} 
> configured to an s3a URI throws the following exception during log 
> aggregation:
> {noformat}
> Cannot create writer for app application_1556199768861_0001. Skip log upload 
> this time. 
> java.io.IOException: java.io.FileNotFoundException: No such file or 
> directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
>   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>   at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
>   ... 7 more
> {noformat}
> This stack trace points to 
> {{LogAggregationIndexedFileController$initializeWriter}}, where we do the 
> following steps (in a non-rolling log aggregation setup):
> - create an FSDataOutputStream
> - write out a UUID
> - flush
> - immediately after that, call GetFileStatus to get the length of the log 
> file (the bytes we just wrote out), and that's where the failure happens: 
> the file is not there yet due to eventual consistency.
> Maybe we can get rid of that, so we can use the IFile format against an s3a 
> target.
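
For illustration, a self-contained sketch of the failing write-then-stat 
sequence described above; the bucket and path are hypothetical, and it assumes 
the Hadoop FileContext API:

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.EnumSet;
import java.util.UUID;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;

public class EventualConsistencyRepro {
  public static void main(String[] args) throws Exception {
    FileContext fc = FileContext.getFileContext(new Configuration());
    // hypothetical remote log path
    Path logFile = new Path("s3a://some-bucket/logs/app_0001/node_8041");

    // steps 1-3 from the description: create the stream, write a UUID, flush
    try (FSDataOutputStream out = fc.create(logFile,
        EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE))) {
      out.write(UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
      out.hflush();

      // step 4: stat the file to learn its length. On an eventually
      // consistent store the key may not be visible yet, so this can throw
      // FileNotFoundException even though the write above "succeeded".
      long len = fc.getFileStatus(logFile).getLen();
      System.out.println("current length: " + len);
    }
  }
}
{code}

The fix direction suggested in the description is to drop that GetFileStatus 
call and track the number of bytes written locally instead.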



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9525) IFile format is not working against s3a remote folder

2020-01-20 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019428#comment-17019428
 ] 

Hudson commented on YARN-9525:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17877 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17877/])
YARN-9525. IFile format is not working against s3a remote folder. (snemeth: rev 
6d52bbbfcfd7750b7e547abdcd0d14632d6ed9b6)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/filecontroller/ifile/LogAggregationIndexedFileController.java


> IFile format is not working against s3a remote folder
> -
>
> Key: YARN-9525
> URL: https://issues.apache.org/jira/browse/YARN-9525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 3.1.2
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: IFile-S3A-POC01.patch, YARN-9525-001.patch, 
> YARN-9525.002.patch, YARN-9525.003.patch, YARN-9525.004.patch, 
> YARN-9525.005.patch, YARN-9525.006.patch, YARN-9525.006.patch, 
> YARN-9525.007.patch
>
>
> Using the IndexedFileFormat with {{yarn.nodemanager.remote-app-log-dir}} 
> configured to an s3a URI throws the following exception during log 
> aggregation:
> {noformat}
> Cannot create writer for app application_1556199768861_0001. Skip log upload 
> this time. 
> java.io.IOException: java.io.FileNotFoundException: No such file or 
> directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
>   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>   at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
>   ... 7 more
> {noformat}
> This stack trace points to 
> {{LogAggregationIndexedFileController$initializeWriter}}, where we do the 
> following steps (in a non-rolling log aggregation setup):
> - create an FSDataOutputStream
> - write out a UUID
> - flush
> - immediately after that, call GetFileStatus to get the length of the log 
> file (the bytes we just wrote out), and that's where the failure happens: 
> the file is not there yet due to eventual consistency.
> Maybe we can get rid of that, so we can use the IFile format against an s3a 
> target.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, 

[jira] [Comment Edited] (YARN-9525) IFile format is not working against s3a remote folder

2020-01-20 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019412#comment-17019412
 ] 

Szilard Nemeth edited comment on YARN-9525 at 1/20/20 11:40 AM:


Thanks [~adam.antal] for this patch.
Just to be on the safe side, I went through and read all the comments in this 
jira again; it still makes sense.
Latest patch LGTM, committed to trunk.
Latest patch LGTM, committed to trunk.
Thanks [~pbacsko] for the POC patch and [~wangda], [~ste...@apache.org] for the 
reviews.



was (Author: snemeth):
Thanks [~adam.antal], 
Latest patch LGTM, committed to trunk.
Thanks [~pbacsko] for the POC patch and [~wangda], [~ste...@apache.org] for the 
reviews.


> IFile format is not working against s3a remote folder
> -
>
> Key: YARN-9525
> URL: https://issues.apache.org/jira/browse/YARN-9525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 3.1.2
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: IFile-S3A-POC01.patch, YARN-9525-001.patch, 
> YARN-9525.002.patch, YARN-9525.003.patch, YARN-9525.004.patch, 
> YARN-9525.005.patch, YARN-9525.006.patch, YARN-9525.006.patch, 
> YARN-9525.007.patch
>
>
> Using the IndexedFileFormat with {{yarn.nodemanager.remote-app-log-dir}} 
> configured to an s3a URI throws the following exception during log 
> aggregation:
> {noformat}
> Cannot create writer for app application_1556199768861_0001. Skip log upload 
> this time. 
> java.io.IOException: java.io.FileNotFoundException: No such file or 
> directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
>   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>   at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
>   ... 7 more
> {noformat}
> This stack trace points to 
> {{LogAggregationIndexedFileController$initializeWriter}}, where we do the 
> following steps (in a non-rolling log aggregation setup):
> - create an FSDataOutputStream
> - write out a UUID
> - flush
> - immediately after that, call GetFileStatus to get the length of the log 
> file (the bytes we just wrote out), and that's where the failure happens: 
> the file is not there yet due to eventual consistency.
> Maybe we can get rid of that, so we can use the IFile format against an s3a 
> target.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (YARN-9525) IFile format is not working against s3a remote folder

2020-01-20 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019415#comment-17019415
 ] 

Szilard Nemeth commented on YARN-9525:
--

[~adam.antal] Please make sure to add branch-3.2 / branch-3.1 patches if you 
want the backports.

> IFile format is not working against s3a remote folder
> -
>
> Key: YARN-9525
> URL: https://issues.apache.org/jira/browse/YARN-9525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 3.1.2
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: IFile-S3A-POC01.patch, YARN-9525-001.patch, 
> YARN-9525.002.patch, YARN-9525.003.patch, YARN-9525.004.patch, 
> YARN-9525.005.patch, YARN-9525.006.patch, YARN-9525.006.patch, 
> YARN-9525.007.patch
>
>
> Using the IndexedFileFormat with {{yarn.nodemanager.remote-app-log-dir}} 
> configured to an s3a URI throws the following exception during log 
> aggregation:
> {noformat}
> Cannot create writer for app application_1556199768861_0001. Skip log upload 
> this time. 
> java.io.IOException: java.io.FileNotFoundException: No such file or 
> directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
>   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>   at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
>   ... 7 more
> {noformat}
> This stack trace points to 
> {{LogAggregationIndexedFileController$initializeWriter}}, where we do the 
> following steps (in a non-rolling log aggregation setup):
> - create an FSDataOutputStream
> - write out a UUID
> - flush
> - immediately after that, call GetFileStatus to get the length of the log 
> file (the bytes we just wrote out), and that's where the failure happens: 
> the file is not there yet due to eventual consistency.
> Maybe we can get rid of that, so we can use the IFile format against an s3a 
> target.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2020-01-20 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019413#comment-17019413
 ] 

kailiu_dev commented on YARN-2902:
--

Hi [~wilfreds], [~jlowe],

I found a bug: the NM has many dirs such as 
"nm-local-dir/filecache/1889677_tmp" that are not cleaned up and leak on disk. 
Do you know why? My hadoop version is hadoop-2.7.2.

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Darrell Lowe
>Assignee: Varun Saxena
>Priority: Major
> Fix For: 2.8.0, 2.7.2, 2.6.4, 3.0.0-alpha1
>
> Attachments: YARN-2902-branch-2.6.01.patch, YARN-2902.002.patch, 
> YARN-2902.03.patch, YARN-2902.04.patch, YARN-2902.05.patch, 
> YARN-2902.06.patch, YARN-2902.07.patch, YARN-2902.08.patch, 
> YARN-2902.09.patch, YARN-2902.10.patch, YARN-2902.11.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed, 
> then resources are left in the DOWNLOADING state.  If no other container 
> comes along and requests these resources, they linger around with no 
> reference counts, but they aren't cleaned up during normal cache cleanup 
> scans, since the scan will never delete resources in the DOWNLOADING state 
> even if their reference count is zero.
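
As a rough sketch of the cleanup gap described above (the type and method 
names are assumptions, not the actual NodeManager code):

{code:java}
// Sketch only: why a DOWNLOADING resource survives the cache cleanup scan.
for (LocalizedResource rsrc : cachedResources) {
  if (rsrc.getState() == ResourceState.DOWNLOADING) {
    continue; // never deleted here, even when its refcount is zero
  }
  if (rsrc.getRefCount() == 0) {
    delete(rsrc); // only fully localized, unreferenced resources are removed
  }
}
{code}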



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9525) IFile format is not working against s3a remote folder

2020-01-20 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9525:
-
Fix Version/s: 3.3.0

> IFile format is not working against s3a remote folder
> -
>
> Key: YARN-9525
> URL: https://issues.apache.org/jira/browse/YARN-9525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 3.1.2
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: IFile-S3A-POC01.patch, YARN-9525-001.patch, 
> YARN-9525.002.patch, YARN-9525.003.patch, YARN-9525.004.patch, 
> YARN-9525.005.patch, YARN-9525.006.patch, YARN-9525.006.patch, 
> YARN-9525.007.patch
>
>
> Using the IndexedFileFormat with {{yarn.nodemanager.remote-app-log-dir}} 
> configured to an s3a URI throws the following exception during log 
> aggregation:
> {noformat}
> Cannot create writer for app application_1556199768861_0001. Skip log upload 
> this time. 
> java.io.IOException: java.io.FileNotFoundException: No such file or 
> directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
>   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>   at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
>   ... 7 more
> {noformat}
> This stack trace points to 
> {{LogAggregationIndexedFileController$initializeWriter}}, where we do the 
> following steps (in a non-rolling log aggregation setup):
> - create an FSDataOutputStream
> - write out a UUID
> - flush
> - immediately after that, call GetFileStatus to get the length of the log 
> file (the bytes we just wrote out), and that's where the failure happens: 
> the file is not there yet due to eventual consistency.
> Maybe we can get rid of that, so we can use the IFile format against an s3a 
> target.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9525) IFile format is not working against s3a remote folder

2020-01-20 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019412#comment-17019412
 ] 

Szilard Nemeth commented on YARN-9525:
--

Thanks [~adam.antal], 
Latest patch LGTM, committed to trunk.
Thanks [~pbacsko] for the POC patch and [~wangda], [~ste...@apache.org] for the 
reviews.


> IFile format is not working against s3a remote folder
> -
>
> Key: YARN-9525
> URL: https://issues.apache.org/jira/browse/YARN-9525
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation
>Affects Versions: 3.1.2
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: IFile-S3A-POC01.patch, YARN-9525-001.patch, 
> YARN-9525.002.patch, YARN-9525.003.patch, YARN-9525.004.patch, 
> YARN-9525.005.patch, YARN-9525.006.patch, YARN-9525.006.patch, 
> YARN-9525.007.patch
>
>
> Using the IndexedFileFormat with {{yarn.nodemanager.remote-app-log-dir}} 
> configured to an s3a URI throws the following exception during log 
> aggregation:
> {noformat}
> Cannot create writer for app application_1556199768861_0001. Skip log upload 
> this time. 
> java.io.IOException: java.io.FileNotFoundException: No such file or 
> directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
>   at 
> org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244)
>   at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240)
>   at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>   at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195)
>   ... 7 more
> {noformat}
> This stack trace points to 
> {{LogAggregationIndexedFileController$initializeWriter}}, where we do the 
> following steps (in a non-rolling log aggregation setup):
> - create an FSDataOutputStream
> - write out a UUID
> - flush
> - immediately after that, call GetFileStatus to get the length of the log 
> file (the bytes we just wrote out), and that's where the failure happens: 
> the file is not there yet due to eventual consistency.
> Maybe we can get rid of that, so we can use the IFile format against an s3a 
> target.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state

2020-01-20 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019413#comment-17019413
 ] 

kailiu_dev edited comment on YARN-2902 at 1/20/20 11:38 AM:


Hi [~wilfreds], [~jlowe], [~junping_du], [~varun_saxena],

I found a bug: the NM has many dirs such as 
"nm-local-dir/filecache/1889677_tmp" that are not cleaned up and leak on disk. 
Do you know why? My hadoop version is hadoop-2.7.2.


was (Author: kailiu_dev):
Hi [~wilfreds], [~jlowe],

I found a bug: the NM has many dirs such as 
"nm-local-dir/filecache/1889677_tmp" that are not cleaned up and leak on disk. 
Do you know why? My hadoop version is hadoop-2.7.2.

> Killing a container that is localizing can orphan resources in the 
> DOWNLOADING state
> 
>
> Key: YARN-2902
> URL: https://issues.apache.org/jira/browse/YARN-2902
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.5.0
>Reporter: Jason Darrell Lowe
>Assignee: Varun Saxena
>Priority: Major
> Fix For: 2.8.0, 2.7.2, 2.6.4, 3.0.0-alpha1
>
> Attachments: YARN-2902-branch-2.6.01.patch, YARN-2902.002.patch, 
> YARN-2902.03.patch, YARN-2902.04.patch, YARN-2902.05.patch, 
> YARN-2902.06.patch, YARN-2902.07.patch, YARN-2902.08.patch, 
> YARN-2902.09.patch, YARN-2902.10.patch, YARN-2902.11.patch, YARN-2902.patch
>
>
> If a container is in the process of localizing when it is stopped/killed, 
> then resources are left in the DOWNLOADING state.  If no other container 
> comes along and requests these resources, they linger around with no 
> reference counts, but they aren't cleaned up during normal cache cleanup 
> scans, since the scan will never delete resources in the DOWNLOADING state 
> even if their reference count is zero.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-109) .tmp file is not deleted for localized archives

2020-01-20 Thread kailiu_dev (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019411#comment-17019411
 ] 

kailiu_dev commented on YARN-109:
-

[~hudson], I found a bug: the NM has many dirs such as 
"nm-local-dir/filecache/1889677_tmp" that are not cleaned up and leak on disk. 
Do you know why?

> .tmp file is not deleted for localized archives
> ---
>
> Key: YARN-109
> URL: https://issues.apache.org/jira/browse/YARN-109
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 0.23.3, 2.0.0-alpha
>Reporter: Jason Darrell Lowe
>Assignee: Mayank Bansal
>Priority: Major
> Fix For: 0.23.7, 2.1.0-beta
>
> Attachments: YARN-109-trunk-1.patch, YARN-109-trunk-2.patch, 
> YARN-109-trunk-3.patch, YARN-109-trunk-4.patch, YARN-109-trunk-5.patch, 
> YARN-109-trunk.patch
>
>
> When archives are localized they are initially created as a .tmp file and 
> unpacked from that file.  However the .tmp file is not deleted afterwards.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10085) FS-CS converter: remove mixed ordering policy check

2020-01-20 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-10085:

Attachment: YARN-10085-002.patch

> FS-CS converter: remove mixed ordering policy check
> ---
>
> Key: YARN-10085
> URL: https://issues.apache.org/jira/browse/YARN-10085
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: YARN-10085-001.patch, YARN-10085-002.patch
>
>
> In the converter, this part is very strict and probably unnecessary:
> {noformat}
> // Validate ordering policy
> if (queueConverter.isDrfPolicyUsedOnQueueLevel()) {
>   if (queueConverter.isFifoOrFairSharePolicyUsed()) {
> throw new ConversionException(
> "DRF ordering policy cannot be used together with fifo/fair");
>   } else {
> capacitySchedulerConfig.set(
> CapacitySchedulerConfiguration.RESOURCE_CALCULATOR_CLASS,
> DominantResourceCalculator.class.getCanonicalName());
>   }
> }
> {noformat}
> It's also misleading, because Fair policy can be used under DRF, so the error 
> message is incorrect.
> Let's remove these checks and rewrite the converter in a way that generates 
> a valid config even if fair/drf is somehow mixed.
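
A hedged sketch of what the relaxed logic could look like, reusing the names 
from the snippet above (a sketch only, not the attached patch):

{code:java}
// Sketch only: keep the DRF handling, drop the rejection. Fair policy is
// legal under DRF, so no ConversionException is thrown anymore.
if (queueConverter.isDrfPolicyUsedOnQueueLevel()) {
  capacitySchedulerConfig.set(
      CapacitySchedulerConfiguration.RESOURCE_CALCULATOR_CLASS,
      DominantResourceCalculator.class.getCanonicalName());
}
{code}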



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10022) Create RM Rest API to validate a CapacityScheduler Configuration

2020-01-20 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016697#comment-17016697
 ] 

Prabhu Joseph edited comment on YARN-10022 at 1/20/20 11:16 AM:


Thanks [~kmarton] for the patch. I have a few comments:

1. CapacityScheduler#reinitialize replaces the call to validateConf() with the 
code below, which is not required.
{code:java}
CapacitySchedulerConfigValidator.validateMemoryAllocation(this.conf);
CapacitySchedulerConfigValidator.validateVCores(this.conf);
{code}
 

2. In CapacityScheduler#reinitialize, distinguishRuleSet returned by 
validatePlacementRules can be used.
{code:java}
+CapacitySchedulerConfigValidator
+.validatePlacementRules(placementRuleStrs);
+Set<String> distinguishRuleSet = new HashSet<>(placementRuleStrs);
{code}
 

3. In CapacitySchedulerQueueManager, there is a typo
{code:java}
-// When failing over, if using configuration store, don't validate queue
+// When failing over, if using configuration store, don't validate queueR
{code}
 

4. In RMWSConsts, the method name is validateAndGetSchedulerConfiguration, so 
the javadoc below should reference that:
{code:java}
+  /** Path for {@code RMWebServiceProtocol#validateCapacitySchedulerConfig}. */
{code}
 

5. In RMWebServices, it has to be initForWritableEndpoints. Only the admin 
user is allowed to read the scheduler conf, in order to avoid leaking 
sensitive info such as ACLs. Reference: 
RMWebServices#getSchedulerConfiguration()
{code:java}
+initForReadableEndpoints();
{code}
 

6. The lines of code below are not straightforward:
{code:java}
+Configuration config = new Configuration(false);
+rm.getRMContext().getRMAdminService().getConfiguration(config,
+YarnConfiguration.CS_CONFIGURATION_FILE);
+MutableCSConfigurationProvider provider
+= new MutableCSConfigurationProvider(null);
+
+CapacitySchedulerConfiguration capacitySchedulerConfig =
+new CapacitySchedulerConfiguration(config, false);
+Configuration newConfig = 
provider.applyChanges(capacitySchedulerConfig,
+mutationInfo);
{code}
can be replaced with something similar to the one in 
RMWebServices#getSchedulerConfiguration(), like below:
{code:java}
  MutableConfigurationProvider mutableConfigurationProvider =
  ((MutableConfScheduler) scheduler).getMutableConfProvider();
  Configuration schedulerConf = mutableConfigurationProvider
.getConfiguration();
  Configuration newConfig = 
  mutableConfigurationProvider.applyChanges(schedulerConf, 
mutationInfo);
{code}

* With the above, the change in AdminService.java is not required.

* getConfiguration() has to be added to the 
MutableConfigurationProvider interface.
 

7. The "CS" in the error message can be expanded to "CapacityScheduler":
{code:java}
+  String errorMsg = "CS configuration validation failed: "
+  + e.toString();
{code}
 

8. The error message is not added to the error response. Change
{code:java}
+  return Response.status(Status.BAD_REQUEST)
+  .build();
{code}
to
{code:java}
return Response.status(Status.BAD_REQUEST).entity(errorMsg)
.build();
{code}
 

9. The error message below is wrong:
{code:java}
+  String errorMsg = "Configuration change only supported by " +
+  "MutableConfScheduler.";
{code}


was (Author: prabhu joseph):
Thanks [~kmarton] for the patch. I have a few comments:

1. CapacityScheduler#reinitialize replaces the call to validateConf() with the 
code below, which is not required.
{code:java}
CapacitySchedulerConfigValidator.validateMemoryAllocation(this.conf);
CapacitySchedulerConfigValidator.validateVCores(this.conf);
{code}
 

2. In CapacityScheduler#reinitialize, distinguishRuleSet returned by 
validatePlacementRules can be used.
{code:java}
+CapacitySchedulerConfigValidator
+.validatePlacementRules(placementRuleStrs);
+Set<String> distinguishRuleSet = new HashSet<>(placementRuleStrs);
{code}
 

3. In CapacitySchedulerQueueManager, there is a typo
{code:java}
-// When failing over, if using configuration store, don't validate queue
+// When failing over, if using configuration store, don't validate queueR
{code}
 

4. In RMWSConsts, the method name is validateAndGetSchedulerConfiguration, so 
the javadoc below should reference that:
{code:java}
+  /** Path for {@code RMWebServiceProtocol#validateCapacitySchedulerConfig}. */
{code}
 

5. In RMWebServices, it has to be initForWritableEndpoints. Only the admin 
user is allowed to read the scheduler conf, in order to avoid leaking 
sensitive info such as ACLs. Reference: 
RMWebServices#getSchedulerConfiguration()
{code:java}
+initForReadableEndpoints();
{code}
 

6. The lines of code below are not straightforward:
{code:java}
+Configuration config = new Configuration(false);
+rm.getRMContext().getRMAdminService().getConfiguration(config,
+YarnConfiguration.CS_CONFIGURATION_FILE);
+

[jira] [Commented] (YARN-9462) TestResourceTrackerService.testNodeRemovalGracefully fails sporadically

2020-01-20 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019324#comment-17019324
 ] 

Prabhu Joseph commented on YARN-9462:
-

Thanks [~snemeth], I have submitted [^YARN-9462-branch-3.2.001.patch] for 
branch-3.2.

> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> ---
>
> Key: YARN-9462
> URL: https://issues.apache.org/jira/browse/YARN-9462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: 
> TestResourceTrackerService.testNodeRemovalGracefully.txt, 
> YARN-9462-001.patch, YARN-9462-branch-3.2.001.patch
>
>
> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> {code}
> [ERROR] 
> testNodeRemovalGracefully(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService)
>   Time elapsed: 3.385 s  <<< FAILURE!
> java.lang.AssertionError: Shutdown nodes should be 0 now expected:<1> but 
> was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtilDecomToUntracked(TestResourceTrackerService.java:2318)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtil(TestResourceTrackerService.java:2280)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalGracefully(TestResourceTrackerService.java:2133)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
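
For context, a common way to de-flake an assertion like this is to poll for 
the expected state instead of asserting immediately. A minimal sketch using 
org.apache.hadoop.test.GenericTestUtils; whether the attached patch takes this 
approach is not shown here, and clusterMetrics is an assumed handle to 
ClusterMetrics:

{code:java}
// Sketch only: wait up to 10s for the shutdown-NM count to reach 0
// instead of asserting right after the decommission step.
GenericTestUtils.waitFor(
    () -> clusterMetrics.getNumShutdownNMs() == 0,
    100,     // re-check every 100 ms
    10000);  // time out after 10 s
{code}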



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9462) TestResourceTrackerService.testNodeRemovalGracefully fails sporadically

2020-01-20 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019321#comment-17019321
 ] 

Hudson commented on YARN-9462:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17876 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17876/])
YARN-9462. TestResourceTrackerService.testNodeRemovalGracefully fails (snemeth: 
rev 8b3ee2f7e9f10073a77d53eba4a6151aaadc6191)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java


> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> ---
>
> Key: YARN-9462
> URL: https://issues.apache.org/jira/browse/YARN-9462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: 
> TestResourceTrackerService.testNodeRemovalGracefully.txt, 
> YARN-9462-001.patch, YARN-9462-branch-3.2.001.patch
>
>
> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> {code}
> [ERROR] 
> testNodeRemovalGracefully(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService)
>   Time elapsed: 3.385 s  <<< FAILURE!
> java.lang.AssertionError: Shutdown nodes should be 0 now expected:<1> but 
> was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtilDecomToUntracked(TestResourceTrackerService.java:2318)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtil(TestResourceTrackerService.java:2280)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalGracefully(TestResourceTrackerService.java:2133)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9462) TestResourceTrackerService.testNodeRemovalGracefully fails sporadically

2020-01-20 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9462:

Attachment: YARN-9462-branch-3.2.001.patch

> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> ---
>
> Key: YARN-9462
> URL: https://issues.apache.org/jira/browse/YARN-9462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: 
> TestResourceTrackerService.testNodeRemovalGracefully.txt, 
> YARN-9462-001.patch, YARN-9462-branch-3.2.001.patch
>
>
> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> {code}
> [ERROR] 
> testNodeRemovalGracefully(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService)
>   Time elapsed: 3.385 s  <<< FAILURE!
> java.lang.AssertionError: Shutdown nodes should be 0 now expected:<1> but 
> was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtilDecomToUntracked(TestResourceTrackerService.java:2318)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtil(TestResourceTrackerService.java:2280)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalGracefully(TestResourceTrackerService.java:2133)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10081) Exception message from ClientRMProxy#getRMAddress is misleading

2020-01-20 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019311#comment-17019311
 ] 

Hudson commented on YARN-10081:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17875 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17875/])
YARN-10081. Exception message from ClientRMProxy#getRMAddress is (snemeth: rev 
57aad0f43aa34d1e522e21cdb1debf73db9f2bdc)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java


> Exception message from ClientRMProxy#getRMAddress is misleading
> ---
>
> Key: YARN-10081
> URL: https://issues.apache.org/jira/browse/YARN-10081
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Ravuri Sushma sree
>Priority: Trivial
> Fix For: 3.3.0
>
> Attachments: YARN-10081.001.patch
>
>
> In {{ClientRMProxy#getRMAddress}}, the else branch contains the following 
> piece of code:
> {code:java}
> } else {
>   String message = "Unsupported protocol found when creating the proxy " +
>   "connection to ResourceManager: " +
>   ((protocol != null) ? protocol.getClass().getName() : "null");
>   LOG.error(message);
>   throw new IllegalStateException(message);
> }
> {code}
> This is wrong, because the protocol variable is already of type "Class", so 
> {{protocol.getClass().getName()}} will always be {{java.lang.Class}}. It 
> should be {{protocol.getName()}}. 
> An example of the error message if {{RMProxy}} is misused, and this exception 
> is thrown:
> {noformat}
> java.lang.IllegalStateException: Unsupported protocol found when creating the 
> proxy connection to ResourceManager: java.lang.Class
>   at 
> org.apache.hadoop.yarn.client.ClientRMProxy.getRMAddress(ClientRMProxy.java:109)
>   at 
> org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:133)
> ...
> {noformat}
> where obviously it was not {{Object.class}} that was provided to this 
> function as the protocol parameter.
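
With the fix suggested above, the branch would read (sketched with the same 
variables as in the snippet):

{code:java}
} else {
  // protocol is already a Class<?>, so query its name directly
  String message = "Unsupported protocol found when creating the proxy " +
      "connection to ResourceManager: " +
      ((protocol != null) ? protocol.getName() : "null");
  LOG.error(message);
  throw new IllegalStateException(message);
}
{code}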



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8148) Update decimal values for queue capacities shown on queue status CLI

2020-01-20 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019309#comment-17019309
 ] 

Prabhu Joseph commented on YARN-8148:
-

Thanks [~snemeth].

> Update decimal values for queue capacities shown on queue status CLI
> 
>
> Key: YARN-8148
> URL: https://issues.apache.org/jira/browse/YARN-8148
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-8148-002.patch, YARN-8148-002.patch, 
> YARN-8148-branch-3.1.001.patch, YARN-8148-branch-3.2.001.patch, 
> YARN-8148-branch-3.2.001.patch, YARN-8148.1.patch
>
>
> Capacities are shown with two decimal values in the RM UI as part of 
> YARN-6182. The queue status CLI is still showing only one decimal value.
> {code}
> [root@bigdata3 yarn]# yarn queue -status default
> Queue Information : 
> Queue Name : default
>   State : RUNNING
>   Capacity : 69.9%
>   Current Capacity : .0%
>   Maximum Capacity : 70.0%
>   Default Node Label expression : 
>   Accessible Node Labels : *
>   Preemption : enabled
> {code}
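
For reference, rendering the capacity with two decimals is a one-line format 
change; a minimal sketch (the exact format string used by the patch is an 
assumption):

{code:java}
// 69.9f is printed as "69.90%", matching the two-decimal RM UI style
System.out.println(String.format("Capacity : %.2f%%", 69.9f));
{code}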



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9462) TestResourceTrackerService.testNodeRemovalGracefully fails sporadically

2020-01-20 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9462:
-
Fix Version/s: 3.3.0

> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> ---
>
> Key: YARN-9462
> URL: https://issues.apache.org/jira/browse/YARN-9462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: 
> TestResourceTrackerService.testNodeRemovalGracefully.txt, YARN-9462-001.patch
>
>
> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> {code}
> [ERROR] 
> testNodeRemovalGracefully(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService)
>   Time elapsed: 3.385 s  <<< FAILURE!
> java.lang.AssertionError: Shutdown nodes should be 0 now expected:<1> but 
> was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtilDecomToUntracked(TestResourceTrackerService.java:2318)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtil(TestResourceTrackerService.java:2280)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalGracefully(TestResourceTrackerService.java:2133)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
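> A sketch of how this kind of sporadic assertion is typically hardened: poll 
> the metric instead of asserting once. This assumes {{GenericTestUtils}} and 
> the {{ClusterMetrics}} shutdown-NM gauge, and is illustrative only, not 
> necessarily what the attached patch does:
> {code:java}
> import org.apache.hadoop.test.GenericTestUtils;
> import org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics;
> 
> // Wait up to 5 seconds for the shutdown-NM count to reach 0 instead of
> // asserting immediately, which races with the node state transition.
> GenericTestUtils.waitFor(
>     () -> ClusterMetrics.getMetrics().getNumShutdownNMs() == 0,
>     100, 5000);
> {code}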



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9462) TestResourceTrackerService.testNodeRemovalGracefully fails sporadically

2020-01-20 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019306#comment-17019306
 ] 

Szilard Nemeth commented on YARN-9462:
--

Hi [~prabhujoseph], 
Makes sense to me. Just committed to trunk.
Since the affected version is set to 3.2.0, I'll ask the obvious: I think we 
need a backport patch for branch-3.2. Could you please upload it? 

Thanks.

> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> ---
>
> Key: YARN-9462
> URL: https://issues.apache.org/jira/browse/YARN-9462
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: 
> TestResourceTrackerService.testNodeRemovalGracefully.txt, YARN-9462-001.patch
>
>
> TestResourceTrackerService.testNodeRemovalGracefully fails sporadically
> {code}
> [ERROR] 
> testNodeRemovalGracefully(org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService)
>   Time elapsed: 3.385 s  <<< FAILURE!
> java.lang.AssertionError: Shutdown nodes should be 0 now expected:<1> but 
> was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:834)
>   at org.junit.Assert.assertEquals(Assert.java:645)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtilDecomToUntracked(TestResourceTrackerService.java:2318)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalUtil(TestResourceTrackerService.java:2280)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService.testNodeRemovalGracefully(TestResourceTrackerService.java:2133)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8148) Update decimal values for queue capacities shown on queue status CLI

2020-01-20 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019303#comment-17019303
 ] 

Hudson commented on YARN-8148:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17874 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17874/])
YARN-8148. Update decimal values for queue capacities shown on queue (snemeth: 
rev 14d0f9a775086b2f0d174818c00c118c11f0c2b6)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/QueueCLI.java


> Update decimal values for queue capacities shown on queue status CLI
> 
>
> Key: YARN-8148
> URL: https://issues.apache.org/jira/browse/YARN-8148
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-8148-002.patch, YARN-8148-002.patch, 
> YARN-8148-branch-3.1.001.patch, YARN-8148-branch-3.2.001.patch, 
> YARN-8148-branch-3.2.001.patch, YARN-8148.1.patch
>
>
> Capacities are shown with two decimal places in the RM UI as part of 
> YARN-6182. The queue status CLI is still showing only one decimal place.
> {code}
> [root@bigdata3 yarn]# yarn queue -status default
> Queue Information : 
> Queue Name : default
>   State : RUNNING
>   Capacity : 69.9%
>   Current Capacity : .0%
>   Maximum Capacity : 70.0%
>   Default Node Label expression : 
>   Accessible Node Labels : *
>   Preemption : enabled
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10095) Fix help message for yarn rmadmin

2020-01-20 Thread yehuanhuan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019300#comment-17019300
 ] 

yehuanhuan commented on YARN-10095:
---

Hi Xieming Li, I found that using the command "yarn rmadmin -help 
-refreshNodes" (with the leading dash) solves the problem.
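If that is the behaviour, the dash-prefixed form prints the single-command 
usage directly (output assumed from the expected result quoted below):
{code}
$ yarn rmadmin -help -refreshNodes
 -refreshNodes [-g|graceful [timeout in seconds] -client|server]
{code}
This suggests the per-command usage lookup keys on the dash-prefixed token, 
so the bare "refreshNodes" form falls through to the full generic usage.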

> Fix help message for yarn rmadmin
> -
>
> Key: YARN-10095
> URL: https://issues.apache.org/jira/browse/YARN-10095
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Minor
>
> (This issue was identified by [~aajisaka] in 
> https://issues.apache.org/jira/browse/HADOOP-16753)
> The help message of yarn rmadmin seems broken.
> Current:  
> {code:java}
> $ yarn rmadmin -help refreshNodes 2>/dev/null
> $
> $ yarn rmadmin -help refreshNodes
> Usage: yarn rmadmin
>-refreshQueues
>-refreshNodes [-g|graceful [timeout in seconds] -client|server]
>-refreshNodesResources
>-refreshSuperUserGroupsConfiguration
>-refreshUserToGroupsMappings
>-refreshAdminAcls
>-refreshServiceAcl
>-getGroups [username]
>-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">
>-removeFromClusterNodeLabels  (label splitted by ",")
>-replaceLabelsOnNode <"node1[:port]=label1,label2 
> node2[:port]=label1,label2"> [-failOnUnknownNodes]
>-directlyAccessNodeLabelStore
>-refreshClusterMaxPriority
>-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
> or
> [NodeID] [resourcetypes] ([OvercommitTimeout]).
>-help [cmd]
> Generic options supported are:
> -conf specify an application configuration file
> -Ddefine a value for a given property
> -fs  specify default filesystem URL to use, 
> overrides 'fs.defaultFS' property from configurations.
> -jt   specify a ResourceManager
> -files specify a comma-separated list of files to 
> be copied to the map reduce cluster
> -libjarsspecify a comma-separated list of jar files 
> to be included in the classpath
> -archives   specify a comma-separated list of archives 
> to be unarchived on the compute machines
> The general command line syntax is:
> command [genericOptions] [commandOptions]
> {code}
>  
>  
> Expected: 
> {code:java}
> $ yarn rmadmin -help refreshNodes 2>/dev/null
>  -refreshNodes [-g|graceful [timeout in seconds] -client|server]
> $ yarn rmadmin -help refreshNodes
>  -refreshNodes [-g|graceful [timeout in seconds] -client|server]
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10081) Exception message from ClientRMProxy#getRMAddress is misleading

2020-01-20 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019302#comment-17019302
 ] 

Szilard Nemeth commented on YARN-10081:
---

Hi [~Sushma_28], 
Makes sense, patch looks good to me.
Committed to trunk.
Thanks [~adam.antal] for the review.

> Exception message from ClientRMProxy#getRMAddress is misleading
> ---
>
> Key: YARN-10081
> URL: https://issues.apache.org/jira/browse/YARN-10081
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Ravuri Sushma sree
>Priority: Trivial
> Attachments: YARN-10081.001.patch
>
>
> In {{ClientRMProxy#getRMAddress}} in the else branch we have the following 
> piece of code.
> {code:java}
> } else {
>   String message = "Unsupported protocol found when creating the proxy " +
>   "connection to ResourceManager: " +
>   ((protocol != null) ? protocol.getClass().getName() : "null");
>   LOG.error(message);
>   throw new IllegalStateException(message);
> }
> {code}
> This is wrong: the {{protocol}} variable is of type {{Class}}, so 
> {{protocol.getClass().getName()}} always returns {{java.lang.Class}}. It 
> should be {{protocol.getName()}} instead.
> An example of the error message when {{RMProxy}} is misused and this 
> exception is thrown:
> {noformat}
> java.lang.IllegalStateException: Unsupported protocol found when creating the 
> proxy connection to ResourceManager: java.lang.Class
>   at 
> org.apache.hadoop.yarn.client.ClientRMProxy.getRMAddress(ClientRMProxy.java:109)
>   at 
> org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:133)
> ...
> {noformat}
> where obviously {{java.lang.Class}} was not the protocol actually provided 
> to this function.
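> A minimal sketch of the distinction (hypothetical demo class, not part of 
> the patch):
> {code:java}
> public class ProtocolNameDemo {
>   public static void main(String[] args) {
>     Class<?> protocol = Runnable.class;  // stand-in for a YARN protocol class
>     // getClass() on a Class object is always the Class of Class itself:
>     System.out.println(protocol.getClass().getName());  // java.lang.Class
>     // getName() is the name of the protocol the caller actually passed:
>     System.out.println(protocol.getName());             // java.lang.Runnable
>   }
> }
> {code}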



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-10081) Exception message from ClientRMProxy#getRMAddress is misleading

2020-01-20 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth resolved YARN-10081.
---
Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Exception message from ClientRMProxy#getRMAddress is misleading
> ---
>
> Key: YARN-10081
> URL: https://issues.apache.org/jira/browse/YARN-10081
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Ravuri Sushma sree
>Priority: Trivial
> Fix For: 3.3.0
>
> Attachments: YARN-10081.001.patch
>
>
> In {{ClientRMProxy#getRMAddress}} in the else branch we have the following 
> piece of code.
> {code:java}
> } else {
>   String message = "Unsupported protocol found when creating the proxy " +
>   "connection to ResourceManager: " +
>   ((protocol != null) ? protocol.getClass().getName() : "null");
>   LOG.error(message);
>   throw new IllegalStateException(message);
> }
> {code}
> This is wrong: the {{protocol}} variable is of type {{Class}}, so 
> {{protocol.getClass().getName()}} always returns {{java.lang.Class}}. It 
> should be {{protocol.getName()}} instead.
> An example of the error message when {{RMProxy}} is misused and this 
> exception is thrown:
> {noformat}
> java.lang.IllegalStateException: Unsupported protocol found when creating the 
> proxy connection to ResourceManager: java.lang.Class
>   at 
> org.apache.hadoop.yarn.client.ClientRMProxy.getRMAddress(ClientRMProxy.java:109)
>   at 
> org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:133)
> ...
> {noformat}
> where obviously {{java.lang.Class}} was not the protocol actually provided 
> to this function.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10096) Add zk based configuration provider for router

2020-01-20 Thread zhoukang (Jira)
zhoukang created YARN-10096:
---

 Summary: Add zk based configuration provider for router
 Key: YARN-10096
 URL: https://issues.apache.org/jira/browse/YARN-10096
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: router
Reporter: zhoukang
Assignee: zhoukang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8148) Update decimal values for queue capacities shown on queue status CLI

2020-01-20 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019298#comment-17019298
 ] 

Szilard Nemeth commented on YARN-8148:
--

Thanks [~prabhujoseph],
LGTM, pushed all commits to their respective branches.
Thanks [~sunilg] for the review.

> Update decimal values for queue capacities shown on queue status CLI
> 
>
> Key: YARN-8148
> URL: https://issues.apache.org/jira/browse/YARN-8148
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-8148-002.patch, YARN-8148-002.patch, 
> YARN-8148-branch-3.1.001.patch, YARN-8148-branch-3.2.001.patch, 
> YARN-8148-branch-3.2.001.patch, YARN-8148.1.patch
>
>
> Capacities are shown with two decimal places in the RM UI as part of 
> YARN-6182. The queue status CLI is still showing only one decimal place.
> {code}
> [root@bigdata3 yarn]# yarn queue -status default
> Queue Information : 
> Queue Name : default
>   State : RUNNING
>   Capacity : 69.9%
>   Current Capacity : .0%
>   Maximum Capacity : 70.0%
>   Default Node Label expression : 
>   Accessible Node Labels : *
>   Preemption : enabled
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8148) Update decimal values for queue capacities shown on queue status CLI

2020-01-20 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-8148:
-
Fix Version/s: 3.1.4
   3.2.2
   3.3.0

> Update decimal values for queue capacities shown on queue status CLI
> 
>
> Key: YARN-8148
> URL: https://issues.apache.org/jira/browse/YARN-8148
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-8148-002.patch, YARN-8148-002.patch, 
> YARN-8148-branch-3.1.001.patch, YARN-8148-branch-3.2.001.patch, 
> YARN-8148-branch-3.2.001.patch, YARN-8148.1.patch
>
>
> Capacities are shown with two decimal places in the RM UI as part of 
> YARN-6182. The queue status CLI is still showing only one decimal place.
> {code}
> [root@bigdata3 yarn]# yarn queue -status default
> Queue Information : 
> Queue Name : default
>   State : RUNNING
>   Capacity : 69.9%
>   Current Capacity : .0%
>   Maximum Capacity : 70.0%
>   Default Node Label expression : 
>   Accessible Node Labels : *
>   Preemption : enabled
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org