[jira] [Commented] (YARN-10310) YARN Service - User is able to launch a service with same name

2020-06-10 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130321#comment-17130321
 ] 

Eric Yang commented on YARN-10310:
--

[~BilwaST] Sorry, I am confused by your comment. Are you saying that after applying 
this patch, the hdfs/had...@hadoop.com principal generates an "already exists" 
error message?

> YARN Service - User is able to launch a service with same name
> --
>
> Key: YARN-10310
> URL: https://issues.apache.org/jira/browse/YARN-10310
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10310.001.patch
>
>
> ServiceClient uses UserGroupInformation.getCurrentUser().getUserName() to 
> get the user, whereas ClientRMService#submitApplication uses 
> UserGroupInformation.getCurrentUser().getShortUserName() to set the application 
> username.
> For a user named hdfs/had...@hadoop.com, the condition below in 
> ClientRMService#getApplications() therefore fails:
> {code:java}
> if (users != null && !users.isEmpty() &&
>     !users.contains(application.getUser())) {
>   continue;
> }
> {code}
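
A minimal, self-contained illustration of the mismatch described above (not part of 
the patch; the principal, host and realm below are made up): the service client 
filters with the full principal returned by getUserName(), while the RM stored the 
short name from getShortUserName(), so the contains() check never matches.

{code:java}
import java.util.Set;

public class UserFilterMismatch {
  public static void main(String[] args) {
    // What ServiceClient sends: UGI.getUserName(), i.e. the full principal.
    Set<String> users = Set.of("hdfs/host1.example.com@EXAMPLE.COM");
    // What ClientRMService stored: UGI.getShortUserName(), the auth_to_local short name.
    String applicationUser = "hdfs";

    // Mirrors the quoted condition in ClientRMService#getApplications():
    // the application is skipped because the two representations differ.
    if (!users.isEmpty() && !users.contains(applicationUser)) {
      System.out.println("Application filtered out: '" + applicationUser
          + "' is not in " + users);
    }
  }
}
{code}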



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10292) FS-CS converter: add an option to enable asynchronous scheduling in CapacityScheduler

2020-06-10 Thread Benjamin Teke (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Teke updated YARN-10292:
-
Attachment: YARN-10292.003.branch-3.3.patch

> FS-CS converter: add an option to enable asynchronous scheduling in 
> CapacityScheduler
> -
>
> Key: YARN-10292
> URL: https://issues.apache.org/jira/browse/YARN-10292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10292.001.patch, YARN-10292.002.branch-3.3.patch, 
> YARN-10292.002.patch, YARN-10292.003.branch-3.3.patch
>
>
> FS doesn't have an equivalent setting to the CapacityScheduler's 
> yarn.scheduler.capacity.schedule-asynchronously.enable option so the FS to CS 
> converter won't add this to the yarn-site.xml. An optional command line 
> switch should be added to support this option during migration.
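
A rough sketch of what the proposed command-line switch could end up doing (the 
helper name and flag handling are assumptions, not the actual converter code): when 
the switch is given, the converter sets the CapacityScheduler property in the 
generated yarn-site.xml configuration.

{code:java}
import org.apache.hadoop.conf.Configuration;

public class AsyncSchedulingSwitchSketch {
  // The CapacityScheduler property named in the description.
  static final String CS_ASYNC_ENABLE =
      "yarn.scheduler.capacity.schedule-asynchronously.enable";

  // Hypothetical helper: called by the converter when the optional switch is given.
  static void applyAsyncSchedulingSwitch(Configuration convertedYarnSite,
      boolean enableAsyncScheduling) {
    if (enableAsyncScheduling) {
      convertedYarnSite.setBoolean(CS_ASYNC_ENABLE, true);
    }
  }
}
{code}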



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10279) Avoid unnecessary QueueMappingEntity creations

2020-06-10 Thread Jira


[ 
https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130778#comment-17130778
 ] 

Hudáky Márton Gyula commented on YARN-10279:


This patch is applied on top of the patch in 
https://issues.apache.org/jira/browse/YARN-10281.
We will retest it after that one is merged.

> Avoid unnecessary QueueMappingEntity creations
> --
>
> Key: YARN-10279
> URL: https://issues.apache.org/jira/browse/YARN-10279
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Hudáky Márton Gyula
>Priority: Minor
> Attachments: YARN-10279.001.patch
>
>
> In the CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes 
> we create new instances of the QueueMappingEntity class. In some cases we simply 
> copy an instance we already received, so we just duplicate it, which is 
> unnecessary since the class is immutable.
> This is just a minor improvement and probably doesn't have much impact, but it 
> still puts some unnecessary load on the GC.
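
A small illustrative sketch of the allocation being avoided; the simplified 
QueueMappingEntity below is hypothetical and only mirrors the immutability argument.

{code:java}
// Illustrative only; the real QueueMappingEntity has more fields and accessors.
final class QueueMappingEntity {
  private final String source;
  private final String queue;

  QueueMappingEntity(String source, String queue) {
    this.source = source;
    this.queue = queue;
  }

  String getSource() { return source; }
  String getQueue() { return queue; }
}

class MappingReuseSketch {
  // Pattern the jira wants to avoid: a needless copy of an immutable object.
  QueueMappingEntity withCopy(QueueMappingEntity received) {
    return new QueueMappingEntity(received.getSource(), received.getQueue());
  }

  // Proposed pattern: reuse the received instance, saving an allocation.
  QueueMappingEntity withoutCopy(QueueMappingEntity received) {
    return received;
  }
}
{code}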



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10292) FS-CS converter: add an option to enable asynchronous scheduling in CapacityScheduler

2020-06-10 Thread Benjamin Teke (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Teke updated YARN-10292:
-
Attachment: YARN-10292.003.branch-3.3.patch

> FS-CS converter: add an option to enable asynchronous scheduling in 
> CapacityScheduler
> -
>
> Key: YARN-10292
> URL: https://issues.apache.org/jira/browse/YARN-10292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10292.001.patch, YARN-10292.002.branch-3.3.patch, 
> YARN-10292.002.patch, YARN-10292.003.branch-3.3.patch
>
>
> FS doesn't have an equivalent setting to the CapacityScheduler's 
> yarn.scheduler.capacity.schedule-asynchronously.enable option so the FS to CS 
> converter won't add this to the yarn-site.xml. An optional command line 
> switch should be added to support this option during migration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10292) FS-CS converter: add an option to enable asynchronous scheduling in CapacityScheduler

2020-06-10 Thread Benjamin Teke (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Teke updated YARN-10292:
-
Attachment: (was: YARN-10292.003.branch-3.3.patch)

> FS-CS converter: add an option to enable asynchronous scheduling in 
> CapacityScheduler
> -
>
> Key: YARN-10292
> URL: https://issues.apache.org/jira/browse/YARN-10292
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10292.001.patch, YARN-10292.002.branch-3.3.patch, 
> YARN-10292.002.patch
>
>
> FS doesn't have an equivalent setting to the CapacityScheduler's 
> yarn.scheduler.capacity.schedule-asynchronously.enable option so the FS to CS 
> converter won't add this to the yarn-site.xml. An optional command line 
> switch should be added to support this option during migration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10310) YARN Service - User is able to launch a service with same name

2020-06-10 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130360#comment-17130360
 ] 

Bilwa S T commented on YARN-10310:
--

Hi [~eyang], I meant the patch is working fine; it is working as expected.

> YARN Service - User is able to launch a service with same name
> --
>
> Key: YARN-10310
> URL: https://issues.apache.org/jira/browse/YARN-10310
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10310.001.patch
>
>
> ServiceClient uses UserGroupInformation.getCurrentUser().getUserName() to 
> get the user, whereas ClientRMService#submitApplication uses 
> UserGroupInformation.getCurrentUser().getShortUserName() to set the application 
> username.
> For a user named hdfs/had...@hadoop.com, the condition below in 
> ClientRMService#getApplications() therefore fails:
> {code:java}
> if (users != null && !users.isEmpty() &&
>     !users.contains(application.getUser())) {
>   continue;
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9641) ApplicationMasterService#AllocateResponseLock need to hold only ResponseID

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-9641:
---

Assignee: (was: Bilwa S T)

> ApplicationMasterService#AllocateResponseLock  need to hold only ResponseID
> ---
>
> Key: YARN-9641
> URL: https://issues.apache.org/jira/browse/YARN-9641
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin Chundatt
>Priority: Major
>
> The ApplicationMasterService AllocateResponseLock could occupy too much memory:
> # 5K+ node cluster
> # The cached AllocateResponse could contain NodeReports for 5K nodes plus the 
> allocated-container data.
> Only the responseId is used in ApplicationMasterService, so there is no need to 
> hold all of that data.
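
A schematic sketch (not the actual ApplicationMasterService code) contrasting 
holding the whole response with holding only the responseId, which is what the jira 
proposes:

{code:java}
class AllocateResponseLockSketch {
  // Today: the lock object keeps the whole AllocateResponse alive, including
  // NodeReports and allocated-container data for thousands of nodes.
  static final class HeavyLock {
    private Object lastResponse; // stands in for AllocateResponse

    synchronized void setResponse(Object response) { this.lastResponse = response; }
  }

  // Proposal: keep only the responseId, which is all the service actually needs.
  static final class LightLock {
    private int lastResponseId;

    synchronized void setResponseId(int responseId) { this.lastResponseId = responseId; }
    synchronized int getResponseId() { return lastResponseId; }
  }
}
{code}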



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8671) Container Launch failed stating "TaskAttempt killed because it ran on unusable node , Container released on a *lost* node"

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-8671:
---

Assignee: (was: Bilwa S T)

> Container Launch failed stating "TaskAttempt killed because it ran on 
> unusable node , Container released on a *lost* node"
> --
>
> Key: YARN-8671
> URL: https://issues.apache.org/jira/browse/YARN-8671
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.1.1
>Reporter: Akshay Agarwal
>Priority: Major
>
> Pre-requisites:
> {code:java}
> 1. Install HA cluster.
> 2. Set yarn.nodemanager.opportunistic-containers-max-queue-length=(positive 
> integer value) [NodeManager -> yarn-site.xml]
> 3. Set yarn.resourcemanager.opportunistic-container-allocation.enabled=true 
> [ResourceManager -> yarn-site.xml]
> {code}
>  
> Steps to reproduce:
> {code:java}
> 1. Keep all NodeManagers up.
> 2. Stop 2 NodeManagers and immediately follow step 3.
> 3. Submit a job with -Dmapreduce.job.num-opportunistic-maps-percent="100" and 
> run it with 50 mappers.
> {code}
> Expected Result:
> {code:java}
> Job should be successful {code}
> Actual Result:
> {code:java}
> The job succeeds, but some containers fail stating "TaskAttempt killed because 
> it ran on unusable node, Container released on a *lost* node".
> {code}
>  
> Log Details:
> {code:java}
> TaskAttempt killed because it ran on unusable node Container released on a 
> *lost* node Container launch failed for 
> container_1534149133116_0019_01_06 : java.net.ConnectException: Call From 
> hostname/IP to hostname:portNumber failed on connection exception: 
> java.net.ConnectException: 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10266) Setting debug delay to a too high number will cause NM fail to start

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10266:


Assignee: (was: Bilwa S T)

> Setting debug delay to a too high number will cause NM fail to start
> 
>
> Key: YARN-10266
> URL: https://issues.apache.org/jira/browse/YARN-10266
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Priority: Trivial
>  Labels: newbie
>
> If I set an inappropriate number for 
> {{yarn.nodemanager.delete.debug-delay-sec}}, I'd rather have a functional NM 
> with some ERROR messages in the log stating that the feature has been disabled 
> due to an illegal argument than a failed NM.
> Stack trace:
> {noformat}
> java.lang.NumberFormatException: For input string: "999"
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:583)
>   at java.lang.Integer.parseInt(Integer.java:615)
>   at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:1509)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DeletionService.serviceInit(DeletionService.java:179)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016)
> {noformat}
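
One possible shape of the requested behaviour, as a hedged sketch rather than a 
proposed patch: catch the NumberFormatException around the config read and fall 
back to a default instead of failing NM startup.

{code:java}
import org.apache.hadoop.conf.Configuration;

class DebugDelaySketch {
  private static final String DEBUG_DELAY_KEY =
      "yarn.nodemanager.delete.debug-delay-sec";
  private static final int DEFAULT_DELAY_SEC = 0;

  // Hypothetical fallback: read the value defensively instead of letting the
  // NumberFormatException fail NM startup.
  static int readDebugDelay(Configuration conf) {
    try {
      return conf.getInt(DEBUG_DELAY_KEY, DEFAULT_DELAY_SEC);
    } catch (NumberFormatException e) {
      // e.g. a value larger than Integer.MAX_VALUE, as in the stack trace above
      System.err.println("Illegal " + DEBUG_DELAY_KEY
          + " value, disabling debug delay: " + e.getMessage());
      return DEFAULT_DELAY_SEC;
    }
  }
}
{code}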



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9618) NodeListManager event improvement

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-9618:
---

Assignee: (was: Bilwa S T)

> NodeListManager event improvement
> -
>
> Key: YARN-9618
> URL: https://issues.apache.org/jira/browse/YARN-9618
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin Chundatt
>Priority: Critical
>
> In the current implementation the NodeListManager events block the async 
> dispatcher and can crash the RM and slow down event processing.
> # On a cluster restart with 1K running apps, each node-usable event creates 1K 
> events; overall this could be 5K*1K events for a 5K-node cluster.
> # Event processing is blocked until the new events are added to the queue.
> Solution:
> # Add another async event handler, similar to the scheduler's.
> # Instead of adding events to the dispatcher directly, call the RMApp event handler.
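
A self-contained sketch of the proposed decoupling (plain Java with hypothetical 
names, not the actual YARN dispatcher API): node usability changes go onto their 
own queue and worker thread, so a burst of per-app events cannot block the central 
async dispatcher.

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class NodeEventFanOutSketch {
  private final BlockingQueue<Runnable> nodeEvents = new LinkedBlockingQueue<>();
  private final Thread worker = new Thread(this::drain, "node-event-handler");

  void start() {
    worker.setDaemon(true);
    worker.start();
  }

  // Called by the node-list manager; returns immediately instead of blocking.
  void nodeUsabilityChanged(Runnable perAppNotification) {
    nodeEvents.add(perAppNotification);
  }

  private void drain() {
    try {
      while (true) {
        nodeEvents.take().run(); // e.g. invoke the RMApp event handler directly
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
{code}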



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9048) Add znode hierarchy in Federation ZK State Store

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-9048:
---

Assignee: (was: Bilwa S T)

> Add znode hierarchy in Federation ZK State Store
> 
>
> Key: YARN-9048
> URL: https://issues.apache.org/jira/browse/YARN-9048
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin Chundatt
>Priority: Major
>
> Similar to YARN-2962, consider having a znode hierarchy in the ZK federation 
> store for applications.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5604) Add versioning for FederationStateStore

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-5604:
---

Assignee: (was: Bilwa S T)

> Add versioning for FederationStateStore
> ---
>
> Key: YARN-5604
> URL: https://issues.apache.org/jira/browse/YARN-5604
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Subramaniam Krishnan
>Priority: Major
>
> Currently we don't have versioning (a null version) for the 
> FederationStateStore. This JIRA proposes adding the versioning support that is 
> needed to support upgrades.
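
A sketch of the kind of version handshake the jira asks for; the names and the 
major/minor scheme are assumptions, not an actual FederationStateStore API.

{code:java}
class StateStoreVersionSketch {
  static final int CURRENT_MAJOR = 1;
  static final int CURRENT_MINOR = 0;

  // Called on store start-up: a null stored version means a fresh store.
  static void checkVersion(Integer storedMajor, Integer storedMinor) {
    if (storedMajor == null) {
      // fresh store: write CURRENT_MAJOR/CURRENT_MINOR into the store
      return;
    }
    if (storedMajor != CURRENT_MAJOR) {
      throw new IllegalStateException("Incompatible FederationStateStore version "
          + storedMajor + "." + storedMinor + ", expected major " + CURRENT_MAJOR);
    }
    // same major, different minor: compatible; optionally upgrade the stored version
  }
}
{code}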



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10133) For yarn Federation, there is no routeradmin command to manage

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10133:


Assignee: (was: Bilwa S T)

> For yarn Federation, there is no routeradmin command to manage
> --
>
> Key: YARN-10133
> URL: https://issues.apache.org/jira/browse/YARN-10133
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation, yarn
>Reporter: Sushanta Sen
>Priority: Major
>
> In a YARN federated cluster, there is no routeradmin command for administration, 
> whereas HDFS has the 'dfsrouteradmin' command.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10308) Update javadoc and variable names for keytab in yarn services as it supports filesystems other than hdfs and local file system

2020-06-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130581#comment-17130581
 ] 

Hadoop QA commented on YARN-10308:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 40s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
10s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 57s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
10s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 89m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26145/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10308 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13005352/YARN-10308.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux a99309615641 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / b735a777178 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/26145/testReport/ |
| Max. process+thread count | 776 (vs. ulimit of 5500) |
| modules 

[jira] [Assigned] (YARN-10131) In Federation,few yarn application commands do not work

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10131:


Assignee: (was: Bilwa S T)

> In Federation,few yarn application commands do not work
> ---
>
> Key: YARN-10131
> URL: https://issues.apache.org/jira/browse/YARN-10131
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: federation, yarn
>Reporter: Sushanta Sen
>Priority: Major
>
> In Federation, the below-mentioned yarn application commands do not work:
> ./yarn app -updatePriority 3 -appId 
> ./yarn app -changeQueue q1 -appId 
> ./yarn application -updateLifetime 40 -appId 
> All of the above prompt the same exception: "Code not implemented".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10304) Create an endpoint for remote application log directory path query

2020-06-10 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130651#comment-17130651
 ] 

Adam Antal commented on YARN-10304:
---

Thanks for the updated patch [~gandras]! Looks much better.

Some additional items:
- I think 
{{LogAggregationFileControllerFactory#getGroupedLogAggregationFileControllers}} 
is unnecessary. You can use 
{{LogAggregationFileControllerFactory#getConfiguredLogAggregationFileControllerList}},
 but you have to expose the name of the controller in 
{{LogAggregationFileController}}, preferably via a getter method for its name 
(this hasn't been implemented yet).
- Please add the {{//JAXB needs this}} comment to the public constructors of 
the DAO objects.
- In the {{testRemoteLogDir}} test case I think a separate {{Configuration}} 
object should be used. If the asserts fail, the configuration will not be 
restored and the other tests may be affected. Creating a separate configuration 
object is also better at separating the test suite's global configuration 
from this one. Also, you don't have to restore the original settings in that 
case, so it's less error-prone.

If these issues are fixed, we can push this forward.
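
A sketch of the two small changes requested above; the class bodies and names are 
illustrative assumptions, not the real Hadoop classes or the eventual patch.

{code:java}
// Sketch of the getter asked for on the controller side.
abstract class LogAggregationFileControllerSketch {
  protected String fileControllerName;

  public String getFileControllerName() {
    return fileControllerName;
  }
}

// Sketch of a DAO with the requested constructor comment.
class RemoteLogDirPathDao {
  private String fileController;
  private String path;

  public RemoteLogDirPathDao() {
    // JAXB needs this
  }

  public RemoteLogDirPathDao(String fileController, String path) {
    this.fileController = fileController;
    this.path = path;
  }
}
{code}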

> Create an endpoint for remote application log directory path query
> --
>
> Key: YARN-10304
> URL: https://issues.apache.org/jira/browse/YARN-10304
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Andras Gyori
>Assignee: Andras Gyori
>Priority: Minor
> Attachments: YARN-10304.001.patch, YARN-10304.002.patch, 
> YARN-10304.003.patch
>
>
> The logic of the aggregated log directory path determination (currently based 
> on configuration) is scattered around the codebase and duplicated multiple 
> times. Providing a separate class that creates the path for a specific 
> user allows for an abstraction over this logic. It could be used in place 
> of the previously duplicated logic; moreover, we could provide an endpoint 
> to query this path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10275) CapacityScheduler QueuePath object should be able to parse paths

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10275:


Assignee: (was: Bilwa S T)

> CapacityScheduler QueuePath object should be able to parse paths
> 
>
> Key: YARN-10275
> URL: https://issues.apache.org/jira/browse/YARN-10275
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Priority: Major
>
> Currently QueuePlacementRuleUtils has an extractQueuePath method, which is 
> used to split full paths into a parent path plus a leaf queue name. All 
> instances of QueuePath are created via this method, which suggests this 
> behaviour should be part of the QueuePath object.
> We should create a constructor that implements this logic and remove the 
> extractQueuePath method.
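
A self-contained sketch of the parsing constructor being proposed; the real 
QueuePath class and its field names may differ.

{code:java}
final class QueuePathSketch {
  private final String parent;
  private final String leafName;

  // Parses a full path such as "root.a.b" into parent "root.a" and leaf "b".
  QueuePathSketch(String fullPath) {
    int lastDot = fullPath.lastIndexOf('.');
    if (lastDot < 0) {
      this.parent = null;          // a root-level queue has no parent path
      this.leafName = fullPath;
    } else {
      this.parent = fullPath.substring(0, lastDot);
      this.leafName = fullPath.substring(lastDot + 1);
    }
  }

  String getParent() { return parent; }
  String getLeafName() { return leafName; }
}
{code}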



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10121) In Federation executing yarn queue status command throws an exception

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10121:


Assignee: (was: Bilwa S T)

> In Federation executing yarn queue status command throws an exception
> -
>
> Key: YARN-10121
> URL: https://issues.apache.org/jira/browse/YARN-10121
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation, yarn
>Reporter: Sushanta Sen
>Priority: Major
>
> yarn queue status is failing, prompting an error 
> “org.apache.commons.lang.NotImplementedException: Code is not implemented”.
> {noformat}
>  ./yarn queue -status default
> Exception in thread "main" org.apache.commons.lang.NotImplementedException: 
> Code is not implemented
> at 
> org.apache.hadoop.yarn.server.router.clientrm.FederationClientInterceptor.getQueueInfo(FederationClientInterceptor.java:715)
> at 
> org.apache.hadoop.yarn.server.router.clientrm.RouterClientRMService.getQueueInfo(RouterClientRMService.java:246)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:328)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:591)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2793)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
> at 
> org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
> at 
> org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
> at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getQueueInfo(ApplicationClientProtocolPBClientImpl.java:341)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy8.getQueueInfo(Unknown Source)
> at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getQueueInfo(YarnClientImpl.java:650)
> at 
> org.apache.hadoop.yarn.client.cli.QueueCLI.listQueue(QueueCLI.java:111)
> at org.apache.hadoop.yarn.client.cli.QueueCLI.run(QueueCLI.java:78)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at org.apache.hadoop.yarn.client.cli.QueueCLI.main(QueueCLI.java:50)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10111) In Federation cluster Distributed Shell Application submission fails as YarnClient#getQueueInfo is not implemented

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10111:


Assignee: (was: Bilwa S T)

> In Federation cluster Distributed Shell Application submission fails as 
> YarnClient#getQueueInfo is not implemented
> --
>
> Key: YARN-10111
> URL: https://issues.apache.org/jira/browse/YARN-10111
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sushanta Sen
>Priority: Blocker
>
> In Federation cluster Distributed Shell Application submission fails as 
> YarnClient#getQueueInfo is not implemented.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10304) Create an endpoint for remote application log directory path query

2020-06-10 Thread Andras Gyori (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Gyori updated YARN-10304:

Attachment: YARN-10304.003.patch

> Create an endpoint for remote application log directory path query
> --
>
> Key: YARN-10304
> URL: https://issues.apache.org/jira/browse/YARN-10304
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Andras Gyori
>Assignee: Andras Gyori
>Priority: Minor
> Attachments: YARN-10304.001.patch, YARN-10304.002.patch, 
> YARN-10304.003.patch
>
>
> The logic of the aggregated log directory path determination (currently based 
> on configuration) is scattered around the codebase and duplicated multiple 
> times. Providing a separate class that creates the path for a specific 
> user allows for an abstraction over this logic. It could be used in place 
> of the previously duplicated logic; moreover, we could provide an endpoint 
> to query this path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9460) QueueACLsManager and ReservationsACLManager should not use instanceof checks

2020-06-10 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130411#comment-17130411
 ] 

Peter Bacsko commented on YARN-9460:


Thanks for the new patch [~BilwaST]. There are some checkstyle 
"VisibilityModifier" complaints. Please validate if those are legitimate or not 
(I can see that they're package private). If they're accessed in the descendant 
classes, make them protected. If package private is legit, then add 
{{@SuppressWarnings("checkstyle:visibilitymodifier")}} to the class.
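
For reference, a tiny hypothetical example of where the suggested suppression would 
go if the package-private fields are intentional; the class and field names are 
made up.

{code:java}
@SuppressWarnings("checkstyle:visibilitymodifier")
class AclManagerTestBase {
  // fields intentionally package-private so tests in the same package can use them
  String queueName;
  boolean aclEnabled;
}
{code}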

> QueueACLsManager and ReservationsACLManager should not use instanceof checks
> 
>
> Key: YARN-9460
> URL: https://issues.apache.org/jira/browse/YARN-9460
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9460.001.patch, YARN-9460.002.patch, 
> YARN-9460.003.patch, YARN-9460.004.patch
>
>
> QueueACLsManager and ReservationsACLManager should not use instanceof checks 
> for the scheduler type.
> Rather, we should abstract this into two classes: Capacity and Fair variants 
> of these ACL classes.
> QueueACLsManager and ReservationsACLManager could be abstract classes, but 
> the implementation is the decision of one who will work on this jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10279) Avoid unnecessary QueueMappingEntity creations

2020-06-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130439#comment-17130439
 ] 

Hadoop QA commented on YARN-10279:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} YARN-10279 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-10279 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13005338/YARN-10279.001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/26144/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.



> Avoid unnecessary QueueMappingEntity creations
> --
>
> Key: YARN-10279
> URL: https://issues.apache.org/jira/browse/YARN-10279
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Hudáky Márton Gyula
>Priority: Minor
> Attachments: YARN-10279.001.patch
>
>
> In the CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes 
> we create new instances of the QueueMappingEntity class. In some cases we simply 
> copy an instance we already received, so we just duplicate it, which is 
> unnecessary since the class is immutable.
> This is just a minor improvement and probably doesn't have much impact, but it 
> still puts some unnecessary load on the GC.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10310) YARN Service - User is able to launch a service with same name

2020-06-10 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130338#comment-17130338
 ] 

Bilwa S T commented on YARN-10310:
--

[~eyang] I tested after applying the patch, by launching a service with the same 
name twice. For the second launch this exception is thrown. Without the patch it 
never throws this exception.

> YARN Service - User is able to launch a service with same name
> --
>
> Key: YARN-10310
> URL: https://issues.apache.org/jira/browse/YARN-10310
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10310.001.patch
>
>
> ServiceClient uses UserGroupInformation.getCurrentUser().getUserName() to 
> get the user, whereas ClientRMService#submitApplication uses 
> UserGroupInformation.getCurrentUser().getShortUserName() to set the application 
> username.
> For a user named hdfs/had...@hadoop.com, the condition below in 
> ClientRMService#getApplications() therefore fails:
> {code:java}
> if (users != null && !users.isEmpty() &&
>     !users.contains(application.getUser())) {
>   continue;
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10304) Create an endpoint for remote application log directory path query

2020-06-10 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130335#comment-17130335
 ] 

Andras Gyori commented on YARN-10304:
-

Fixed the checkstyle warnings and the unit test failure.

> Create an endpoint for remote application log directory path query
> --
>
> Key: YARN-10304
> URL: https://issues.apache.org/jira/browse/YARN-10304
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Andras Gyori
>Assignee: Andras Gyori
>Priority: Minor
> Attachments: YARN-10304.001.patch, YARN-10304.002.patch, 
> YARN-10304.003.patch
>
>
> The logic of the aggregated log directory path determination (currently based 
> on configuration) is scattered around the codebase and duplicated multiple 
> times. Providing a separate class that creates the path for a specific 
> user allows for an abstraction over this logic. It could be used in place 
> of the previously duplicated logic; moreover, we could provide an endpoint 
> to query this path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9460) QueueACLsManager and ReservationsACLManager should not use instanceof checks

2020-06-10 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130447#comment-17130447
 ] 

Peter Bacsko commented on YARN-9460:


Another comment:

{noformat}
  if (scheduler instanceof CapacityScheduler) {
    reservationsACLsManager = new CapacityReservationsACLsManager(scheduler,
        conf);
  } else if (scheduler instanceof FairScheduler) {
    reservationsACLsManager = new FairReservationsACLsManager(scheduler,
        conf);
  }
{noformat}

This is very similar to what I mentioned above. If neither CS nor FS is set, 
then this can cause an NPE. We need a ReservationsACLsManager which does 
nothing in a final else branch, e.g. a {{PassThroughReservationsACLsManager}} 
which always returns true for {{checkAccess()}}.
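
A sketch of the suggested no-op manager; the real ReservationsACLsManager method 
signature may differ.

{code:java}
class PassThroughReservationsACLsManagerSketch {
  // Always allows access, giving schedulers without reservation ACLs a safe default.
  boolean checkAccess(String user, String aclType, String queueName) {
    return true;
  }
}
{code}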

> QueueACLsManager and ReservationsACLManager should not use instanceof checks
> 
>
> Key: YARN-9460
> URL: https://issues.apache.org/jira/browse/YARN-9460
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9460.001.patch, YARN-9460.002.patch, 
> YARN-9460.003.patch, YARN-9460.004.patch
>
>
> QueueACLsManager and ReservationsACLManager should not use instanceof checks 
> for the scheduler type.
> Rather, we should abstract this into two classes: Capacity and Fair variants 
> of these ACL classes.
> QueueACLsManager and ReservationsACLManager could be abstract classes, but 
> the implementation is the decision of one who will work on this jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10279) Avoid unnecessary QueueMappingEntity creations

2020-06-10 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hudáky Márton Gyula updated YARN-10279:
---
Attachment: (was: YARN-10279.001.patch)

> Avoid unnecessary QueueMappingEntity creations
> --
>
> Key: YARN-10279
> URL: https://issues.apache.org/jira/browse/YARN-10279
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Hudáky Márton Gyula
>Priority: Minor
> Attachments: YARN-10279.001.patch
>
>
> In the CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes 
> we create new instances of the QueueMappingEntity class. In some cases we simply 
> copy an instance we already received, so we just duplicate it, which is 
> unnecessary since the class is immutable.
> This is just a minor improvement and probably doesn't have much impact, but it 
> still puts some unnecessary load on the GC.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9460) QueueACLsManager and ReservationsACLManager should not use instanceof checks

2020-06-10 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130456#comment-17130456
 ] 

Bilwa S T commented on YARN-9460:
-

Hi [~pbacsko]
 # I will add {{@SuppressWarnings("checkstyle:visibilitymodifier")}} to the 
class.
 # It will not cause an NPE, as there is a null check in 
ClientRMService#checkReservationACLs: if the ReservationsACLsManager is null, 
it returns callerUGI.getShortUserName().

> QueueACLsManager and ReservationsACLManager should not use instanceof checks
> 
>
> Key: YARN-9460
> URL: https://issues.apache.org/jira/browse/YARN-9460
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9460.001.patch, YARN-9460.002.patch, 
> YARN-9460.003.patch, YARN-9460.004.patch
>
>
> QueueACLsManager and ReservationsACLManager should not use instanceof checks 
> for the scheduler type.
> Rather, we should abstract this into two classes: Capacity and Fair variants 
> of these ACL classes.
> QueueACLsManager and ReservationsACLManager could be abstract classes, but 
> the implementation is the decision of one who will work on this jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10304) Create an endpoint for remote application log directory path query

2020-06-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130465#comment-17130465
 ] 

Hadoop QA commented on YARN-10304:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
22m 17s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
5s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m  
9s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 20m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 37s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
28s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
1s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
11s{color} | {color:green} hadoop-mapreduce-client-hs in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
52s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}146m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26143/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10304 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13005322/YARN-10304.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit 

[jira] [Updated] (YARN-10308) Update javadoc and variable names for keytab in yarn services as it supports filesystems other than hdfs and local file system

2020-06-10 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10308:
-
Attachment: YARN-10308.002.patch

> Update javadoc and variable names for keytab in yarn services as it supports 
> filesystems other than hdfs and local file system
> --
>
> Key: YARN-10308
> URL: https://issues.apache.org/jira/browse/YARN-10308
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-10308.001.patch, YARN-10308.002.patch
>
>
> 1.  Below description should be updated
> {code:java}
> @ApiModelProperty(value = "The URI of the kerberos keytab. It supports two " +
>   "schemes \"hdfs\" and \"file\". If the URI starts with \"hdfs://\" " +
>   "scheme, it indicates the path on hdfs where the keytab is stored. The 
> " +
>   "keytab will be localized by YARN and made available to AM in its 
> local" +
>   " directory. If the URI starts with \"file://\" scheme, it indicates a 
> " +
>   "path on the local host where the keytab is presumbaly installed by " +
>   "admins upfront. ")
>   public String getKeytab() {
> return keytab;
>   }
> {code}
> 2. Variables below are still named on hdfs which is confusing
> {code:java}
> if ("file".equals(keytabURI.getScheme())) {
>   LOG.info("Using a keytab from localhost: " + keytabURI);
> } else {
>   Path keytabOnhdfs = new Path(keytabURI);
>   if (!fileSystem.getFileSystem().exists(keytabOnhdfs)) {
> LOG.warn(service.getName() + "'s keytab (principalName = "
> + principalName + ") doesn't exist at: " + keytabOnhdfs);
> return;
>   }
>   LocalResource keytabRes = fileSystem.createAmResource(keytabOnhdfs,
>   LocalResourceType.FILE);
>   localResource.put(String.format(YarnServiceConstants.KEYTAB_LOCATION,
>   service.getName()), keytabRes);
>   LOG.info("Adding " + service.getName() + "'s keytab for "
>   + "localization, uri = " + keytabOnhdfs);
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10308) Update javadoc and variable names for keytab in yarn services as it supports filesystems other than hdfs and local file system

2020-06-10 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130484#comment-17130484
 ] 

Bilwa S T commented on YARN-10308:
--

Hi [~eyang], I have tested the YARN-9002 patch on the viewfs file system. It works 
fine. I have updated the patch. Please review.

> Update javadoc and variable names for keytab in yarn services as it supports 
> filesystems other than hdfs and local file system
> --
>
> Key: YARN-10308
> URL: https://issues.apache.org/jira/browse/YARN-10308
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-10308.001.patch, YARN-10308.002.patch
>
>
> 1.  Below description should be updated
> {code:java}
> @ApiModelProperty(value = "The URI of the kerberos keytab. It supports two " +
>   "schemes \"hdfs\" and \"file\". If the URI starts with \"hdfs://\" " +
>   "scheme, it indicates the path on hdfs where the keytab is stored. The 
> " +
>   "keytab will be localized by YARN and made available to AM in its 
> local" +
>   " directory. If the URI starts with \"file://\" scheme, it indicates a 
> " +
>   "path on the local host where the keytab is presumbaly installed by " +
>   "admins upfront. ")
>   public String getKeytab() {
> return keytab;
>   }
> {code}
> 2. Variables below are still named on hdfs which is confusing
> {code:java}
> if ("file".equals(keytabURI.getScheme())) {
>   LOG.info("Using a keytab from localhost: " + keytabURI);
> } else {
>   Path keytabOnhdfs = new Path(keytabURI);
>   if (!fileSystem.getFileSystem().exists(keytabOnhdfs)) {
> LOG.warn(service.getName() + "'s keytab (principalName = "
> + principalName + ") doesn't exist at: " + keytabOnhdfs);
> return;
>   }
>   LocalResource keytabRes = fileSystem.createAmResource(keytabOnhdfs,
>   LocalResourceType.FILE);
>   localResource.put(String.format(YarnServiceConstants.KEYTAB_LOCATION,
>   service.getName()), keytabRes);
>   LOG.info("Adding " + service.getName() + "'s keytab for "
>   + "localization, uri = " + keytabOnhdfs);
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10311) Yarn Service should support obtaining tokens from multiple name services

2020-06-10 Thread Bilwa S T (Jira)
Bilwa S T created YARN-10311:


 Summary: Yarn Service should support obtaining tokens from 
multiple name services
 Key: YARN-10311
 URL: https://issues.apache.org/jira/browse/YARN-10311
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T


Currently YARN services support tokens for a single name service only. We can add a 
new configuration property called

"yarn.service.hdfs-servers" to support this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10310) YARN Service - User is able to launch a service with same name

2020-06-10 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130356#comment-17130356
 ] 

Eric Yang commented on YARN-10310:
--

It sounds like you need to check the auth_to_local rules to make sure that the 
username is mapped correctly.

> YARN Service - User is able to launch a service with same name
> --
>
> Key: YARN-10310
> URL: https://issues.apache.org/jira/browse/YARN-10310
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10310.001.patch
>
>
> ServiceClient uses UserGroupInformation.getCurrentUser().getUserName() to 
> get the user, whereas ClientRMService#submitApplication uses 
> UserGroupInformation.getCurrentUser().getShortUserName() to set the application 
> username.
> For a user named hdfs/had...@hadoop.com, the condition below in 
> ClientRMService#getApplications() therefore fails:
> {code:java}
> if (users != null && !users.isEmpty() &&
>     !users.contains(application.getUser())) {
>   continue;
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10106) Yarn logs CLI filtering by application attempt

2020-06-10 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hudáky Márton Gyula reassigned YARN-10106:
--

Assignee: Hudáky Márton Gyula  (was: Adam Antal)

> Yarn logs CLI filtering by application attempt
> --
>
> Key: YARN-10106
> URL: https://issues.apache.org/jira/browse/YARN-10106
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Hudáky Márton Gyula
>Priority: Trivial
>
> {{ContainerLogsRequest}} got a new parameter in YARN-10101, which is the 
> {{applicationAttempt}} - we can use this new parameter in Yarn logs CLI as 
> well to filter by application attempt.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9930) Support max running app logic for CapacityScheduler

2020-06-10 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9930:
---
Attachment: YARN-9930-004.patch

> Support max running app logic for CapacityScheduler
> ---
>
> Key: YARN-9930
> URL: https://issues.apache.org/jira/browse/YARN-9930
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhoukang
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9930-001.patch, YARN-9930-002.patch, 
> YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-POC01.patch, 
> YARN-9930-POC02.patch, YARN-9930-POC03.patch, YARN-9930-POC04.patch, 
> YARN-9930-POC05.patch, screenshot-1.png
>
>
> In FairScheduler there is a limit on the maximum number of running applications,
> which keeps excess applications pending.
> But CapacityScheduler has no such max-running-app feature; it only has a max-app
> limit, and excess jobs are rejected directly on the client.
> In this jira I want to implement the same semantic for CapacityScheduler.
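
As an illustration of the intended semantic only (purely hypothetical, not the attached patch): applications beyond the running limit are parked as pending instead of being rejected at submission, roughly like this:

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical gate: excess apps wait in a pending queue rather than being rejected.
public class MaxRunningAppsGate {
  private final int maxRunningApps;
  private int runningApps;
  private final Queue<String> pendingApps = new ArrayDeque<>();

  public MaxRunningAppsGate(int maxRunningApps) {
    this.maxRunningApps = maxRunningApps;
  }

  /** Either starts the app or keeps it pending; it is never rejected. */
  public synchronized void submit(String appId) {
    if (runningApps < maxRunningApps) {
      runningApps++;            // app becomes runnable immediately
    } else {
      pendingApps.add(appId);   // app waits until a running app finishes
    }
  }

  /** Called when a running app finishes: promote one pending app, if any. */
  public synchronized void appFinished() {
    runningApps--;
    if (!pendingApps.isEmpty()) {
      pendingApps.poll();
      runningApps++;
    }
  }
}
{code}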



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10295) CapacityScheduler NPE can cause apps to get stuck without resources

2020-06-10 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130896#comment-17130896
 ] 

Szilard Nemeth commented on YARN-10295:
---

Hi [~bteke],
Latest patches LGTM, committed them to their respective branches.
Resolving jira.

> CapacityScheduler NPE can cause apps to get stuck without resources
> ---
>
> Key: YARN-10295
> URL: https://issues.apache.org/jira/browse/YARN-10295
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 3.2.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.2.2, 3.1.5
>
> Attachments: YARN-10295.001.branch-3.1.patch, 
> YARN-10295.001.branch-3.2.patch, YARN-10295.002.branch-3.1.patch, 
> YARN-10295.002.branch-3.2.patch
>
>
> When the CapacityScheduler Asynchronous scheduling is enabled and log level 
> is set to DEBUG there is an edge-case where a NullPointerException can cause 
> the scheduler thread to exit and the apps to get stuck without allocated 
> resources. Consider the following log:
> {code:java}
> 2020-05-27 10:13:49,106 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:apply(681)) - Reserved 
> container=container_e10_1590502305306_0660_01_000115, on node=host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used= with 
> resource=
> 2020-05-27 10:13:49,134 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:internalUnreserve(743)) - Application 
> application_1590502305306_0660 unreserved  on node host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used=, currently 
> has 0 at priority 11; currentReservation  on node-label=
> 2020-05-27 10:13:49,134 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(3042)) - Allocation proposal accepted
> 2020-05-27 10:13:49,163 ERROR yarn.YarnUncaughtExceptionHandler 
> (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread 
> Thread[Thread-4953,5,main] threw an Exception.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1580)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1767)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1505)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:546)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:593)
> {code}
> A container gets allocated on a host, but the host doesn't have enough 
> memory, so after a short while it gets unreserved. However because the 
> scheduler thread is running asynchronously it might have entered into the 
> following if block located in 
> [CapacityScheduler.java#L1602|https://github.com/apache/hadoop/blob/7136ebbb7aa197717619c23a841d28f1c46ad40b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L1602],
>  because at the time _node.getReservedContainer()_ wasn't null. Calling it a 
> second time for getting the ApplicationAttemptId would be an NPE, as the 
> container got unreserved in the meantime.
> {code:java}
> // Do not schedule if there are any reservations to fulfill on the node
> if (node.getReservedContainer() != null) {
> if (LOG.isDebugEnabled()) {
> LOG.debug("Skipping scheduling since node " + node.getNodeID()
> + " is reserved by application " + node.getReservedContainer()
> .getContainerId().getApplicationAttemptId());
>  }
>  return null;
> }
> {code}
> A fix would be to store the container object before the if block. 
> Only branch-3.1/3.2 is affected, because the newer branches have YARN-9664 
> which indirectly fixed this.
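
A minimal sketch of the fix described above (mirroring the quoted block, not the committed patch): read the reserved container into a local variable once, so a concurrent unreservation cannot turn the second lookup into null:

{code:java}
// Do not schedule if there are any reservations to fulfill on the node.
// Reading the reserved container once avoids the race with unreservation.
RMContainer reservedContainer = node.getReservedContainer();
if (reservedContainer != null) {
  if (LOG.isDebugEnabled()) {
    LOG.debug("Skipping scheduling since node " + node.getNodeID()
        + " is reserved by application " + reservedContainer
            .getContainerId().getApplicationAttemptId());
  }
  return null;
}
{code}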



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10295) CapacityScheduler NPE can cause apps to get stuck without resources

2020-06-10 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10295:
--
Fix Version/s: 3.2.2

> CapacityScheduler NPE can cause apps to get stuck without resources
> ---
>
> Key: YARN-10295
> URL: https://issues.apache.org/jira/browse/YARN-10295
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 3.2.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.2.2, 3.1.5
>
> Attachments: YARN-10295.001.branch-3.1.patch, 
> YARN-10295.001.branch-3.2.patch, YARN-10295.002.branch-3.1.patch, 
> YARN-10295.002.branch-3.2.patch
>
>
> When the CapacityScheduler Asynchronous scheduling is enabled and log level 
> is set to DEBUG there is an edge-case where a NullPointerException can cause 
> the scheduler thread to exit and the apps to get stuck without allocated 
> resources. Consider the following log:
> {code:java}
> 2020-05-27 10:13:49,106 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:apply(681)) - Reserved 
> container=container_e10_1590502305306_0660_01_000115, on node=host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used= with 
> resource=
> 2020-05-27 10:13:49,134 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:internalUnreserve(743)) - Application 
> application_1590502305306_0660 unreserved  on node host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used=, currently 
> has 0 at priority 11; currentReservation  on node-label=
> 2020-05-27 10:13:49,134 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(3042)) - Allocation proposal accepted
> 2020-05-27 10:13:49,163 ERROR yarn.YarnUncaughtExceptionHandler 
> (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread 
> Thread[Thread-4953,5,main] threw an Exception.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1580)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1767)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1505)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:546)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:593)
> {code}
> A container gets allocated on a host, but the host doesn't have enough 
> memory, so after a short while it gets unreserved. However because the 
> scheduler thread is running asynchronously it might have entered into the 
> following if block located in 
> [CapacityScheduler.java#L1602|https://github.com/apache/hadoop/blob/7136ebbb7aa197717619c23a841d28f1c46ad40b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L1602],
>  because at the time _node.getReservedContainer()_ wasn't null. Calling it a 
> second time for getting the ApplicationAttemptId would be an NPE, as the 
> container got unreserved in the meantime.
> {code:java}
> // Do not schedule if there are any reservations to fulfill on the node
> if (node.getReservedContainer() != null) {
> if (LOG.isDebugEnabled()) {
> LOG.debug("Skipping scheduling since node " + node.getNodeID()
> + " is reserved by application " + node.getReservedContainer()
> .getContainerId().getApplicationAttemptId());
>  }
>  return null;
> }
> {code}
> A fix would be to store the container object before the if block. 
> Only branch-3.1/3.2 is affected, because the newer branches have YARN-9664 
> which indirectly fixed this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10251) Show extended resources on legacy RM UI.

2020-06-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132683#comment-17132683
 ] 

Hadoop QA commented on YARN-10251:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
51s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.10 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
18s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
45s{color} | {color:green} branch-2.10 passed with JDK Oracle 
Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
19s{color} | {color:green} branch-2.10 passed with JDK Private 
Build-1.8.0_252-8u252-b09-1~16.04-b09 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
24s{color} | {color:green} branch-2.10 passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-yarn-server-common in branch-2.10 failed with 
JDK Oracle Corporation-1.7.0_95-b00. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-resourcemanager in branch-2.10 
failed with JDK Oracle Corporation-1.7.0_95-b00. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} branch-2.10 passed with JDK Private 
Build-1.8.0_252-8u252-b09-1~16.04-b09 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
27s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
29s{color} | {color:green} branch-2.10 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed with JDK Oracle 
Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed with JDK Private 
Build-1.8.0_252-8u252-b09-1~16.04-b09 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 38s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 4 new + 
53 unchanged - 2 fixed = 57 total (was 55) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
31s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkOracleCorporation-1.7.0_95-b00
 with JDK Oracle Corporation-1.7.0_95-b00 generated 4 new + 0 unchanged - 0 
fixed = 4 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-server-common in the patch passed with 
JDK Private Build-1.8.0_252-8u252-b09-1~16.04-b09. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 

[jira] [Updated] (YARN-10296) Make ContainerPBImpl#getId/setId synchronized

2020-06-10 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10296:
--
Fix Version/s: 3.2.2

> Make ContainerPBImpl#getId/setId synchronized
> -
>
> Key: YARN-10296
> URL: https://issues.apache.org/jira/browse/YARN-10296
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Minor
> Fix For: 3.2.2, 3.4.0
>
> Attachments: YARN-10296.001.patch, YARN-10296.002.branch-3.1.patch, 
> YARN-10296.002.branch-3.2.patch, YARN-10296.002.branch-3.3.patch, 
> YARN-10296.002.patch
>
>
> ContainerPBImpl getId and setId methods can be accessed from multiple 
> threads. In order to avoid any simultaneous accesses and race conditions 
> these methods should be synchronized.
> The idea came from the issue described in YARN-10295, however that patch is 
> only applicable to branch-3.2 and 3.1.
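
For illustration, a hedged stand-in showing only the synchronization idea; the real ContainerPBImpl also has to keep the protobuf builder in sync, which is omitted here:

{code:java}
import org.apache.hadoop.yarn.api.records.ContainerId;

// Simplified stand-in for ContainerPBImpl: both accessors synchronize on the
// instance, so concurrent readers and writers see a consistent containerId.
public class SynchronizedContainerIdHolder {
  private ContainerId containerId;

  public synchronized ContainerId getId() {
    return containerId;
  }

  public synchronized void setId(ContainerId id) {
    this.containerId = id;
  }
}
{code}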



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10296) Make ContainerPBImpl#getId/setId synchronized

2020-06-10 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10296:
--
Fix Version/s: 3.3.1

> Make ContainerPBImpl#getId/setId synchronized
> -
>
> Key: YARN-10296
> URL: https://issues.apache.org/jira/browse/YARN-10296
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Minor
> Fix For: 3.2.2, 3.4.0, 3.3.1
>
> Attachments: YARN-10296.001.patch, YARN-10296.002.branch-3.1.patch, 
> YARN-10296.002.branch-3.2.patch, YARN-10296.002.branch-3.3.patch, 
> YARN-10296.002.patch
>
>
> ContainerPBImpl getId and setId methods can be accessed from multiple 
> threads. In order to avoid any simultaneous accesses and race conditions 
> these methods should be synchronized.
> The idea came from the issue described in YARN-10295, however that patch is 
> only applicable to branch-3.2 and 3.1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10296) Make ContainerPBImpl#getId/setId synchronized

2020-06-10 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10296:
--
Fix Version/s: 3.1.5

> Make ContainerPBImpl#getId/setId synchronized
> -
>
> Key: YARN-10296
> URL: https://issues.apache.org/jira/browse/YARN-10296
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Minor
> Fix For: 3.2.2, 3.4.0, 3.3.1, 3.1.5
>
> Attachments: YARN-10296.001.patch, YARN-10296.002.branch-3.1.patch, 
> YARN-10296.002.branch-3.2.patch, YARN-10296.002.branch-3.3.patch, 
> YARN-10296.002.patch
>
>
> ContainerPBImpl getId and setId methods can be accessed from multiple 
> threads. In order to avoid any simultaneous accesses and race conditions 
> these methods should be synchronized.
> The idea came from the issue described in YARN-10295, however that patch is 
> only applicable to branch-3.2 and 3.1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10296) Make ContainerPBImpl#getId/setId synchronized

2020-06-10 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130848#comment-17130848
 ] 

Szilard Nemeth commented on YARN-10296:
---

Thanks [~bteke],
Committed all the other patches to their respective branches.
Closing Jira.

> Make ContainerPBImpl#getId/setId synchronized
> -
>
> Key: YARN-10296
> URL: https://issues.apache.org/jira/browse/YARN-10296
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Minor
> Fix For: 3.2.2, 3.4.0, 3.3.1, 3.1.5
>
> Attachments: YARN-10296.001.patch, YARN-10296.002.branch-3.1.patch, 
> YARN-10296.002.branch-3.2.patch, YARN-10296.002.branch-3.3.patch, 
> YARN-10296.002.patch
>
>
> ContainerPBImpl getId and setId methods can be accessed from multiple 
> threads. In order to avoid any simultaneous accesses and race conditions 
> these methods should be synchronized.
> The idea came from the issue described in YARN-10295, however that patch is 
> only applicable to branch-3.2 and 3.1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10295) CapacityScheduler NPE can cause apps to get stuck without resources

2020-06-10 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10295:
--
Fix Version/s: 3.1.5

> CapacityScheduler NPE can cause apps to get stuck without resources
> ---
>
> Key: YARN-10295
> URL: https://issues.apache.org/jira/browse/YARN-10295
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.1.0, 3.2.0
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.1.5
>
> Attachments: YARN-10295.001.branch-3.1.patch, 
> YARN-10295.001.branch-3.2.patch, YARN-10295.002.branch-3.1.patch, 
> YARN-10295.002.branch-3.2.patch
>
>
> When the CapacityScheduler Asynchronous scheduling is enabled and log level 
> is set to DEBUG there is an edge-case where a NullPointerException can cause 
> the scheduler thread to exit and the apps to get stuck without allocated 
> resources. Consider the following log:
> {code:java}
> 2020-05-27 10:13:49,106 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:apply(681)) - Reserved 
> container=container_e10_1590502305306_0660_01_000115, on node=host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used= with 
> resource=
> 2020-05-27 10:13:49,134 INFO  fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:internalUnreserve(743)) - Application 
> application_1590502305306_0660 unreserved  on node host: 
> ctr-e148-1588963324989-31443-01-02.hwx.site:25454 #containers=14 
> available= used=, currently 
> has 0 at priority 11; currentReservation  on node-label=
> 2020-05-27 10:13:49,134 INFO  capacity.CapacityScheduler 
> (CapacityScheduler.java:tryCommit(3042)) - Allocation proposal accepted
> 2020-05-27 10:13:49,163 ERROR yarn.YarnUncaughtExceptionHandler 
> (YarnUncaughtExceptionHandler.java:uncaughtException(68)) - Thread 
> Thread[Thread-4953,5,main] threw an Exception.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1580)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1767)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1505)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:546)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:593)
> {code}
> A container gets allocated on a host, but the host doesn't have enough 
> memory, so after a short while it gets unreserved. However because the 
> scheduler thread is running asynchronously it might have entered into the 
> following if block located in 
> [CapacityScheduler.java#L1602|https://github.com/apache/hadoop/blob/7136ebbb7aa197717619c23a841d28f1c46ad40b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L1602],
>  because at the time _node.getReservedContainer()_ wasn't null. Calling it a 
> second time for getting the ApplicationAttemptId would be an NPE, as the 
> container got unreserved in the meantime.
> {code:java}
> // Do not schedule if there are any reservations to fulfill on the node
> if (node.getReservedContainer() != null) {
> if (LOG.isDebugEnabled()) {
> LOG.debug("Skipping scheduling since node " + node.getNodeID()
> + " is reserved by application " + node.getReservedContainer()
> .getContainerId().getApplicationAttemptId());
>  }
>  return null;
> }
> {code}
> A fix would be to store the container object before the if block. 
> Only branch-3.1/3.2 is affected, because the newer branches have YARN-9664 
> which indirectly fixed this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10293) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259)

2020-06-10 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132565#comment-17132565
 ] 

Prabhu Joseph commented on YARN-10293:
--

Thanks [~Tao Yang] for the review.

[~wangda] Let me know if you have any comments on the latest patch. 

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement (YARN-10259)
> 
>
> Key: YARN-10293
> URL: https://issues.apache.org/jira/browse/YARN-10293
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-10293-001.patch, YARN-10293-002.patch, 
> YARN-10293-003-WIP.patch, YARN-10293-004.patch, YARN-10293-005.patch
>
>
> Reserved containers are not allocated from the available space of other nodes in
> the CandidateNodeSet in MultiNodePlacement. YARN-10259 fixed two issues
> related to this:
> https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987
> I have found one more bug in the CapacityScheduler.java code which causes the
> same issue, with a slight difference in the repro.
> *Repro:*
> *Nodes : Available : Used*
> Node1 - 8GB, 8vcores - 8GB, 8vcores
> Node2 - 8GB, 8vcores - 8GB, 8vcores
> Node3 - 8GB, 8vcores - 8GB, 8vcores
> Queues -> A and B, both 50% capacity, 100% max capacity
> MultiNode enabled + Preemption enabled
> 1. JobA is submitted to queue A and uses the full cluster: 24GB and 24 vcores
> 2. JobB is submitted to queue B with an AM size of 1GB
> {code}
> 2020-05-21 12:12:27,313 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=systest  
> IP=172.27.160.139   OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1590046667304_0005  
>   CALLERCONTEXT=CLI   QUEUENAME=dummy
> {code}
> 3. Preemption happens and the used capacity becomes less than 1.0f
> {code}
> 2020-05-21 12:12:48,222 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics:
>  Non-AM container preempted, current 
> appAttemptId=appattempt_1590046667304_0004_01, 
> containerId=container_e09_1590046667304_0004_01_24, 
> resource=
> {code}
> 4. JobB gets a Reserved Container as part of 
> CapacityScheduler#allocateOrReserveNewContainer
> {code}
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e09_1590046667304_0005_01_01 Container Transitioned from NEW to 
> RESERVED
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> {code}
> *Why did RegularContainerAllocator reserve the container when the used capacity 
> is <= 1.0f?*
> {code}
> The reason is that even though the container is preempted, the NodeManager still 
> has to stop the container, heartbeat, and update the available and unallocated 
> resources to the ResourceManager.
> {code}
> 5. Now no new allocation happens and the reserved container stays reserved.
> After the reservation the used capacity becomes 1.0f, the code below loops, and 
> no new allocate or reserve happens. The reserved container cannot be 
> allocated because the reserved node does not have space. Node2 has space for 
> 1GB, 1 vcore, but CapacityScheduler#allocateOrReserveNewContainers is not 
> getting called, causing the hang.
> *[INFINITE LOOP] CapacityScheduler#allocateContainersOnMultiNodes -> 
> CapacityScheduler#allocateFromReservedContainer -> Re-reserve the container 
> on node*
> {code}
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Trying to fulfill reservation for application application_1590046667304_0005 
> on node: tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignContainers: partition= #applications=1
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> 2020-05-21 12:13:33,243 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Allocation proposal accepted
> {code}
> 

[jira] [Commented] (YARN-10251) Show extended resources on legacy RM UI.

2020-06-10 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132583#comment-17132583
 ] 

Eric Payne commented on YARN-10251:
---

I don't think the unit test failures from the branch-3.2 pre-commit build are 
related to this patch.

The following from the list above are succeeding for me locally with or without 
the patch:
{panel}
TestApplicationACLs
TestNodeBlacklistingOnAMFailures
TestWorkPreservingUnmanagedAM
TestPlacementManager
{panel}

The following from the list above are failing for me locally with or without 
the patch:
{panel}
TestFSSchedulerConfigurationStore
TestSystemMetricsPublisherForV2
TestRMEmbeddedElector
{panel}
TestFairScheduler fails intermittently both with and without the patch.

> Show extended resources on legacy RM UI.
> 
>
> Key: YARN-10251
> URL: https://issues.apache.org/jira/browse/YARN-10251
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Eric Payne
>Assignee: Eric Payne
>Priority: Major
> Attachments: Legacy RM UI With Not All Resources Shown.png, Updated 
> NodesPage UI With GPU columns.png, Updated RM UI With All Resources 
> Shown.png.png, YARN-10251.003.patch, YARN-10251.004.patch, 
> YARN-10251.005.patch, YARN-10251.branch-2.10.001.patch, 
> YARN-10251.branch-2.10.002.patch, YARN-10251.branch-2.10.003.patch, 
> YARN-10251.branch-3.2.004.patch, YARN-10251.branch-3.2.005.patch
>
>
> It would be great to update the legacy RM UI to include GPU resources in the 
> overview and in the per-app sections.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10251) Show extended resources on legacy RM UI.

2020-06-10 Thread Eric Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-10251:
--
Attachment: YARN-10251.branch-2.10.005.patch

> Show extended resources on legacy RM UI.
> 
>
> Key: YARN-10251
> URL: https://issues.apache.org/jira/browse/YARN-10251
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Eric Payne
>Assignee: Eric Payne
>Priority: Major
> Attachments: Legacy RM UI With Not All Resources Shown.png, Updated 
> NodesPage UI With GPU columns.png, Updated RM UI With All Resources 
> Shown.png.png, YARN-10251.003.patch, YARN-10251.004.patch, 
> YARN-10251.005.patch, YARN-10251.branch-2.10.001.patch, 
> YARN-10251.branch-2.10.002.patch, YARN-10251.branch-2.10.003.patch, 
> YARN-10251.branch-2.10.005.patch, YARN-10251.branch-3.2.004.patch, 
> YARN-10251.branch-3.2.005.patch
>
>
> It would be great to update the legacy RM UI to include GPU resources in the 
> overview and in the per-app sections.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10292) FS-CS converter: add an option to enable asynchronous scheduling in CapacityScheduler

2020-06-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132595#comment-17132595
 ] 

Hadoop QA commented on YARN-10292:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 23m  
8s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.3 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
32s{color} | {color:green} branch-3.3 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} branch-3.3 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} branch-3.3 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} branch-3.3 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} branch-3.3 passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
44s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} branch-3.3 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 23 new + 1 unchanged - 0 fixed = 24 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 25s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}182m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26146/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13005423/YARN-10292.003.branch-3.3.patch
 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux e18d6ce29373 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | branch-3.3 / 043628d |
| Default Java | Private 

[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler

2020-06-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132611#comment-17132611
 ] 

Hadoop QA commented on YARN-9930:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
47s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 41s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 749 unchanged - 0 fixed = 750 total (was 749) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m  2s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}160m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26147/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-9930 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13005425/YARN-9930-004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 1cd1ecc9cfe7 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / b735a777178 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| checkstyle | 

[jira] [Commented] (YARN-9930) Support max running app logic for CapacityScheduler

2020-06-10 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132598#comment-17132598
 ] 

Wangda Tan commented on YARN-9930:
--

Thanks [~pbacsko], it would make sense to create a one-pager doc and talk 
about what the behavior looks like.

For example, how this feature relates to maximum-am-limit, and how refreshQueue 
works with this feature (increasing #running-app-limit seems fine, but what about 
shrinking #running-app-limit?).

> Support max running app logic for CapacityScheduler
> ---
>
> Key: YARN-9930
> URL: https://issues.apache.org/jira/browse/YARN-9930
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhoukang
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9930-001.patch, YARN-9930-002.patch, 
> YARN-9930-003.patch, YARN-9930-004.patch, YARN-9930-POC01.patch, 
> YARN-9930-POC02.patch, YARN-9930-POC03.patch, YARN-9930-POC04.patch, 
> YARN-9930-POC05.patch, screenshot-1.png
>
>
> In FairScheduler there is a limit on the maximum number of running applications,
> which keeps excess applications pending.
> But CapacityScheduler has no such max-running-app feature; it only has a max-app
> limit, and excess jobs are rejected directly on the client.
> In this jira I want to implement the same semantic for CapacityScheduler.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org