[jira] [Updated] (YARN-9195) RM Queue's pending container number might get decreased unexpectedly or even become negative once RM failover

2019-01-24 Thread Shengyang Sha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengyang Sha updated YARN-9195:

Attachment: YARN-9195.001.patch

> RM Queue's pending container number might get decreased unexpectedly or even 
> become negative once RM failover
> -
>
> Key: YARN-9195
> URL: https://issues.apache.org/jira/browse/YARN-9195
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.1.0
>Reporter: Shengyang Sha
>Assignee: Shengyang Sha
>Priority: Critical
> Attachments: YARN-9195.001.patch, 
> cases_to_recreate_negative_pending_requests_scenario.diff, 
> patch.YARN-9195.diff
>
>
> Hi all,
> We recently encountered a serious problem in the ResourceManager: the pending 
> container number of one RM queue became negative after the RM failed over. 
> Since RM queues are managed in a hierarchical structure, the root queue's 
> pending containers eventually became negative as well, which affected the 
> scheduling process of the whole cluster.
> Both our RM server and the AMRM client in our application are based on YARN 
> 3.1, and our application uses the AMRMClientAsync#addSchedulingRequests() 
> method to request resources from the RM.
> After investigation, we found that the direct cause was that numAllocations 
> of some AMs' requests became negative after the RM failed over. There are at 
> least three necessary conditions:
> (1) The application uses schedulingRequests in the AMRM client and sets the 
> numAllocations of a schedulingRequest to zero. In our batch job scenario, 
> the numAllocations of a schedulingRequest can legitimately drop to zero 
> because, in theory, a full batch job can run with only one container.
> (2) The RM fails over.
> (3) Before the AM re-registers itself after the RM restarts, the RM has 
> already recovered some of the containers previously assigned to the 
> application.
> Here are some more details about the implementation:
> (1) After the RM recovers, it sends all alive containers to the AM once the 
> AM re-registers itself, through 
> RegisterApplicationMasterResponse#getContainersFromPreviousAttempts.
> (2) During registerApplicationMaster, AMRMClientImpl calls 
> removeFromOutstandingSchedulingRequests as soon as the AM gets 
> containersFromPreviousAttempts, without checking whether those containers 
> were already assigned before. As a consequence, its outstanding requests may 
> be decreased unexpectedly even when they do not become negative.
> (3) There is no sanity check in the RM to validate requests from AMs.
> To better illustrate this case, I've written test cases against the latest 
> hadoop trunk, posted in the attachment. You can try 
> testAMRMClientWithNegativePendingRequestsOnRMRestart and 
> testAMRMClientOnUnexpectedlyDecreasedPendingRequestsOnRMRestart.
> To solve this issue, I propose filtering allocated containers before 
> removeFromOutstandingSchedulingRequests in AMRMClientImpl during 
> registerApplicationMaster; some sanity checks are also needed to prevent 
> things from getting worse.
> More comments and suggestions are welcome.
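To make the proposed client-side filtering concrete, here is a minimal sketch under stated assumptions: the knownContainers set and the wrapper class are hypothetical, not the actual YARN-9195 patch.

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerId;

class PreviousAttemptFilterSketch {
  // Hypothetical: IDs of containers this client has already been told about.
  private final Set<ContainerId> knownContainers = new HashSet<>();

  // Keep only containers that are genuinely new to this client, so outstanding
  // scheduling requests are decremented at most once per container.
  List<Container> filterAlreadyKnown(List<Container> fromPreviousAttempts) {
    List<Container> trulyNew = new ArrayList<>(fromPreviousAttempts.size());
    for (Container c : fromPreviousAttempts) {
      if (knownContainers.add(c.getId())) { // add() is false if already known
        trulyNew.add(c);
      }
    }
    return trulyNew;
  }
}
{code}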






[jira] [Commented] (YARN-9195) RM Queue's pending container number might get decreased unexpectedly or even become negative once RM failover

2019-01-24 Thread Shengyang Sha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752008#comment-16752008
 ] 

Shengyang Sha commented on YARN-9195:
-

[~leftnoteasy] I've been occupied by some urgent things, so sorry for the late 
reply.
A patch based on the latest trunk is attached, but there is one remaining issue 
I'd like your advice on.

When an AM calls registerApplicationMaster, 
AbstractYarnScheduler#getTransferredContainers() in the RM returns all alive 
containers (except the AM container) as containers from the previous attempt. 
Note that in the case of RM failover, the RM still returns all alive containers 
when the AM re-registers itself. For an unmanaged AM, it is necessary for the 
RM to return all alive containers, because the RM is not responsible for 
starting its AM container. But managed AMs don't need to receive containers 
they already know about after RM failover; doing so decreases outstanding 
requests in AMRMClient, which is unexpected.

My question is whether we should return different results from 
AbstractYarnScheduler#getTransferredContainers() based on whether the AM is 
unmanaged.

Pros:
We can prevent this case just by updating the RM with this bugfix, without 
requiring applications to update their YARN client jars.

Cons:
It makes the semantics of 
RegisterApplicationMasterResponse#getContainersFromPreviousAttempts() ambiguous 
between unmanaged AMs and the other types.
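For illustration only, a hedged sketch of the RM-side option in the question above (the names and placement are assumptions, and it ignores the separate case of a genuinely new attempt registering after an AM failure):

{code:java}
import java.util.Collections;
import java.util.List;
import org.apache.hadoop.yarn.api.records.Container;

final class TransferredContainersSketch {
  // A managed AM that merely re-registers after RM failover already knows its
  // containers; an unmanaged AM, whose container the RM never launched, still
  // needs the full list.
  static List<Container> containersToTransfer(boolean unmanagedAM,
      List<Container> liveContainers) {
    return unmanagedAM ? liveContainers : Collections.emptyList();
  }
}
{code}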

> RM Queue's pending container number might get decreased unexpectedly or even 
> become negative once RM failover
> -
>
> Key: YARN-9195
> URL: https://issues.apache.org/jira/browse/YARN-9195
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.1.0
>Reporter: Shengyang Sha
>Assignee: Shengyang Sha
>Priority: Critical
> Attachments: 
> cases_to_recreate_negative_pending_requests_scenario.diff, 
> patch.YARN-9195.diff
>
>






[jira] [Commented] (YARN-9209) When nodePartition is not set in Placement Constraints, containers are allocated only in default partition

2019-01-24 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751998#comment-16751998
 ] 

Weiwei Yang commented on YARN-9209:
---

Hi [~tarunparimi]

Thanks for the patch, but this seems to be an interim fix. If the placement 
constraint doesn't have a node-partition specified, I think we should default 
it to ANY, because we don't want to limit the placement to the queue's default 
partition.

I am aware that we have some limitations in supporting node-partition in 
placement constraints; see my earlier comment in YARN-8015. I agree we can 
have a separate JIRA to track this.

Cc [~leftnoteasy].

Thanks

> When nodePartition is not set in Placement Constraints, containers are 
> allocated only in default partition
> --
>
> Key: YARN-9209
> URL: https://issues.apache.org/jira/browse/YARN-9209
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-9209.001.patch
>
>
> When an application sets a placement constraint without specifying a 
> nodePartition, the default partition is always chosen as the constraint when 
> allocating containers. This can be a problem when an application is 
> submitted to a queue that doesn't have enough capacity available on the 
> default partition.
> This is a common scenario when node labels are configured for a particular 
> queue. The sample sleeper service below cannot get even a single container 
> allocated when it is submitted to a "labeled_queue", even though enough 
> capacity is available on the label/partition configured for the queue. Only 
> the AM container runs. 
> {code:java}
> {
>   "name": "sleeper-service",
>   "version": "1.0.0",
>   "queue": "labeled_queue",
>   "components": [
>     {
>       "name": "sleeper",
>       "number_of_containers": 2,
>       "launch_command": "sleep 9",
>       "resource": {
>         "cpus": 1,
>         "memory": "4096"
>       },
>       "placement_policy": {
>         "constraints": [
>           {
>             "type": "ANTI_AFFINITY",
>             "scope": "NODE",
>             "target_tags": [
>               "sleeper"
>             ]
>           }
>         ]
>       }
>     }
>   ]
> }
> {code}
> It runs fine if I specify the node_partition explicitly in the constraints 
> like below. 
> {code:java}
> {
>   "name": "sleeper-service",
>   "version": "1.0.0",
>   "queue": "labeled_queue",
>   "components": [
>     {
>       "name": "sleeper",
>       "number_of_containers": 2,
>       "launch_command": "sleep 9",
>       "resource": {
>         "cpus": 1,
>         "memory": "4096"
>       },
>       "placement_policy": {
>         "constraints": [
>           {
>             "type": "ANTI_AFFINITY",
>             "scope": "NODE",
>             "target_tags": [
>               "sleeper"
>             ],
>             "node_partitions": [
>               "label"
>             ]
>           }
>         ]
>       }
>     }
>   ]
> }
> {code}
> The problem seems to be that only the default partition "" is considered 
> when the node_partition constraint is not specified, as seen in the RM log 
> below. 
> {code:java}
> 2019-01-17 16:51:59,921 INFO placement.SingleConstraintAppPlacementAllocator 
> (SingleConstraintAppPlacementAllocator.java:validateAndSetSchedulingRequest(367))
>  - Successfully added SchedulingRequest to 
> app=appattempt_1547734161165_0010_01 targetAllocationTags=[sleeper]. 
> nodePartition= 
> {code} 
> However, I think it makes more sense to consider "*", or the 
> {{default-node-label-expression}} of the queue if configured, when no 
> node_partition is specified in the placement constraint. Not specifying any 
> node_partition should ideally mean we don't restrict the placement to any 
> node_partition, yet we are currently enforcing the default partition 
> instead.
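As a hedged sketch of the defaulting suggested above (the helper and its callers are assumptions; the real change would likely live near SingleConstraintAppPlacementAllocator's request validation):

{code:java}
final class NodePartitionDefaultingSketch {
  // Fall back to the queue's default-node-label-expression if configured,
  // otherwise to ANY ("*"), instead of the empty default partition.
  static String effectiveNodePartition(String requested, String queueDefault) {
    if (requested != null && !requested.isEmpty()) {
      return requested;
    }
    return (queueDefault != null && !queueDefault.isEmpty())
        ? queueDefault : "*";
  }
}
{code}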






[jira] [Commented] (YARN-9039) App ACLs are not validated when serving logs from LogWebService

2019-01-24 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751972#comment-16751972
 ] 

Bibin A Chundatt commented on YARN-9039:


[~suma.shivaprasad]

Any update on this JIRA? IIRC, the existing log aggregation ACL check had an 
issue.

cc: [~sunil.gov...@gmail.com]

> App ACLs are not validated when serving logs from LogWebService
> ---
>
> Key: YARN-9039
> URL: https://issues.apache.org/jira/browse/YARN-9039
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Critical
> Attachments: YARN-9039.1.patch, YARN-9039.2.patch, YARN-9039.3.patch
>
>
> App ACLs are not being validated while serving logs through REST and UI2 via 
> the LogWebService.






[jira] [Reopened] (YARN-8193) YARN RM hangs abruptly (stops allocating resources) when running successive applications.

2019-01-24 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan reopened YARN-8193:
--

Reopening for backporting to branch-2.

> YARN RM hangs abruptly (stops allocating resources) when running successive 
> applications.
> -
>
> Key: YARN-8193
> URL: https://issues.apache.org/jira/browse/YARN-8193
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Zian Chen
>Assignee: Zian Chen
>Priority: Critical
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-8193-branch-2-001.patch, 
> YARN-8193-branch-2.9.0-001.patch, YARN-8193.001.patch, YARN-8193.002.patch
>
>
> When running massive queries successively, at some point the RM just hangs 
> and stops allocating resources. At the point the RM hangs, YARN throws a 
> NullPointerException at RegularContainerAllocator.getLocalityWaitFactor.
> There's sufficient space given to yarn.nodemanager.local-dirs (not a node 
> health issue; the RM didn't report any node as unhealthy). There is no fixed 
> trigger for this (query or operation).
> This problem goes away on restarting the ResourceManager. No NM restart is 
> required. 
>  
>  






[jira] [Comment Edited] (YARN-9234) NPE Exception Occurred on Resourcemanager

2019-01-24 Thread Amithsha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751266#comment-16751266
 ] 

Amithsha edited comment on YARN-9234 at 1/25/19 6:57 AM:
-

[~sunilg] Thanks for the comments.


was (Author: amithsha):
[~sunilg] Thanks for comments.

> NPE Exception Occurred on Resourcemanager
> -
>
> Key: YARN-9234
> URL: https://issues.apache.org/jira/browse/YARN-9234
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.9.0
>Reporter: Amithsha
>Priority: Major
>
> 2019-01-24 14:52:17,893 FATAL event.EventDispatcher (?:?(?)) - Error in 
> handling event type NODE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:814)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:857)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:55)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:868)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1346)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1341)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1430)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1205)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1067)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1472)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:151)
>  at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>  at java.lang.Thread.run(Thread.java:745)






[jira] [Updated] (YARN-9231) TestDistributedShell fix timeout

2019-01-24 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9231:

Description: TestDistributedShell test cases time out.  (was: 
TestDistributedShell runs all test cases in parallel, and all test cases reuse 
the miniYarnCluster and miniHdfsCluster. When one test case completes, it stops 
the miniYarnCluster in tearDown, which affects other test cases still running 
in parallel and appears to cause the timeout.)

> TestDistributedShell fix timeout
> 
>
> Key: YARN-9231
> URL: https://issues.apache.org/jira/browse/YARN-9231
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> TestDistributedShell test cases time out.






[jira] [Commented] (YARN-9227) DistributedShell RelativePath is not removed at end

2019-01-24 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751943#comment-16751943
 ] 

Prabhu Joseph commented on YARN-9227:
-

The test case timeouts are due to an existing bug, handled as part of YARN-9231.

> DistributedShell RelativePath is not removed at end
> ---
>
> Key: YARN-9227
> URL: https://issues.apache.org/jira/browse/YARN-9227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: 0001-YARN-9227.patch, 0002-YARN-9227.patch
>
>
> The DistributedShell job does not remove the relative path that contains the 
> jars and localized files.
> {code}
> [ambari-qa@ash hadoop-yarn]$ hadoop fs -ls 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017
> Found 2 items
> -rw-r--r--   3 ambari-qa hdfs  46636 2019-01-23 13:37 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017/AppMaster.jar
> -rwx--x---   3 ambari-qa hdfs  4 2019-01-23 13:37 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017/shellCommands
> {code}






[jira] [Updated] (YARN-9231) TestDistributedShell fix timeout

2019-01-24 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9231:

Issue Type: Bug  (was: Improvement)

> TestDistributedShell fix timeout
> 
>
> Key: YARN-9231
> URL: https://issues.apache.org/jira/browse/YARN-9231
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> TestDistributedShell runs all test cases in parallel, and all test cases 
> reuse the miniYarnCluster and miniHdfsCluster. When one test case completes, 
> it stops the miniYarnCluster in tearDown, which affects other test cases 
> still running in parallel and appears to cause the timeout.
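A minimal sketch of the hazard described above and one conventional remedy, assuming a JUnit 4 layout (the actual TestDistributedShell structure may differ): share the cluster at class scope so that no single test's teardown stops it while other tests still use it.

{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.MiniYARNCluster;
import org.junit.AfterClass;
import org.junit.BeforeClass;

public class SharedClusterLifecycleSketch {
  private static MiniYARNCluster yarnCluster;

  @BeforeClass
  public static void setupCluster() {
    // Start the cluster once for the whole class instead of per test method.
    yarnCluster = new MiniYARNCluster("sketch", 1, 1, 1);
    yarnCluster.init(new YarnConfiguration());
    yarnCluster.start();
  }

  @AfterClass
  public static void teardownCluster() {
    // Stop the cluster only after every test method has finished.
    if (yarnCluster != null) {
      yarnCluster.stop();
    }
  }
}
{code}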






[jira] [Updated] (YARN-9231) TestDistributedShell fix timeout

2019-01-24 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9231:

Description: TestDistributedShell runs all test cases in parallel, and all test 
cases reuse the miniYarnCluster and miniHdfsCluster. When one test case 
completes, it stops the miniYarnCluster in tearDown, which affects other test 
cases still running in parallel and appears to cause the timeout.  (was: 
TestDistributedShell takes a lot of time setting up the HDFS and YARN clusters, 
plus a Thread.sleep(2000) before each test case depending on its 
TimelineVersion annotation. Reusing the YARN and HDFS clusters (maybe specific 
to the timeline version) would save a lot of time.)

> TestDistributedShell fix timeout
> 
>
> Key: YARN-9231
> URL: https://issues.apache.org/jira/browse/YARN-9231
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: distributed-shell
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> TestDistributedShell runs all test cases in parallel, and all test cases 
> reuse the miniYarnCluster and miniHdfsCluster. When one test case completes, 
> it stops the miniYarnCluster in tearDown, which affects other test cases 
> still running in parallel and appears to cause the timeout.






[jira] [Commented] (YARN-9161) Absolute resources of capacity scheduler doesn't support GPU and FPGA

2019-01-24 Thread Zac Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751938#comment-16751938
 ] 

Zac Zhou commented on YARN-9161:


There are some code conflicts with 
[YARN-9116|https://issues.apache.org/jira/browse/YARN-9116]. I'll submit a 
patch to resolve it shortly.

> Absolute resources of capacity scheduler doesn't support GPU and FPGA
> -
>
> Key: YARN-9161
> URL: https://issues.apache.org/jira/browse/YARN-9161
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-9161.001.patch, YARN-9161.002.patch, 
> YARN-9161.003.patch, YARN-9161.004.patch, YARN-9161.005.patch, 
> YARN-9161.006.patch
>
>
> The enum CapacitySchedulerConfiguration.AbsoluteResourceType has only two 
> elements, memory and vcores, which filters out the absolute resource 
> configuration of gpu and fpga in 
> AbstractCSQueue.updateConfigurableResourceRequirement. 
> This issue prevents gpu and fpga from being allocated correctly.






[jira] [Updated] (YARN-9231) TestDistributedShell fix timeout

2019-01-24 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9231:

Summary: TestDistributedShell fix timeout  (was: TestDistributedShell reuse 
hdfs and yarn cluster for the test cases)

> TestDistributedShell fix timeout
> 
>
> Key: YARN-9231
> URL: https://issues.apache.org/jira/browse/YARN-9231
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: distributed-shell
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> TestDistributedShell takes a lot of time setting up the HDFS and YARN 
> clusters, plus a Thread.sleep(2000) before each test case depending on its 
> TimelineVersion annotation. Reusing the YARN and HDFS clusters (maybe 
> specific to the timeline version) would save a lot of time.






[jira] [Commented] (YARN-9161) Absolute resources of capacity scheduler doesn't support GPU and FPGA

2019-01-24 Thread Zac Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751934#comment-16751934
 ] 

Zac Zhou commented on YARN-9161:


[~sunilg], thanks a lot for your comments.
{quote}I have some doubts here. Do we need this filtering?

If we use the correct resource names in resource-types.xml for gpu and fpga, 
the below code (in CapacitySchedulerConfiguration) could pick them up.
{quote}
I think the filter is needed. The resourceTypes parameter of 
updateResourceValuesFromConfig used to be based on AbsoluteResourceType. Now it 
comes from the following code in the method 
updateConfigurableResourceRequirement of AbstractCSQueue:

{code:java}
Set<String> resources = Arrays.stream(clusterResource.getResources())
    .map(x -> x.getName()).collect(Collectors.toSet());
{code}
With the variable resources, we can get the resources defined in 
resource-types.xml.

If an incorrect resource is specified in capacity-scheduler.xml, 
CapacitySchedulerConfiguration.updateResourceValuesFromConfig ignores it, so 
the scheduler will not use the incorrect resource for container allocation.

Hope that makes it clear.
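For clarity, a hedged sketch of the end-to-end filtering described above (the surrounding method and map are assumptions, not the actual patch; only the stream over cluster resources comes from the code quoted earlier):

{code:java}
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceInformation;

class AbsoluteResourceFilterSketch {
  static void applyConfigured(Resource clusterResource,
      Map<String, Long> configuredAbsoluteValues, Resource minResource) {
    // Names registered via resource-types.xml (reflected in the cluster
    // resource), e.g. memory-mb, vcores, yarn.io/gpu, yarn.io/fpga.
    Set<String> resources = Arrays.stream(clusterResource.getResources())
        .map(ResourceInformation::getName)
        .collect(Collectors.toSet());
    for (Map.Entry<String, Long> e : configuredAbsoluteValues.entrySet()) {
      // Entries whose resource name is not registered are ignored, so the
      // scheduler never allocates against a misspelled or unknown resource.
      if (resources.contains(e.getKey())) {
        minResource.setResourceValue(e.getKey(), e.getValue());
      }
    }
  }
}
{code}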

 

> Absolute resources of capacity scheduler doesn't support GPU and FPGA
> -
>
> Key: YARN-9161
> URL: https://issues.apache.org/jira/browse/YARN-9161
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Zac Zhou
>Assignee: Zac Zhou
>Priority: Major
> Attachments: YARN-9161.001.patch, YARN-9161.002.patch, 
> YARN-9161.003.patch, YARN-9161.004.patch, YARN-9161.005.patch, 
> YARN-9161.006.patch
>
>
> The enum CapacitySchedulerConfiguration.AbsoluteResourceType has only two 
> elements, memory and vcores, which filters out the absolute resource 
> configuration of gpu and fpga in 
> AbstractCSQueue.updateConfigurableResourceRequirement. 
> This issue prevents gpu and fpga from being allocated correctly.






[jira] [Commented] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-01-24 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751925#comment-16751925
 ] 

Bibin A Chundatt commented on YARN-9233:


Thank you [~BilwaST] for raising this issue.

If the container's state transition is from ALLOCATED to FINISHED/KILLED, it is 
not required to report it to the AM.
Counting completed containers for the AM could wait until the AM has acquired 
them.

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>
> After the RM kills an allocated (ALLOCATED state) container for various 
> reasons, it goes through the state transition process to the FINISHED state 
> just like containers in other states. Currently the RM doesn't consider 
> whether the container was acquired by the AM, so all containers transitioned 
> to the FINISHED state are added to the justFinishedContainers list. 
> Therefore a container that was never obtained by the AM but was killed by 
> the RM is also returned through the AM heartbeat. The AM then re-applies for 
> more resources than needed, which can eventually cause the number of 
> containers to exceed the maximum limit.
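A hedged illustration of the fix direction implied above (the acquired-state tracking shown here is an assumption for the sketch, not the actual RM code): only containers the AM has actually pulled should be reported back as just-finished.

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerStatus;

class JustFinishedSketch {
  // Hypothetical: IDs of containers that reached the ACQUIRED state, i.e.
  // were actually handed to the AM in an allocate response.
  private final Set<ContainerId> acquiredByAM = new HashSet<>();
  private final List<ContainerStatus> justFinishedContainers =
      new ArrayList<>();

  void onContainerFinished(ContainerId id, ContainerStatus status) {
    // A container killed while still ALLOCATED was never seen by the AM;
    // reporting it would make the AM request a replacement it doesn't need.
    if (acquiredByAM.contains(id)) {
      justFinishedContainers.add(status);
    }
  }
}
{code}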






[jira] [Updated] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2019-01-24 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-9060:
---
Attachment: YARN-9060-trunk.011.patch

> [YARN-8851] Phase 1 - Support device isolation in native container-executor
> ---
>
> Key: YARN-9060
> URL: https://issues.apache.org/jira/browse/YARN-9060
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-9060-trunk.001.patch, YARN-9060-trunk.002.patch, 
> YARN-9060-trunk.003.patch, YARN-9060-trunk.004.patch, 
> YARN-9060-trunk.005.patch, YARN-9060-trunk.006.patch, 
> YARN-9060-trunk.007.patch, YARN-9060-trunk.008.patch, 
> YARN-9060-trunk.009.patch, YARN-9060-trunk.010.patch, 
> YARN-9060-trunk.011.patch
>
>
> Due to the cgroups v1 implementation policy in the Linux kernel, we cannot 
> update the values of the device cgroups controller unless we have root 
> permission 
> ([here|https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/security/device_cgroup.c#L604]).
> So we need to support this in container-executor for the Java layer to 
> invoke.






[jira] [Commented] (YARN-6539) Create SecureLogin inside Router

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751916#comment-16751916
 ] 

Hadoop QA commented on YARN-6539:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
1s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
34s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 48s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 5 new + 217 unchanged - 0 fixed = 222 total (was 217) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 46s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 57s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
37s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
24s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
40s{color} | {color:green} hadoop-yarn-server-router in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-6539 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12952116/YARN-6359_1.patch |
| Optional 

[jira] [Updated] (YARN-9209) When nodePartition is not set in Placement Constraints, containers are allocated only in default partition

2019-01-24 Thread Tarun Parimi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi updated YARN-9209:
---
Attachment: YARN-9209.001.patch

> When nodePartition is not set in Placement Constraints, containers are 
> allocated only in default partition
> --
>
> Key: YARN-9209
> URL: https://issues.apache.org/jira/browse/YARN-9209
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Tarun Parimi
>Priority: Major
> Attachments: YARN-9209.001.patch
>
>






[jira] [Assigned] (YARN-9209) When nodePartition is not set in Placement Constraints, containers are allocated only in default partition

2019-01-24 Thread Tarun Parimi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi reassigned YARN-9209:
--

Assignee: Tarun Parimi

> When nodePartition is not set in Placement Constraints, containers are 
> allocated only in default partition
> --
>
> Key: YARN-9209
> URL: https://issues.apache.org/jira/browse/YARN-9209
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler
>Affects Versions: 3.1.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-9209.001.patch
>
>






[jira] [Commented] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751915#comment-16751915
 ] 

Hadoop QA commented on YARN-9060:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 38s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  8m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
30s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 31s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 4 unchanged - 0 fixed = 5 total (was 4) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}140m  9s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
49s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}256m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9060 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956245/YARN-9060-trunk.010.patch
 |
| Optional Tests |  dupname  asflicense  compile  cc  

[jira] [Commented] (YARN-8498) Yarn NodeManager OOM Listener Fails Compilation on Ubuntu 18.04

2019-01-24 Thread Vinayakumar B (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751903#comment-16751903
 ] 

Vinayakumar B commented on YARN-8498:
-

Yes. We can get this in.

> Yarn NodeManager OOM Listener Fails Compilation on Ubuntu 18.04
> ---
>
> Key: YARN-8498
> URL: https://issues.apache.org/jira/browse/YARN-8498
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Ayush Saxena
>Priority: Blocker
> Attachments: YARN-8498-02.patch, YARN-8948-01.patch
>
>
> While building this project, I ran into a few compilation errors here. The 
> first one was in this file:
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/impl/oom_listener_main.c
> At the very end, during the compilation of the OOM test, it fails again:
>  
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc:256:7:
>  error: ‘__WAIT_STATUS’ was not declared in this scope
>  __WAIT_STATUS mem_hog_status = {};
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc:257:30:
>  error: ‘mem_hog_status’ was not declared in this scope
>  __pid_t exited0 = wait(mem_hog_status);
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc:275:21:
>  error: expected ‘;’ before ‘oom_listener_status’
>  __WAIT_STATUS oom_listener_status = {};
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc:276:30:
>  error: ‘oom_listener_status’ was not declared in this scope
>  __pid_t exited1 = wait(oom_listener_status);
>  






[jira] [Commented] (YARN-5336) Limit the flow name size & consider cleanup for hex chars

2019-01-24 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751899#comment-16751899
 ] 

Vrushali C commented on YARN-5336:
--

Also, if we are updating the patch, could you add some comments around the 
config variables added in YarnConfiguration.java? 

> Limit the flow name size & consider cleanup for hex chars
> -
>
> Key: YARN-5336
> URL: https://issues.apache.org/jira/browse/YARN-5336
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Sushil Ks
>Priority: Major
>  Labels: YARN-5355
> Attachments: YARN-5336.001.patch, YARN-5336.002.patch
>
>
> As recommended by [~jrottinghuis], we need to add a limit (default and 
> configurable) on the key values accepted for writing to the backend.






[jira] [Commented] (YARN-9150) Making TimelineSchemaCreator support different backends for Timeline Schema Creation in ATSv2

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751894#comment-16751894
 ] 

Hadoop QA commented on YARN-9150:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} YARN-9150 does not apply to branch-2. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9150 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12955192/YARN-9150-branch-2.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23178/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Making TimelineSchemaCreator support different backends for Timeline Schema 
> Creation in ATSv2
> -
>
> Key: YARN-9150
> URL: https://issues.apache.org/jira/browse/YARN-9150
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: ATSv2
>Reporter: Sushil Ks
>Assignee: Sushil Ks
>Priority: Major
> Attachments: YARN-9150-branch-2.patch, YARN-9150.001.patch, 
> YARN-9150.002.patch, jenkins_build.png
>
>
> h3. Currently the TimelineSchemaCreator has a concrete implementation for 
> creating timeline schemas only for HBase; hence this JIRA is for supporting 
> the multiple back-ends that ATSv2 can work with.
> *Usage:*
>    Add the following property in *yarn-site.xml*
> {code:xml}
> <property>
>   <name>yarn.timeline-service.schema-creator.class</name>
>   <value>YOUR_TIMELINE_SCHEMA_CREATOR_CLASS</value>
> </property>
> {code}
>     The command needed to run the TimelineSchemaCreator does not change, 
> i.e. the existing command below can be used irrespective of the backend 
> configured.
> {code:java}
> bin/hadoop 
> org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator 
> -create
> {code}
>  
>  
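For illustration, a hedged sketch of the shape such a pluggable creator might take; the interface name and method signature here are assumptions, not necessarily what the patch defines.

{code:java}
import org.apache.hadoop.conf.Configuration;

/**
 * Hypothetical plug point resolved from
 * yarn.timeline-service.schema-creator.class.
 */
interface TimelineSchemaCreatorSketch {
  // Create the backend-specific ATSv2 schema (HBase tables, SQL DDL, ...),
  // given the CLI arguments (e.g. -create) and the YARN configuration.
  void createTimelineSchema(String[] args, Configuration conf) throws Exception;
}
{code}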






[jira] [Commented] (YARN-5336) Limit the flow name size & consider cleanup for hex chars

2019-01-24 Thread Vrushali C (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751898#comment-16751898
 ] 

Vrushali C commented on YARN-5336:
--

Thanks Sushil! Patch 002 looks better, but there is one issue at L206 in 
TimelineUtils.java.

I was wondering about the following; what do you think?

{code}
if (length <= 0) {
  length = flowName.length();
}
return flowName.substring(0, length);
{code}

Since Java 1.7.0_06, String.substring has linear complexity instead of constant 
complexity. Reference bug: 
https://bugs.java.com/bugdatabase/view_bug.do?bug_id=4513622

So:
- Let's update the code to use StringUtils for the substring.
- Also, when length is <= 0, let's simply return the flowName instead of 
computing the same string again via substring, as sketched below. This will 
help the time performance of the code.
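A minimal sketch of the two suggestions combined, assuming Apache Commons Lang's StringUtils is available on the classpath (as it already is elsewhere in Hadoop):

{code:java}
import org.apache.commons.lang3.StringUtils;

final class FlowNameSketch {
  static String shortenFlowName(String flowName, int length) {
    if (flowName == null || length <= 0) {
      // Return the flow name as-is instead of recomputing it via substring.
      return flowName;
    }
    // StringUtils.substring also clamps length to the string's bounds.
    return StringUtils.substring(flowName, 0, length);
  }
}
{code}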





> Limit the flow name size & consider cleanup for hex chars
> -
>
> Key: YARN-5336
> URL: https://issues.apache.org/jira/browse/YARN-5336
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Vrushali C
>Assignee: Sushil Ks
>Priority: Major
>  Labels: YARN-5355
> Attachments: YARN-5336.001.patch, YARN-5336.002.patch
>
>
> As recommended by [~jrottinghuis], we need to add a limit (default and 
> configurable) on the key values accepted for writing to the backend.






[jira] [Commented] (YARN-7129) Application Catalog for YARN applications

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751897#comment-16751897
 ] 

Hadoop QA commented on YARN-7129:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 16 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications . 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 50s{color} | {color:orange} root: The patch generated 302 new + 0 unchanged 
- 0 fixed = 302 total (was 0) {color} |
| {color:green}+1{color} | {color:green} hadolint {color} | {color:green}  0m  
1s{color} | {color:green} There were no new hadolint issues. {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 0s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:orange}-0{color} | {color:orange} shelldocs {color} | {color:orange}  
0m 15s{color} | {color:orange} The patch generated 160 new + 104 unchanged - 0 
fixed = 264 total (was 104) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m 
13s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog
 . 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-docker
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | 

[jira] [Commented] (YARN-9195) RM Queue's pending container number might get decreased unexpectedly or even become negative once RM failover

2019-01-24 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751887#comment-16751887
 ] 

Wangda Tan commented on YARN-9195:
--

Thanks [~ssy],

Could you rename the patch to YARN-9195.001.patch? (According to 
[https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute].)

And once you upload the patch, you can change the JIRA to "Patch Available" so 
Jenkins will run the UTs.

> RM Queue's pending container number might get decreased unexpectedly or even 
> become negative once RM failover
> -
>
> Key: YARN-9195
> URL: https://issues.apache.org/jira/browse/YARN-9195
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.1.0
>Reporter: Shengyang Sha
>Priority: Critical
> Attachments: 
> cases_to_recreate_negative_pending_requests_scenario.diff, 
> patch.YARN-9195.diff
>
>
> Hi, all:
> We previously encountered a serious problem in ResourceManager: the pending 
> container number of one RM queue became negative after RM failed over. Since 
> queues in RM are managed in a hierarchical structure, the root queue's 
> pending containers eventually became negative as well, and thus the 
> scheduling process of the whole cluster was affected.
> Both our RM server and the AMRM client in our application are based on yarn 
> 3.1, and we use the AMRMClientAsync#addSchedulingRequests() method in our 
> application to request resources from RM.
> After investigation, we found that the direct cause was that the 
> numAllocations of some AMs' requests became negative after RM failed over, 
> and there are at least three necessary conditions:
> (1) The application uses schedulingRequests in the AMRM client and sets the 
> numAllocations of a schedulingRequest to zero. In our batch job scenario, 
> the numAllocations of a schedulingRequest could turn to zero because 
> theoretically we can run a full batch job using only one container.
> (2) RM fails over.
> (3) Before the AM re-registers itself to RM after the RM restart, RM has 
> already recovered some of the application's previously assigned containers.
> Here are some more details about the implementation:
> (1) After RM recovers, RM will send all alive containers to the AM once it 
> re-registers itself, through 
> RegisterApplicationMasterResponse#getContainersFromPreviousAttempts.
> (2) During registerApplicationMaster, AMRMClientImpl will 
> removeFromOutstandingSchedulingRequests once the AM gets 
> ContainersFromPreviousAttempts, without checking whether these containers 
> have been assigned before. As a consequence, its outstanding requests might 
> be decreased unexpectedly even if they do not become negative.
> (3) There is no sanity check in RM to validate requests from AMs.
> To better illustrate this case, I've written a test case based on the latest 
> hadoop trunk, posted in the attachment. You may try the cases 
> testAMRMClientWithNegativePendingRequestsOnRMRestart and 
> testAMRMClientOnUnexpectedlyDecreasedPendingRequestsOnRMRestart.
> To solve this issue, I propose to filter allocated containers before 
> removeFromOutstandingSchedulingRequests in AMRMClientImpl during 
> registerApplicationMaster; some sanity checks are also needed to prevent 
> things from getting worse.
> More comments and suggestions are welcome.
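
For illustration, a rough sketch of the filtering proposed above. The class, 
field, and method names below are hypothetical stand-ins for AMRMClientImpl's 
internal state, not the actual patch:

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: container IDs are modeled as plain strings instead
// of org.apache.hadoop.yarn.api.records.ContainerId to keep it standalone.
public class RecoveredContainerFilter {

  // Stand-in for the client's record of containers already accounted for.
  private final Set<String> previouslyAssigned = new HashSet<>();

  /**
   * Returns only the containers from previous attempts that have not been
   * seen before; only these should be passed on to
   * removeFromOutstandingSchedulingRequests, so outstanding asks are never
   * decremented twice for the same container.
   */
  public List<String> filterAlreadyAssigned(List<String> fromPreviousAttempts) {
    List<String> newlyRecovered = new ArrayList<>();
    for (String containerId : fromPreviousAttempts) {
      if (previouslyAssigned.add(containerId)) {
        newlyRecovered.add(containerId);
      }
    }
    return newlyRecovered;
  }
}
{code}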



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9195) RM Queue's pending container number might get decreased unexpectedly or even become negative once RM failover

2019-01-24 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751889#comment-16751889
 ] 

Wangda Tan commented on YARN-9195:
--

[~ssy] Added you to the contributor list so you can assign JIRAs to yourself 
in the future.

> RM Queue's pending container number might get decreased unexpectedly or even 
> become negative once RM failover
> -
>
> Key: YARN-9195
> URL: https://issues.apache.org/jira/browse/YARN-9195
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.1.0
>Reporter: Shengyang Sha
>Assignee: Shengyang Sha
>Priority: Critical
> Attachments: 
> cases_to_recreate_negative_pending_requests_scenario.diff, 
> patch.YARN-9195.diff
>
>
> Hi, all:
> We previously encountered a serious problem in ResourceManager: the pending 
> container number of one RM queue became negative after RM failed over. Since 
> queues in RM are managed in a hierarchical structure, the root queue's 
> pending containers eventually became negative as well, and thus the 
> scheduling process of the whole cluster was affected.
> Both our RM server and the AMRM client in our application are based on yarn 
> 3.1, and we use the AMRMClientAsync#addSchedulingRequests() method in our 
> application to request resources from RM.
> After investigation, we found that the direct cause was that the 
> numAllocations of some AMs' requests became negative after RM failed over, 
> and there are at least three necessary conditions:
> (1) The application uses schedulingRequests in the AMRM client and sets the 
> numAllocations of a schedulingRequest to zero. In our batch job scenario, 
> the numAllocations of a schedulingRequest could turn to zero because 
> theoretically we can run a full batch job using only one container.
> (2) RM fails over.
> (3) Before the AM re-registers itself to RM after the RM restart, RM has 
> already recovered some of the application's previously assigned containers.
> Here are some more details about the implementation:
> (1) After RM recovers, RM will send all alive containers to the AM once it 
> re-registers itself, through 
> RegisterApplicationMasterResponse#getContainersFromPreviousAttempts.
> (2) During registerApplicationMaster, AMRMClientImpl will 
> removeFromOutstandingSchedulingRequests once the AM gets 
> ContainersFromPreviousAttempts, without checking whether these containers 
> have been assigned before. As a consequence, its outstanding requests might 
> be decreased unexpectedly even if they do not become negative.
> (3) There is no sanity check in RM to validate requests from AMs.
> To better illustrate this case, I've written a test case based on the latest 
> hadoop trunk, posted in the attachment. You may try the cases 
> testAMRMClientWithNegativePendingRequestsOnRMRestart and 
> testAMRMClientOnUnexpectedlyDecreasedPendingRequestsOnRMRestart.
> To solve this issue, I propose to filter allocated containers before 
> removeFromOutstandingSchedulingRequests in AMRMClientImpl during 
> registerApplicationMaster; some sanity checks are also needed to prevent 
> things from getting worse.
> More comments and suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9195) RM Queue's pending container number might get decreased unexpectedly or even become negative once RM failover

2019-01-24 Thread Wangda Tan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reassigned YARN-9195:


Assignee: Shengyang Sha

> RM Queue's pending container number might get decreased unexpectedly or even 
> become negative once RM failover
> -
>
> Key: YARN-9195
> URL: https://issues.apache.org/jira/browse/YARN-9195
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.1.0
>Reporter: Shengyang Sha
>Assignee: Shengyang Sha
>Priority: Critical
> Attachments: 
> cases_to_recreate_negative_pending_requests_scenario.diff, 
> patch.YARN-9195.diff
>
>
> Hi, all:
> We previously encountered a serious problem in ResourceManager: the pending 
> container number of one RM queue became negative after RM failed over. Since 
> queues in RM are managed in a hierarchical structure, the root queue's 
> pending containers eventually became negative as well, and thus the 
> scheduling process of the whole cluster was affected.
> Both our RM server and the AMRM client in our application are based on yarn 
> 3.1, and we use the AMRMClientAsync#addSchedulingRequests() method in our 
> application to request resources from RM.
> After investigation, we found that the direct cause was that the 
> numAllocations of some AMs' requests became negative after RM failed over, 
> and there are at least three necessary conditions:
> (1) The application uses schedulingRequests in the AMRM client and sets the 
> numAllocations of a schedulingRequest to zero. In our batch job scenario, 
> the numAllocations of a schedulingRequest could turn to zero because 
> theoretically we can run a full batch job using only one container.
> (2) RM fails over.
> (3) Before the AM re-registers itself to RM after the RM restart, RM has 
> already recovered some of the application's previously assigned containers.
> Here are some more details about the implementation:
> (1) After RM recovers, RM will send all alive containers to the AM once it 
> re-registers itself, through 
> RegisterApplicationMasterResponse#getContainersFromPreviousAttempts.
> (2) During registerApplicationMaster, AMRMClientImpl will 
> removeFromOutstandingSchedulingRequests once the AM gets 
> ContainersFromPreviousAttempts, without checking whether these containers 
> have been assigned before. As a consequence, its outstanding requests might 
> be decreased unexpectedly even if they do not become negative.
> (3) There is no sanity check in RM to validate requests from AMs.
> To better illustrate this case, I've written a test case based on the latest 
> hadoop trunk, posted in the attachment. You may try the cases 
> testAMRMClientWithNegativePendingRequestsOnRMRestart and 
> testAMRMClientOnUnexpectedlyDecreasedPendingRequestsOnRMRestart.
> To solve this issue, I propose to filter allocated containers before 
> removeFromOutstandingSchedulingRequests in AMRMClientImpl during 
> registerApplicationMaster; some sanity checks are also needed to prevent 
> things from getting worse.
> More comments and suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9195) RM Queue's pending container number might get decreased unexpectedly or even become negative once RM failover

2019-01-24 Thread Shengyang Sha (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengyang Sha updated YARN-9195:

Attachment: patch.YARN-9195.diff

> RM Queue's pending container number might get decreased unexpectedly or even 
> become negative once RM failover
> -
>
> Key: YARN-9195
> URL: https://issues.apache.org/jira/browse/YARN-9195
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.1.0
>Reporter: Shengyang Sha
>Priority: Critical
> Attachments: 
> cases_to_recreate_negative_pending_requests_scenario.diff, 
> patch.YARN-9195.diff
>
>
> Hi, all:
> We previously encountered a serious problem in ResourceManager: the pending 
> container number of one RM queue became negative after RM failed over. Since 
> queues in RM are managed in a hierarchical structure, the root queue's 
> pending containers eventually became negative as well, and thus the 
> scheduling process of the whole cluster was affected.
> Both our RM server and the AMRM client in our application are based on yarn 
> 3.1, and we use the AMRMClientAsync#addSchedulingRequests() method in our 
> application to request resources from RM.
> After investigation, we found that the direct cause was that the 
> numAllocations of some AMs' requests became negative after RM failed over, 
> and there are at least three necessary conditions:
> (1) The application uses schedulingRequests in the AMRM client and sets the 
> numAllocations of a schedulingRequest to zero. In our batch job scenario, 
> the numAllocations of a schedulingRequest could turn to zero because 
> theoretically we can run a full batch job using only one container.
> (2) RM fails over.
> (3) Before the AM re-registers itself to RM after the RM restart, RM has 
> already recovered some of the application's previously assigned containers.
> Here are some more details about the implementation:
> (1) After RM recovers, RM will send all alive containers to the AM once it 
> re-registers itself, through 
> RegisterApplicationMasterResponse#getContainersFromPreviousAttempts.
> (2) During registerApplicationMaster, AMRMClientImpl will 
> removeFromOutstandingSchedulingRequests once the AM gets 
> ContainersFromPreviousAttempts, without checking whether these containers 
> have been assigned before. As a consequence, its outstanding requests might 
> be decreased unexpectedly even if they do not become negative.
> (3) There is no sanity check in RM to validate requests from AMs.
> To better illustrate this case, I've written a test case based on the latest 
> hadoop trunk, posted in the attachment. You may try the cases 
> testAMRMClientWithNegativePendingRequestsOnRMRestart and 
> testAMRMClientOnUnexpectedlyDecreasedPendingRequestsOnRMRestart.
> To solve this issue, I propose to filter allocated containers before 
> removeFromOutstandingSchedulingRequests in AMRMClientImpl during 
> registerApplicationMaster; some sanity checks are also needed to prevent 
> things from getting worse.
> More comments and suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9188) Port YARN-7136 to branch-2

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751886#comment-16751886
 ] 

Hadoop QA commented on YARN-9188:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-8200 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
57s{color} | {color:green} YARN-8200 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
17s{color} | {color:green} YARN-8200 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} YARN-8200 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  6m  
6s{color} | {color:green} YARN-8200 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
11s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 in YARN-8200 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
12s{color} | {color:green} YARN-8200 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
33s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 57s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 15 new + 413 unchanged - 10 fixed = 428 total (was 423) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  6m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
24s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 14s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
38s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
2s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | 

[jira] [Commented] (YARN-9222) Print launchTime in ApplicationSummary

2019-01-24 Thread Keqiu Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751860#comment-16751860
 ] 

Keqiu Hu commented on YARN-9222:


+1, lgtm, straightforward.

> Print launchTime in ApplicationSummary
> --
>
> Key: YARN-9222
> URL: https://issues.apache.org/jira/browse/YARN-9222
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9222.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9086) [CSI] Run csi-driver-adaptor as aux service

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751878#comment-16751878
 ] 

Hadoop QA commented on YARN-9086:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 41s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 213 unchanged - 0 fixed = 215 total (was 213) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 13s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
48s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
28s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
40s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}104m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9086 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956260/YARN-9086.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 0ff6738818aa 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 
31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 

[jira] [Commented] (YARN-9222) Print launchTime in ApplicationSummary

2019-01-24 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751851#comment-16751851
 ] 

Jonathan Hung commented on YARN-9222:
-

TestCapacitySchedulerMetrics.testCSMetrics passes locally for me

> Print launchTime in ApplicationSummary
> --
>
> Key: YARN-9222
> URL: https://issues.apache.org/jira/browse/YARN-9222
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9222.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9222) Print launchTime in ApplicationSummary

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751844#comment-16751844
 ] 

Hadoop QA commented on YARN-9222:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 32s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestCapacitySchedulerMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9222 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956230/YARN-9222.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 2b6193edac8d 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a33ef4f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/23173/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23173/testReport/ |
| Max. process+thread count | 881 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Commented] (YARN-6616) YARN AHS shows submitTime for jobs same as startTime

2019-01-24 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751829#comment-16751829
 ] 

Prabhu Joseph commented on YARN-6616:
-

Thanks [~eepayne] for the review. The submitTime is different from launchTime: 
submitTime is the time the job is submitted to RM, whereas launchTime is the 
time the first attempt (the AM) is launched.

{code}
1. SubmitTime is the time submitApplication is received by RM
(ClientRMService) from YarnClient. ClientRMService sets submitTime to the
system time:

  // call RMAppManager to submit application directly
  rmAppManager.submitApplication(submissionContext,
      System.currentTimeMillis(), user);

2. StartTime is set by RMAppImpl after ClientRMService validates the
SUBMIT_APPLICATIONS ACL against the queue. RMAppImpl sets startTime to the
system time:

  if (startTime <= 0) {
    this.startTime = this.systemClock.getTime();
  }

3. LaunchTime is set when the first attempt (the AM) is launched. RMAppImpl
sets it to the event timestamp during AttemptLaunchedTransition:

  app.launchTime = event.getTimestamp();
{code}

So the expected ordering is:

submitTime <= startTime <= launchTime <= finishTime
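
A tiny self-contained check of that ordering, using made-up sample timestamps 
(the values are illustrative, not taken from a real application):

{code:java}
// Illustrative only: four hypothetical epoch-millisecond timestamps for one
// application, validated against the ordering described above.
public class AppTimestampOrdering {
  public static void main(String[] args) {
    long submitTime = 1548374000000L; // submitApplication received by RM
    long startTime  = 1548374000250L; // RMAppImpl start after ACL validation
    long launchTime = 1548374002000L; // first AM attempt launched
    long finishTime = 1548374060000L; // application finished

    if (!(submitTime <= startTime && startTime <= launchTime
        && launchTime <= finishTime)) {
      throw new IllegalStateException("timestamp ordering violated");
    }
    System.out.println("submitTime <= startTime <= launchTime <= finishTime");
  }
}
{code}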

> YARN AHS shows submitTime for jobs same as startTime
> 
>
> Key: YARN-6616
> URL: https://issues.apache.org/jira/browse/YARN-6616
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: 0001-YARN-6616.patch, 0002-YARN-6616.patch, 
> 0003-YARN-6616.patch, 0004-YARN-6616.patch, 0005-YARN-6616.patch, 
> 0006-YARN-6616.patch, 0007-YARN-6616.patch, 0008-YARN-6616.patch, 
> ApplicationReport.poc
>
>
> YARN AHS returns the startTime value for both submitTime and startTime for 
> the jobs. It looks like the code sets the submitTime to the startTime value. 
> https://github.com/apache/hadoop/blob/branch-2.7.3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/dao/AppInfo.java#L80
> {code}
> curl --negotiate -u: 
> http://prabhuzeppelin3.openstacklocal:8188/ws/v1/applicationhistory/apps
> 1495015537574 1495015537574 1495016384084
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-01-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751820#comment-16751820
 ] 

Eric Yang commented on YARN-7848:
-

[~ebadger] YARN-9074 does not force removal of the container. Hence, this is 
still a valid issue to be addressed.

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Zhaohui Xin
>Priority: Major
>  Labels: Docker
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9086) [CSI] Run csi-driver-adaptor as aux service

2019-01-24 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-9086:
--
Attachment: YARN-9086.003.patch

> [CSI] Run csi-driver-adaptor as aux service
> ---
>
> Key: YARN-9086
> URL: https://issues.apache.org/jira/browse/YARN-9086
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: CSI
> Attachments: YARN-9086.001.patch, YARN-9086.002.patch, 
> YARN-9086.003.patch
>
>
> Since the csi-driver-adaptor's runtime depends on protobuf3, we need to run 
> it with a separate class loader. Aux services provide such an ability; this 
> ticket tracks the effort to run the adaptors as NM aux services.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2019-01-24 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-9060:
---
Attachment: YARN-9060-trunk.010.patch

> [YARN-8851] Phase 1 - Support device isolation in native container-executor
> ---
>
> Key: YARN-9060
> URL: https://issues.apache.org/jira/browse/YARN-9060
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-9060-trunk.001.patch, YARN-9060-trunk.002.patch, 
> YARN-9060-trunk.003.patch, YARN-9060-trunk.004.patch, 
> YARN-9060-trunk.005.patch, YARN-9060-trunk.006.patch, 
> YARN-9060-trunk.007.patch, YARN-9060-trunk.008.patch, 
> YARN-9060-trunk.009.patch, YARN-9060-trunk.010.patch
>
>
> Due to the cgroups v1 implementation policy in the Linux kernel, we cannot 
> update the value of the devices cgroup controller unless we have root 
> permission 
> ([here|https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/security/device_cgroup.c#L604]).
>  So we need to support this in container-executor for the Java layer to 
> invoke.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9188) Port YARN-7136 to branch-2

2019-01-24 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751784#comment-16751784
 ] 

Jonathan Hung commented on YARN-9188:
-

002 fixes the findbugs error.

> Port YARN-7136 to branch-2
> --
>
> Key: YARN-9188
> URL: https://issues.apache.org/jira/browse/YARN-9188
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9188-YARN-8200.001.patch, 
> YARN-9188-YARN-8200.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9188) Port YARN-7136 to branch-2

2019-01-24 Thread Jonathan Hung (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9188:

Attachment: YARN-9188-YARN-8200.002.patch

> Port YARN-7136 to branch-2
> --
>
> Key: YARN-9188
> URL: https://issues.apache.org/jira/browse/YARN-9188
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9188-YARN-8200.001.patch, 
> YARN-9188-YARN-8200.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6616) YARN AHS shows submitTime for jobs same as startTime

2019-01-24 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751752#comment-16751752
 ] 

Eric Payne commented on YARN-6616:
--

[~Prabhu Joseph], Thanks for the hard work on this issue.

I'm sorry for changing my mind, but it seems that YARN-7088 added 
{{launchTime}} for the same purpose you are adding {{submitTime}}. So, I think 
they are the same value. Is that your understanding? Also, it seems that 
YARN-8218 was also created to solve the problem addressed by this JIRA.

If {{launchTime}} and {{submitTime}} are the same, then this fix becomes much 
simpler:
 - {{ApplicationReport}} does not need to be changed
 - No proto changes are necessary :)
 - The call to {{ApplicationReport.newInstance(...)}} in 
{{ApplicationHistoryManagerImpl#convertToApplicationReport}} could look like 
this:
{code:java}
-  trackingUrl, appHistory.getStartTime(), 0, appHistory.getFinishTime(),
+  trackingUrl, appHistory.getStartTime(), appHistory.getSubmitTime(), 
appHistory.getFinishTime(),
{code}

 - The 2 calls to {{ApplicationReport.newInstance(...)}} in 
{{ApplicationHistoryManagerOnTimelineStore#convertToApplicationReport}} could 
look like this:
{code:java}
-state, diagnosticsInfo, null, createdTime, finishedTime,
+state, diagnosticsInfo, null, createdTime, submitTime, 
finishedTime,
{code}

Thoughts?

> YARN AHS shows submitTime for jobs same as startTime
> 
>
> Key: YARN-6616
> URL: https://issues.apache.org/jira/browse/YARN-6616
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: 0001-YARN-6616.patch, 0002-YARN-6616.patch, 
> 0003-YARN-6616.patch, 0004-YARN-6616.patch, 0005-YARN-6616.patch, 
> 0006-YARN-6616.patch, 0007-YARN-6616.patch, 0008-YARN-6616.patch, 
> ApplicationReport.poc
>
>
> YARN AHS returns the startTime value for both submitTime and startTime for 
> the jobs. It looks like the code sets the submitTime to the startTime value. 
> https://github.com/apache/hadoop/blob/branch-2.7.3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/dao/AppInfo.java#L80
> {code}
> curl --negotiate -u: 
> http://prabhuzeppelin3.openstacklocal:8188/ws/v1/applicationhistory/apps
> 1495015537574 1495015537574 1495016384084
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9222) Print launchTime in ApplicationSummary

2019-01-24 Thread Jonathan Hung (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9222:

Attachment: YARN-9222.001.patch

> Print launchTime in ApplicationSummary
> --
>
> Key: YARN-9222
> URL: https://issues.apache.org/jira/browse/YARN-9222
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9222.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9222) Print launchTime in ApplicationSummary

2019-01-24 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751747#comment-16751747
 ] 

Jonathan Hung commented on YARN-9222:
-

Uploaded 001, which prints launchTime in the app summary.

> Print launchTime in ApplicationSummary
> --
>
> Key: YARN-9222
> URL: https://issues.apache.org/jira/browse/YARN-9222
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9222.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9221) Add a flag to enable dynamic auxiliary service feature

2019-01-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751710#comment-16751710
 ] 

Eric Yang edited comment on YARN-9221 at 1/25/19 12:29 AM:
---

[~billie.rinaldi] Patch 2 contains an error for YarnConfiguration class javadoc.

{code}
[ERROR] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java:2251:
 error: malformed HTML
[ERROR]* found (<= 0 means that the file will not be checked for 
modifications
{code}

When yarn.nodemanager.aux-services.manifest.enabled is disabled, the REST API 
responds with 404 Not Found when attempting to update an auxiliary service. I 
think the desired response for a disabled feature is probably 400 Bad Request 
or 409 Conflict in this case (see the sketch below).
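
For illustration, a minimal JAX-RS-style sketch of that behavior; the resource 
path, flag wiring, and handler shape are assumptions for this sketch, not the 
actual NM web service code:

{code:java}
import javax.ws.rs.PUT;
import javax.ws.rs.Path;
import javax.ws.rs.core.Response;

// Hypothetical resource: manifestEnabled stands in for reading
// yarn.nodemanager.aux-services.manifest.enabled from the NM configuration.
@Path("/node/auxiliaryservices")
public class AuxServicesManifestResource {

  private final boolean manifestEnabled;

  public AuxServicesManifestResource(boolean manifestEnabled) {
    this.manifestEnabled = manifestEnabled;
  }

  @PUT
  public Response updateManifest(String manifest) {
    if (!manifestEnabled) {
      // Feature switched off: reject explicitly instead of returning 404,
      // so clients can tell "disabled" apart from "no such endpoint".
      return Response.status(Response.Status.BAD_REQUEST)
          .entity("dynamic aux services are disabled").build();
    }
    // ... apply the manifest here ...
    return Response.ok().build();
  }
}
{code}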


was (Author: eyang):
[~billie.rinaldi] Patch 2 contains an error for YarnConfiguration class javadoc.

{code}
[ERROR] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java:2251:
 error: malformed HTML
[ERROR]* found (<= 0 means that the file will not be checked for 
modifications
{code}

> Add a flag to enable dynamic auxiliary service feature
> --
>
> Key: YARN-9221
> URL: https://issues.apache.org/jira/browse/YARN-9221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-9221.01.patch, YARN-9221.02.patch
>
>
> The dynamic auxiliary service feature enables the ability to reconfigure 
> YARN auxiliary services on demand. This feature is optional, and it would be 
> nice for it to be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7088) Add application launch time to Resource Manager REST API

2019-01-24 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751737#comment-16751737
 ] 

Jonathan Hung commented on YARN-7088:
-

Backported to branch-3.1/branch-3.0/branch-2/branch-2.9. Thanks Arun for the 
review.

> Add application launch time to Resource Manager REST API
> 
>
> Key: YARN-7088
> URL: https://issues.apache.org/jira/browse/YARN-7088
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Kanwaljeet Sachdev
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7088-branch-2.001.patch, 
> YARN-7088-branch-3.0.001.patch, YARN-7088.001.patch, YARN-7088.002.patch, 
> YARN-7088.003.patch, YARN-7088.004.patch, YARN-7088.005.patch, 
> YARN-7088.006.patch, YARN-7088.007.patch, YARN-7088.008.patch, 
> YARN-7088.009.patch, YARN-7088.010.patch, YARN-7088.011.patch, 
> YARN-7088.012.patch, YARN-7088.013.patch, YARN-7088.014.patch, 
> YARN-7088.015.patch, YARN-7088.016.patch, YARN-7088.017.patch
>
>
> Currently, the start time in the old and new UI actually shows the app 
> submission time. There should be two different fields: one for the app's 
> submission and one for its launch, as well as the elapsed pending time 
> between the two.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8867) Retrieve the status of resource localization

2019-01-24 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751714#comment-16751714
 ] 

Hudson commented on YARN-8867:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15824 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15824/])
YARN-8867. Added resource localization status to YARN service status (eyang: 
rev a33ef4fd311784dc15401eb54c82e78528c4f961)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerResourceIncreaseRPC.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ProtoUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ContainerManagementProtocol.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/containermanagement_protocol.proto
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/provider/AbstractProviderService.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/api/records/LocalizationStatus.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceSet.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/provider/ProviderService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLocalizationStatusesRequest.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/LocalizationState.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/NodeManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ContainerManagementProtocolPBClientImpl.java
* (edit) 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/launcher/TestContainerLauncher.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/containerlaunch/ContainerLaunchService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/MockContainer.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/MockRunningServiceContext.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/ServiceTestUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/MockServiceAM.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/service/ContainerManagementProtocolPBServiceImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/instance/ComponentInstance.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/Component.java
* (add) 

[jira] [Commented] (YARN-9221) Add a flag to enable dynamic auxiliary service feature

2019-01-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751710#comment-16751710
 ] 

Eric Yang commented on YARN-9221:
-

[~billie.rinaldi] Patch 2 contains a javadoc error in the YarnConfiguration class:

{code}
[ERROR] 
/home/eyang/test/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java:2251:
 error: malformed HTML
[ERROR]* found (<= 0 means that the file will not be checked for 
modifications
{code}
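
For context, javadoc flags a bare {{<}} in a doc comment as malformed HTML because it is parsed as the start of a tag. A minimal sketch of the usual fix, assuming the offending comment looks roughly like the quoted line (the exact wording in the patch may differ):

{code:java}
// Malformed: the bare "<=" opens what javadoc parses as an unclosed HTML tag.
/**
 * Interval at which the file is checked for modifications
 * (<= 0 means that the file will not be checked for modifications).
 */

// Fixed: escape the comparison, e.g. with {@literal} (or the &lt;= entity).
/**
 * Interval at which the file is checked for modifications
 * ({@literal <=} 0 means that the file will not be checked for modifications).
 */
{code}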

> Add a flag to enable dynamic auxiliary service feature
> --
>
> Key: YARN-9221
> URL: https://issues.apache.org/jira/browse/YARN-9221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-9221.01.patch, YARN-9221.02.patch
>
>
> The dynamic auxiliary service feature enables reconfiguring YARN auxiliary 
> services on demand.  This feature is optional, and it would be nice to have 
> it disabled by default.
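
As a concrete illustration, such a flag would live in yarn-site.xml along the lines of the snippet below; the property name here is an assumption for illustration, not a name confirmed in this thread:

{code:xml}
<!-- Hypothetical property name; the actual key is defined by the patch. -->
<property>
  <name>yarn.nodemanager.aux-services.manifest.enabled</name>
  <value>false</value>
</property>
{code}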



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9221) Add a flag to enable dynamic auxiliary service feature

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751707#comment-16751707
 ] 

Hadoop QA commented on YARN-9221:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
10s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 36s{color} | {color:orange} root: The patch generated 1 new + 292 unchanged 
- 0 fixed = 293 total (was 292) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
49s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
26s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
40s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
32s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | 

[jira] [Updated] (YARN-7129) Application Catalog for YARN applications

2019-01-24 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-7129:

Attachment: YARN-7129.017.patch

> Application Catalog for YARN applications
> -
>
> Key: YARN-7129
> URL: https://issues.apache.org/jira/browse/YARN-7129
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: applications
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN Appstore.pdf, YARN-7129.001.patch, 
> YARN-7129.002.patch, YARN-7129.003.patch, YARN-7129.004.patch, 
> YARN-7129.005.patch, YARN-7129.006.patch, YARN-7129.007.patch, 
> YARN-7129.008.patch, YARN-7129.009.patch, YARN-7129.010.patch, 
> YARN-7129.011.patch, YARN-7129.012.patch, YARN-7129.013.patch, 
> YARN-7129.014.patch, YARN-7129.015.patch, YARN-7129.016.patch, 
> YARN-7129.017.patch
>
>
> YARN native services provides a web services API to improve the usability of 
> application deployment on Hadoop using a collection of Docker images.  It 
> would be nice to have an application catalog system which provides an 
> editorial and search interface for YARN applications.  This improves the 
> usability of YARN for managing the life cycle of applications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8867) Retrieve the status of resource localization

2019-01-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751698#comment-16751698
 ] 

Eric Yang commented on YARN-8867:
-

+1 for patch 008.  Committing to trunk.

> Retrieve the status of resource localization
> 
>
> Key: YARN-8867
> URL: https://issues.apache.org/jira/browse/YARN-8867
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8867.001.patch, YARN-8867.002.patch, 
> YARN-8867.003.patch, YARN-8867.004.patch, YARN-8867.005.patch, 
> YARN-8867.006.patch, YARN-8867.007.patch, YARN-8867.008.patch, 
> YARN-8867.wip.patch
>
>
> Refer to YARN-3854.
> Currently the NM does not have an API to retrieve the status of localization. 
> Unless the client can know when the localization of a resource is complete, 
> irrespective of the type of the resource, it cannot take any appropriate 
> action. 
> We need an API in {{ContainerManagementProtocol}} to retrieve the status of 
> localization.
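
Going by the files touched in the commit list earlier in this digest (GetLocalizationStatusesRequest, LocalizationState), a client-side sketch of how such an API might be invoked follows; the method and record names are assumptions for illustration, not the committed signatures:

{code:java}
// Hypothetical sketch: ask the NM for the localization statuses of a container.
GetLocalizationStatusesRequest request =
    GetLocalizationStatusesRequest.newInstance(
        Collections.singletonList(containerId));
GetLocalizationStatusesResponse response =
    containerManager.getLocalizationStatuses(request);
for (LocalizationStatus status :
    response.getLocalizationStatuses().get(containerId)) {
  // A LocalizationState such as PENDING, COMPLETED, or FAILED per resource.
  LOG.info(status.getResourceKey() + " -> " + status.getLocalizationState());
}
{code}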



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8867) Retrieve the status of resource localization

2019-01-24 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751681#comment-16751681
 ] 

Chandni Singh commented on YARN-8867:
-

Ran these unit tests a couple of times locally and they do pass.

> Retrieve the status of resource localization
> 
>
> Key: YARN-8867
> URL: https://issues.apache.org/jira/browse/YARN-8867
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8867.001.patch, YARN-8867.002.patch, 
> YARN-8867.003.patch, YARN-8867.004.patch, YARN-8867.005.patch, 
> YARN-8867.006.patch, YARN-8867.007.patch, YARN-8867.008.patch, 
> YARN-8867.wip.patch
>
>
> Refer to YARN-3854.
> Currently the NM does not have an API to retrieve the status of localization. 
> Unless the client can know when the localization of a resource is complete, 
> irrespective of the type of the resource, it cannot take any appropriate 
> action. 
> We need an API in {{ContainerManagementProtocol}} to retrieve the status of 
> localization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8867) Retrieve the status of resource localization

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751656#comment-16751656
 ] 

Hadoop QA commented on YARN-8867:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 17 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  8m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 15m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 3s{color} | {color:green} root: The patch generated 0 new + 584 unchanged - 1 
fixed = 584 total (was 585) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  2s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 10m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
53s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
41s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
50s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 50s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 56s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m  
8s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
27s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
28s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | 

[jira] [Assigned] (YARN-7136) Additional Performance Improvement for Resource Profile Feature

2019-01-24 Thread Jonathan Hung (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung reassigned YARN-7136:
---

Assignee: Jonathan Hung  (was: Wangda Tan)

> Additional Performance Improvement for Resource Profile Feature
> ---
>
> Key: YARN-7136
> URL: https://issues.apache.org/jira/browse/YARN-7136
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: Jonathan Hung
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: YARN-7136.001.patch, YARN-7136.YARN-3926.001.patch, 
> YARN-7136.YARN-3926.002.patch, YARN-7136.YARN-3926.003.patch, 
> YARN-7136.YARN-3926.004.patch, YARN-7136.YARN-3926.005.patch, 
> YARN-7136.YARN-3926.006.patch, YARN-7136.YARN-3926.007.patch, 
> YARN-7136.YARN-3926.008.patch, YARN-7136.YARN-3926.009.patch, 
> YARN-7136.YARN-3926.010.patch, YARN-7136.YARN-3926.011.patch, 
> YARN-7136.YARN-3926.012.patch, YARN-7136.YARN-3926.013.patch, 
> YARN-7136.YARN-3926.014.patch, YARN-7136.YARN-3926.015.patch, 
> YARN-7136.YARN-3926.016.patch, YARN-7136.branch-3.0.001.patch
>
>
> This JIRA plans to add the following misc perf improvements:
> 1) Use a final int in Resources/ResourceCalculator to cache 
> #known-resource-types. (Significant improvement).
> 2) Catch Java's ArrayIndexOutOfBoundsException instead of checking 
> array.length every time, as sketched below. (Significant improvement).
> 3) Avoid setUnit validation (which is a HashSet lookup) when initializing the 
> default Memory/VCores ResourceInformation. (Significant improvement).
> 4) Avoid unnecessarily looping over the array in Resource#toString/hashCode. 
> (Some improvement).
> 5) Remove readOnlyResources in BaseResource. (Minor improvement).
> 6) Remove the MandatoryResources enum and use a final integer instead. (Minor 
> improvement).
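
To make item 2 concrete, here is a minimal sketch of the pattern under the stated assumption that lookups vastly outnumber resizes; this is illustrative, not the actual Resources code:

{code:java}
// Hot path: rely on the JVM's implicit bounds check instead of an explicit
// array.length comparison on every lookup.
static ResourceInformation get(ResourceInformation[] resources, int index) {
  try {
    return resources[index];
  } catch (ArrayIndexOutOfBoundsException e) {
    // Rare path: unknown or changed resource type; surface a clear error.
    throw new IllegalArgumentException("Unknown resource index " + index, e);
  }
}
{code}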



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7136) Additional Performance Improvement for Resource Profile Feature

2019-01-24 Thread Jonathan Hung (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung reassigned YARN-7136:
---

Assignee: Wangda Tan  (was: Jonathan Hung)

> Additional Performance Improvement for Resource Profile Feature
> ---
>
> Key: YARN-7136
> URL: https://issues.apache.org/jira/browse/YARN-7136
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Critical
> Fix For: 3.0.0, 3.1.0
>
> Attachments: YARN-7136.001.patch, YARN-7136.YARN-3926.001.patch, 
> YARN-7136.YARN-3926.002.patch, YARN-7136.YARN-3926.003.patch, 
> YARN-7136.YARN-3926.004.patch, YARN-7136.YARN-3926.005.patch, 
> YARN-7136.YARN-3926.006.patch, YARN-7136.YARN-3926.007.patch, 
> YARN-7136.YARN-3926.008.patch, YARN-7136.YARN-3926.009.patch, 
> YARN-7136.YARN-3926.010.patch, YARN-7136.YARN-3926.011.patch, 
> YARN-7136.YARN-3926.012.patch, YARN-7136.YARN-3926.013.patch, 
> YARN-7136.YARN-3926.014.patch, YARN-7136.YARN-3926.015.patch, 
> YARN-7136.YARN-3926.016.patch, YARN-7136.branch-3.0.001.patch
>
>
> This JIRA plans to add the following misc perf improvements:
> 1) Use a final int in Resources/ResourceCalculator to cache 
> #known-resource-types. (Significant improvement).
> 2) Catch Java's ArrayIndexOutOfBoundsException instead of checking 
> array.length every time. (Significant improvement).
> 3) Avoid setUnit validation (which is a HashSet lookup) when initializing the 
> default Memory/VCores ResourceInformation. (Significant improvement).
> 4) Avoid unnecessarily looping over the array in Resource#toString/hashCode. 
> (Some improvement).
> 5) Remove readOnlyResources in BaseResource. (Minor improvement).
> 6) Remove the MandatoryResources enum and use a final integer instead. (Minor 
> improvement).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9188) Port YARN-7136 to branch-2

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751624#comment-16751624
 ] 

Hadoop QA commented on YARN-9188:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-8200 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
45s{color} | {color:green} YARN-8200 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
8s{color} | {color:green} YARN-8200 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
55s{color} | {color:green} YARN-8200 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
38s{color} | {color:green} YARN-8200 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
10s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 in YARN-8200 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
9s{color} | {color:green} YARN-8200 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m  
5s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 52s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 15 new + 301 unchanged - 10 fixed = 316 total (was 311) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 11s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
39s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
3s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 56s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m 48s{color} 
| {color:red} hadoop-yarn-server-tests in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
56s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| 

[jira] [Commented] (YARN-7088) Add application launch time to Resource Manager REST API

2019-01-24 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751619#comment-16751619
 ] 

Jonathan Hung commented on YARN-7088:
-

Thx, will commit the backports by EOD.

> Add application launch time to Resource Manager REST API
> 
>
> Key: YARN-7088
> URL: https://issues.apache.org/jira/browse/YARN-7088
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Kanwaljeet Sachdev
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7088-branch-2.001.patch, 
> YARN-7088-branch-3.0.001.patch, YARN-7088.001.patch, YARN-7088.002.patch, 
> YARN-7088.003.patch, YARN-7088.004.patch, YARN-7088.005.patch, 
> YARN-7088.006.patch, YARN-7088.007.patch, YARN-7088.008.patch, 
> YARN-7088.009.patch, YARN-7088.010.patch, YARN-7088.011.patch, 
> YARN-7088.012.patch, YARN-7088.013.patch, YARN-7088.014.patch, 
> YARN-7088.015.patch, YARN-7088.016.patch, YARN-7088.017.patch
>
>
> Currently, the start time in the old and new UI actually shows the app 
> submission time. There should actually be two different fields: one for the 
> app's submission and one for its launch, as well as the elapsed pending time 
> between the two.
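
As an illustration of the proposed split, an app object in the RM REST response would then carry both timestamps, roughly like the hypothetical excerpt below (field names assumed for illustration, not taken from the patch):

{code}
{
  "app": {
    "id": "application_1548345600000_0001",
    "submitTime": 1548345600000,
    "launchTime": 1548345605000,
    "elapsedPendingTime": 5000
  }
}
{code}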



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9221) Add a flag to enable dynamic auxiliary service feature

2019-01-24 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751591#comment-16751591
 ] 

Billie Rinaldi commented on YARN-9221:
--

Thanks [~eyang]! I improved the documentation in patch 2 and hopefully 
addressed the build errors as well.

> Add a flag to enable dynamic auxiliary service feature
> --
>
> Key: YARN-9221
> URL: https://issues.apache.org/jira/browse/YARN-9221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-9221.01.patch, YARN-9221.02.patch
>
>
> The dynamic auxiliary service feature enables reconfiguring YARN auxiliary 
> services on demand.  This feature is optional, and it would be nice to have 
> it disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9221) Add a flag to enable dynamic auxiliary service feature

2019-01-24 Thread Billie Rinaldi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-9221:
-
Attachment: YARN-9221.02.patch

> Add a flag to enable dynamic auxiliary service feature
> --
>
> Key: YARN-9221
> URL: https://issues.apache.org/jira/browse/YARN-9221
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Billie Rinaldi
>Priority: Major
> Attachments: YARN-9221.01.patch, YARN-9221.02.patch
>
>
> The dynamic auxiliary service feature enables reconfiguring YARN auxiliary 
> services on demand.  This feature is optional, and it would be nice to have 
> it disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8901) Restart "NEVER" policy does not work with component dependency

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751563#comment-16751563
 ] 

Hadoop QA commented on YARN-8901:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  6s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
10s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
28s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8901 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956192/YARN-8901.2.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 64953a9c804e 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 
31 10:55:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3c7d700 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23169/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/23169/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 750 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23169/console |
| Powered by | Apache 

[jira] [Commented] (YARN-8901) Restart "NEVER" policy does not work with component dependency

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751499#comment-16751499
 ] 

Hadoop QA commented on YARN-8901:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  8m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 53s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
46s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-8901 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956186/YARN-8901.1.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f63e39ef1362 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3c7d700 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23166/testReport/ |
| Max. process+thread count | 768 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23166/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Restart "NEVER" policy 

[jira] [Commented] (YARN-9188) Port YARN-7136 to branch-2

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751492#comment-16751492
 ] 

Hadoop QA commented on YARN-9188:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} YARN-8200 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  8m 
16s{color} | {color:red} root in YARN-8200 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 
56s{color} | {color:red} hadoop-yarn in YARN-8200 failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 42s{color} | {color:orange} The patch fails to run checkstyle in hadoop-yarn 
{color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  1m 
53s{color} | {color:red} hadoop-yarn in YARN-8200 failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-yarn-server-resourcemanager in YARN-8200 
failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
13s{color} | {color:red} hadoop-yarn-server-tests in YARN-8200 failed. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-resourcemanager in YARN-8200 
failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  1m 
16s{color} | {color:red} hadoop-yarn in YARN-8200 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
12s{color} | {color:red} hadoop-yarn-server-tests in YARN-8200 failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  2m 
20s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-server-tests in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 
52s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 52s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 39s{color} | {color:orange} The patch fails to run checkstyle in hadoop-yarn 
{color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  1m 
48s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-server-tests in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| 

[jira] [Commented] (YARN-6616) YARN AHS shows submitTime for jobs same as startTime

2019-01-24 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751491#comment-16751491
 ] 

Eric Payne commented on YARN-6616:
--

I will review today

> YARN AHS shows submitTime for jobs same as startTime
> 
>
> Key: YARN-6616
> URL: https://issues.apache.org/jira/browse/YARN-6616
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: 0001-YARN-6616.patch, 0002-YARN-6616.patch, 
> 0003-YARN-6616.patch, 0004-YARN-6616.patch, 0005-YARN-6616.patch, 
> 0006-YARN-6616.patch, 0007-YARN-6616.patch, 0008-YARN-6616.patch, 
> ApplicationReport.poc
>
>
> YARN AHS returns the startTime value for both submitTime and startTime for 
> the jobs.  It looks like the code sets submitTime to the startTime value. 
> https://github.com/apache/hadoop/blob/branch-2.7.3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/dao/AppInfo.java#L80
> {code}
> curl --negotiate -u: 
> http://prabhuzeppelin3.openstacklocal:8188/ws/v1/applicationhistory/apps
> 1495015537574 1495015537574 1495016384084
> {code}
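
A minimal sketch of the apparent bug and the obvious shape of a fix in AppInfo, paraphrased rather than quoted from any patch:

{code:java}
// Before (apparent bug): submitTime is populated from the start time.
submitTime = app.getStartedTime();
startedTime = app.getStartedTime();

// After (sketch): plumb the real submission time through the report so AHS
// can serve distinct values; assumes such a getter is exposed by the fix.
submitTime = app.getSubmitTime();
startedTime = app.getStartedTime();
{code}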



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7088) Add application launch time to Resource Manager REST API

2019-01-24 Thread Arun Suresh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751482#comment-16751482
 ] 

Arun Suresh commented on YARN-7088:
---

+1 in that case

> Add application launch time to Resource Manager REST API
> 
>
> Key: YARN-7088
> URL: https://issues.apache.org/jira/browse/YARN-7088
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Kanwaljeet Sachdev
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7088-branch-2.001.patch, 
> YARN-7088-branch-3.0.001.patch, YARN-7088.001.patch, YARN-7088.002.patch, 
> YARN-7088.003.patch, YARN-7088.004.patch, YARN-7088.005.patch, 
> YARN-7088.006.patch, YARN-7088.007.patch, YARN-7088.008.patch, 
> YARN-7088.009.patch, YARN-7088.010.patch, YARN-7088.011.patch, 
> YARN-7088.012.patch, YARN-7088.013.patch, YARN-7088.014.patch, 
> YARN-7088.015.patch, YARN-7088.016.patch, YARN-7088.017.patch
>
>
> Currently, the start time in the old and new UI actually shows the app 
> submission time. There should actually be two different fields: one for the 
> app's submission and one for its launch, as well as the elapsed pending time 
> between the two.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9181) Backport YARN-6232 for generic resource type usage to branch-2

2019-01-24 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751470#comment-16751470
 ] 

Jonathan Hung commented on YARN-9181:
-

FYI I uploaded an addendum patch for a real compilation issue which is not 
handled by YARN-6761 or YARN-9177. I've committed the original patch + the 
addendum in YARN-8200.

> Backport YARN-6232 for generic resource type usage to branch-2
> --
>
> Key: YARN-9181
> URL: https://issues.apache.org/jira/browse/YARN-9181
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9181-YARN-8200.001-addendum.patch, 
> YARN-9181-YARN-8200.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9181) Backport YARN-6232 for generic resource type usage to branch-2

2019-01-24 Thread Jonathan Hung (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-9181:

Attachment: YARN-9181-YARN-8200.001-addendum.patch

> Backport YARN-6232 for generic resource type usage to branch-2
> --
>
> Key: YARN-9181
> URL: https://issues.apache.org/jira/browse/YARN-9181
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9181-YARN-8200.001-addendum.patch, 
> YARN-9181-YARN-8200.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9227) DistributedShell RelativePath is not removed at end

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751463#comment-16751463
 ] 

Hadoop QA commented on YARN-9227:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 16s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell:
 The patch generated 4 new + 205 unchanged - 0 fixed = 209 total (was 205) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 46s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m  5s{color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9227 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956182/0002-YARN-9227.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a75dc1313fef 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3c7d700 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/23164/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-distributedshell.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/23164/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-distributedshell.txt
 |
|  Test Results | 

[jira] [Commented] (YARN-7088) Add application launch time to Resource Manager REST API

2019-01-24 Thread Jonathan Hung (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751451#comment-16751451
 ] 

Jonathan Hung commented on YARN-7088:
-

Thx [~asuresh], I think the javadoc warnings are not related:
{noformat}
[WARNING] 
/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java:353:
 warning: '_' used as an identifier
[WARNING] 
/testptch/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java:353:
 warning: '_' used as an identifier{noformat}
These lines aren't modified in either backport.

> Add application launch time to Resource Manager REST API
> 
>
> Key: YARN-7088
> URL: https://issues.apache.org/jira/browse/YARN-7088
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Kanwaljeet Sachdev
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7088-branch-2.001.patch, 
> YARN-7088-branch-3.0.001.patch, YARN-7088.001.patch, YARN-7088.002.patch, 
> YARN-7088.003.patch, YARN-7088.004.patch, YARN-7088.005.patch, 
> YARN-7088.006.patch, YARN-7088.007.patch, YARN-7088.008.patch, 
> YARN-7088.009.patch, YARN-7088.010.patch, YARN-7088.011.patch, 
> YARN-7088.012.patch, YARN-7088.013.patch, YARN-7088.014.patch, 
> YARN-7088.015.patch, YARN-7088.016.patch, YARN-7088.017.patch
>
>
> Currently, the start time in the old and new UI actually shows the app 
> submission time. There should actually be two different fields: one for the 
> app's submission and one for its launch, as well as the elapsed pending time 
> between the two.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7088) Add application launch time to Resource Manager REST API

2019-01-24 Thread Arun Suresh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751444#comment-16751444
 ] 

Arun Suresh commented on YARN-7088:
---

[~jhung] the backport for branch-2 looks pretty straightforward. LGTM.
Can we fix the javadoc and maybe kick off the build again?


> Add application launch time to Resource Manager REST API
> 
>
> Key: YARN-7088
> URL: https://issues.apache.org/jira/browse/YARN-7088
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Abdullah Yousufi
>Assignee: Kanwaljeet Sachdev
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7088-branch-2.001.patch, 
> YARN-7088-branch-3.0.001.patch, YARN-7088.001.patch, YARN-7088.002.patch, 
> YARN-7088.003.patch, YARN-7088.004.patch, YARN-7088.005.patch, 
> YARN-7088.006.patch, YARN-7088.007.patch, YARN-7088.008.patch, 
> YARN-7088.009.patch, YARN-7088.010.patch, YARN-7088.011.patch, 
> YARN-7088.012.patch, YARN-7088.013.patch, YARN-7088.014.patch, 
> YARN-7088.015.patch, YARN-7088.016.patch, YARN-7088.017.patch
>
>
> Currently, the start time in the old and new UI actually shows the app 
> submission time. There should actually be two different fields; one for the 
> app's submission and one for its launch, as well as the elapsed pending time 
> between the two.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8901) Restart "NEVER" policy does not work with component dependency

2019-01-24 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751437#comment-16751437
 ] 

Suma Shivaprasad commented on YARN-8901:


Added UT

> Restart "NEVER" policy does not work with component dependency
> --
>
> Key: YARN-8901
> URL: https://issues.apache.org/jira/browse/YARN-8901
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Yesha Vora
>Assignee: Suma Shivaprasad
>Priority: Critical
> Attachments: YARN-8901.1.patch, YARN-8901.2.patch
>
>
> Scenario:
> 1) Launch an application with two components: master and worker. Here, worker 
> depends on master. (Worker should be launched only after master is launched.)
> 2) Set restart_policy = NEVER for both master and worker. 
> {code:title=sample launch.json}
> {
>   "name": "mawo-hadoop-ut",
>   "artifact": {
>     "type": "DOCKER",
>     "id": "xxx"
>   },
>   "configuration": {
>     "env": {
>       "YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK": "hadoop"
>     },
>     "properties": {
>       "docker.network": "hadoop"
>     }
>   },
>   "components": [{
>     "dependencies": [],
>     "resource": {
>       "memory": "2048",
>       "cpus": "1"
>     },
>     "name": "master",
>     "run_privileged_container": true,
>     "number_of_containers": 1,
>     "launch_command": "start master",
>     "restart_policy": "NEVER"
>   }, {
>     "dependencies": ["master"],
>     "resource": {
>       "memory": "8072",
>       "cpus": "1"
>     },
>     "name": "worker",
>     "run_privileged_container": true,
>     "number_of_containers": 10,
>     "launch_command": "start worker",
>     "restart_policy": "NEVER"
>   }],
>   "lifetime": -1,
>   "version": 1.0
> }{code}
> When the restart policy is set to NEVER, the AM never launches the worker 
> component. It gets stuck with the message below. 
> {code}
> 2018-10-17 15:11:58,560 [Component  dispatcher] INFO  component.Component - 
> [COMPONENT master] Transitioned from FLEXING to STABLE on CHECK_STABLE event.
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  instance.ComponentInstance - 
> [COMPINSTANCE master-0 : container_e41_1539027682947_0020_01_02] 
> Transitioned from STARTED to READY on BECOME_READY event
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:12:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:12:58,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:13:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:13:58,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:14:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed {code}
> The 'NEVER' restart policy expects the master component to finish before 
> starting workers, but the master component cannot finish the job without 
> workers. Thus, it creates a deadlock.
> The logic for the 'NEVER' restart policy should be fixed to allow worker 
> components to be launched as soon as the master component is in the READY 
> state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8901) Restart "NEVER" policy does not work with component dependency

2019-01-24 Thread Suma Shivaprasad (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated YARN-8901:
---
Attachment: YARN-8901.2.patch

> Restart "NEVER" policy does not work with component dependency
> --
>
> Key: YARN-8901
> URL: https://issues.apache.org/jira/browse/YARN-8901
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Yesha Vora
>Assignee: Suma Shivaprasad
>Priority: Critical
> Attachments: YARN-8901.1.patch, YARN-8901.2.patch
>
>
> Scenario:
> 1) Launch an application with two components: master and worker. Here, worker 
> depends on master. (Worker should be launched only after master is launched.)
> 2) Set restart_policy = NEVER for both master and worker. 
> {code:title=sample launch.json}
> {
>   "name": "mawo-hadoop-ut",
>   "artifact": {
>     "type": "DOCKER",
>     "id": "xxx"
>   },
>   "configuration": {
>     "env": {
>       "YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK": "hadoop"
>     },
>     "properties": {
>       "docker.network": "hadoop"
>     }
>   },
>   "components": [{
>     "dependencies": [],
>     "resource": {
>       "memory": "2048",
>       "cpus": "1"
>     },
>     "name": "master",
>     "run_privileged_container": true,
>     "number_of_containers": 1,
>     "launch_command": "start master",
>     "restart_policy": "NEVER"
>   }, {
>     "dependencies": ["master"],
>     "resource": {
>       "memory": "8072",
>       "cpus": "1"
>     },
>     "name": "worker",
>     "run_privileged_container": true,
>     "number_of_containers": 10,
>     "launch_command": "start worker",
>     "restart_policy": "NEVER"
>   }],
>   "lifetime": -1,
>   "version": 1.0
> }{code}
> When the restart policy is set to NEVER, the AM never launches the worker 
> component. It gets stuck with the message below. 
> {code}
> 2018-10-17 15:11:58,560 [Component  dispatcher] INFO  component.Component - 
> [COMPONENT master] Transitioned from FLEXING to STABLE on CHECK_STABLE event.
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  instance.ComponentInstance - 
> [COMPINSTANCE master-0 : container_e41_1539027682947_0020_01_02] 
> Transitioned from STARTED to READY on BECOME_READY event
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:12:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:12:58,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:13:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:13:58,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:14:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed {code}
> The 'NEVER' restart policy expects the master component to finish before 
> starting workers, but the master component cannot finish the job without 
> workers. Thus, it creates a deadlock.
> The logic for the 'NEVER' restart policy should be fixed to allow worker 
> components to be launched as soon as the master component is in the READY 
> state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9188) Port YARN-7136 to branch-2

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751431#comment-16751431
 ] 

Hadoop QA commented on YARN-9188:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 
44s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 10s{color} 
| {color:red} YARN-9188 does not apply to YARN-8200. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:7e20225 |
| JIRA Issue | YARN-9188 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12954352/YARN-9188-YARN-8200.001.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23165/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Port YARN-7136 to branch-2
> --
>
> Key: YARN-9188
> URL: https://issues.apache.org/jira/browse/YARN-9188
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9188-YARN-8200.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8901) Restart "NEVER" policy does not work with component dependency

2019-01-24 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751421#comment-16751421
 ] 

Suma Shivaprasad commented on YARN-8901:


Currently, downstream components that depend on components with 
restartPolicy=NEVER/ON_FAILURE are not started until those components finish. 
But this breaks the assumption that downstream components can be started when 
the upstream component reaches the READY state. The attached patch reverts the 
behaviour for restartPolicy = NEVER/ON_FAILURE to be the same as for the 
ALWAYS restart policy.

If downstream components need to start up only after a certain condition is 
met, then that should be supported as a separate feature in the downstream 
component and can be addressed as part of another jira.
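
For illustration, a minimal sketch (assumed names, not the actual yarn-service code) of the reverted readiness check: a dependency counts as satisfied once all of its instances are READY, independent of the upstream component's restart policy, which removes the deadlock described below.
{code:java}
import java.util.List;
import java.util.Map;

class DependencyGateSketch {
  // Returns true once every dependency has all of its desired instances in
  // READY state; restart policy is deliberately not consulted, matching the
  // ALWAYS-policy behaviour the patch reverts to.
  static boolean dependenciesSatisfied(List<String> dependencies,
      Map<String, Integer> readyInstances,
      Map<String, Integer> desiredInstances) {
    for (String dep : dependencies) {
      int ready = readyInstances.getOrDefault(dep, 0);
      int desired = desiredInstances.getOrDefault(dep, 1);
      if (ready < desired) {
        return false; // keep the downstream component waiting
      }
    }
    return true; // e.g. worker starts as soon as master-0 is READY
  }
}
{code}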

> Restart "NEVER" policy does not work with component dependency
> --
>
> Key: YARN-8901
> URL: https://issues.apache.org/jira/browse/YARN-8901
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Yesha Vora
>Assignee: Suma Shivaprasad
>Priority: Critical
>
> Scenario:
> 1) Launch an application with two components: master and worker. Here, worker 
> depends on master. (Worker should be launched only after master is launched.)
> 2) Set restart_policy = NEVER for both master and worker. 
> {code:title=sample launch.json}
> {
>   "name": "mawo-hadoop-ut",
>   "artifact": {
>     "type": "DOCKER",
>     "id": "xxx"
>   },
>   "configuration": {
>     "env": {
>       "YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK": "hadoop"
>     },
>     "properties": {
>       "docker.network": "hadoop"
>     }
>   },
>   "components": [{
>     "dependencies": [],
>     "resource": {
>       "memory": "2048",
>       "cpus": "1"
>     },
>     "name": "master",
>     "run_privileged_container": true,
>     "number_of_containers": 1,
>     "launch_command": "start master",
>     "restart_policy": "NEVER"
>   }, {
>     "dependencies": ["master"],
>     "resource": {
>       "memory": "8072",
>       "cpus": "1"
>     },
>     "name": "worker",
>     "run_privileged_container": true,
>     "number_of_containers": 10,
>     "launch_command": "start worker",
>     "restart_policy": "NEVER"
>   }],
>   "lifetime": -1,
>   "version": 1.0
> }{code}
> When the restart policy is set to NEVER, the AM never launches the worker 
> component. It gets stuck with the message below. 
> {code}
> 2018-10-17 15:11:58,560 [Component  dispatcher] INFO  component.Component - 
> [COMPONENT master] Transitioned from FLEXING to STABLE on CHECK_STABLE event.
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  instance.ComponentInstance - 
> [COMPINSTANCE master-0 : container_e41_1539027682947_0020_01_02] 
> Transitioned from STARTED to READY on BECOME_READY event
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:12:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:12:58,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:13:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:13:58,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:14:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed {code}
> The 'NEVER' restart policy expects the master component to finish before 
> starting workers, but the master component cannot finish the job without 
> workers. Thus, it creates a deadlock.
> The logic for the 'NEVER' restart policy should be fixed to allow worker 
> components to be launched as soon as the master component is in the READY 
> state.



--
This message was sent by Atlassian 

[jira] [Updated] (YARN-8901) Restart "NEVER" policy does not work with component dependency

2019-01-24 Thread Suma Shivaprasad (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated YARN-8901:
---
Attachment: YARN-8901.1.patch

> Restart "NEVER" policy does not work with component dependency
> --
>
> Key: YARN-8901
> URL: https://issues.apache.org/jira/browse/YARN-8901
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Yesha Vora
>Assignee: Suma Shivaprasad
>Priority: Critical
> Attachments: YARN-8901.1.patch
>
>
> Scenario:
> 1) Launch an application with two components: master and worker. Here, worker 
> depends on master. (Worker should be launched only after master is launched.)
> 2) Set restart_policy = NEVER for both master and worker. 
> {code:title=sample launch.json}
> {
>   "name": "mawo-hadoop-ut",
>   "artifact": {
>     "type": "DOCKER",
>     "id": "xxx"
>   },
>   "configuration": {
>     "env": {
>       "YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK": "hadoop"
>     },
>     "properties": {
>       "docker.network": "hadoop"
>     }
>   },
>   "components": [{
>     "dependencies": [],
>     "resource": {
>       "memory": "2048",
>       "cpus": "1"
>     },
>     "name": "master",
>     "run_privileged_container": true,
>     "number_of_containers": 1,
>     "launch_command": "start master",
>     "restart_policy": "NEVER"
>   }, {
>     "dependencies": ["master"],
>     "resource": {
>       "memory": "8072",
>       "cpus": "1"
>     },
>     "name": "worker",
>     "run_privileged_container": true,
>     "number_of_containers": 10,
>     "launch_command": "start worker",
>     "restart_policy": "NEVER"
>   }],
>   "lifetime": -1,
>   "version": 1.0
> }{code}
> When the restart policy is set to NEVER, the AM never launches the worker 
> component. It gets stuck with the message below. 
> {code}
> 2018-10-17 15:11:58,560 [Component  dispatcher] INFO  component.Component - 
> [COMPONENT master] Transitioned from FLEXING to STABLE on CHECK_STABLE event.
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  instance.ComponentInstance - 
> [COMPINSTANCE master-0 : container_e41_1539027682947_0020_01_02] 
> Transitioned from STARTED to READY on BECOME_READY event
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:12:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:12:58,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:13:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:13:58,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed 
> 2018-10-17 15:14:28,556 [pool-7-thread-1] INFO  component.Component - 
> [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances 
> are ready or the dependent component has not completed {code}
> The 'NEVER' restart policy expects the master component to finish before 
> starting workers, but the master component cannot finish the job without 
> workers. Thus, it creates a deadlock.
> The logic for the 'NEVER' restart policy should be fixed to allow worker 
> components to be launched as soon as the master component is in the READY 
> state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8867) Retrieve the status of resource localization

2019-01-24 Thread Chandni Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751419#comment-16751419
 ] 

Chandni Singh commented on YARN-8867:
-

Addressed [~eyang]'s comments, the test failure in TestServiceAM, and the 
checkstyle warnings in patch 8.

The test failure in hadoop-yarn-server-resourcemanager seems unrelated to this 
patch; I don't see any tests in the RM failing. 
{code}
[WARNING] Tests run: 2449, Failures: 0, Errors: 0, Skipped: 7
[INFO] 
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 01:26 h
[INFO] Finished at: 2019-01-24T04:40:54+00:00
[INFO] Final Memory: 24M/827M
[INFO] 
[WARNING] The requested profile "parallel-tests" could not be activated because 
it does not exist.
[WARNING] The requested profile "native" could not be activated because it does 
not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M1:test (default-test) on 
project hadoop-yarn-server-resourcemanager: There was a timeout or other error 
in the fork -> [Help 1]
{code}


> Retrieve the status of resource localization
> 
>
> Key: YARN-8867
> URL: https://issues.apache.org/jira/browse/YARN-8867
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Chandni Singh
>Assignee: Chandni Singh
>Priority: Major
> Attachments: YARN-8867.001.patch, YARN-8867.002.patch, 
> YARN-8867.003.patch, YARN-8867.004.patch, YARN-8867.005.patch, 
> YARN-8867.006.patch, YARN-8867.007.patch, YARN-8867.008.patch, 
> YARN-8867.wip.patch
>
>
> Refer YARN-3854.
> Currently the NM does not have an API to retrieve the status of localization. 
> Unless the client can know when the localization of a resource is complete, 
> irrespective of the type of the resource, it cannot take any appropriate 
> action. 
> We need an API in {{ContainerManagementProtocol}} to retrieve the status of 
> the localization. 
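
To make the request concrete, here is a minimal sketch with hypothetical names (the real signatures are defined by the patch, not here) of the kind of call being asked for:
{code:java}
import java.util.List;

// Hypothetical client-side view of the proposed NM call; names are
// illustrative, not the actual ContainerManagementProtocol additions.
interface LocalizationStatusClient {
  enum LocalizationState { PENDING, COMPLETED, FAILED }

  final class LocalizationStatus {
    final String resourceKey;       // which LocalResource this refers to
    final LocalizationState state;  // where localization currently stands
    LocalizationStatus(String resourceKey, LocalizationState state) {
      this.resourceKey = resourceKey;
      this.state = state;
    }
  }

  // Ask the NodeManager for the status of every resource being localized
  // for the given container, regardless of resource type.
  List<LocalizationStatus> getLocalizationStatuses(String containerId)
      throws Exception;
}
{code}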



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9227) DistributedShell RelativePath is not removed at end

2019-01-24 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9227:

Attachment: 0002-YARN-9227.patch

> DistributedShell RelativePath is not removed at end
> ---
>
> Key: YARN-9227
> URL: https://issues.apache.org/jira/browse/YARN-9227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: 0001-YARN-9227.patch, 0002-YARN-9227.patch
>
>
> The DistributedShell job does not remove the relative path that contains its 
> jars and localized files.
> {code}
> [ambari-qa@ash hadoop-yarn]$ hadoop fs -ls 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017
> Found 2 items
> -rw-r--r--   3 ambari-qa hdfs  46636 2019-01-23 13:37 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017/AppMaster.jar
> -rwx--x---   3 ambari-qa hdfs  4 2019-01-23 13:37 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017/shellCommands
> {code}
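
A minimal sketch, under assumed paths and names, of the cleanup the patch would perform at job completion ({{FileSystem#delete(Path, boolean)}} with a true second argument removes the whole directory recursively):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class StagingDirCleanupSketch {
  // Remove the per-application DistributedShell directory, e.g.
  // /user/ambari-qa/DistributedShell/application_1542665708563_0017
  static void cleanup(Configuration conf, String appId) throws Exception {
    FileSystem fs = FileSystem.get(conf);
    Path appDir = new Path(fs.getHomeDirectory(), "DistributedShell/" + appId);
    fs.delete(appDir, true); // recursive: removes AppMaster.jar, shellCommands
  }
}
{code}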



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7761) [UI2] Clicking 'master container log' or 'Link' next to 'log' under application's appAttempt goes to Old UI's Log link

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751385#comment-16751385
 ] 

Hadoop QA commented on YARN-7761:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
35m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 50s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-7761 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956173/YARN-7761.003.patch |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux f1963b98bf02 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3c7d700 |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 310 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23162/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> [UI2] Clicking 'master container log' or 'Link' next to 'log' under 
> application's appAttempt goes to Old UI's Log link
> --
>
> Key: YARN-7761
> URL: https://issues.apache.org/jira/browse/YARN-7761
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Sumana Sathish
>Assignee: Akhil PB
>Priority: Major
> Attachments: YARN-7761.001.patch, YARN-7761.002.patch, 
> YARN-7761.003.patch
>
>
> Clicking 'master container log' or 'Link' next to 'Log' under application's 
> appAttempt goes to Old UI's Log link



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9116) Capacity Scheduler: implements queue level maximum-allocation inheritance

2019-01-24 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751380#comment-16751380
 ] 

Aihua Xu commented on YARN-9116:


Thanks [~cheersyang] and [~leftnoteasy] for your help.

> Capacity Scheduler: implements queue level maximum-allocation inheritance
> -
>
> Key: YARN-9116
> URL: https://issues.apache.org/jira/browse/YARN-9116
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 2.7.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9116.1.patch, YARN-9116.2.patch, YARN-9116.3.patch, 
> YARN-9116.4.patch, YARN-9116.5.patch
>
>
> YARN-1582 adds support for a per-queue maximum-allocation-mb configuration, 
> targeting larger-container features on dedicated queues (larger 
> maximum-allocation-mb/maximum-allocation-vcores for such queues). 
> To achieve a larger container configuration, we need to increase the 
> global maximum-allocation-mb/maximum-allocation-vcores (e.g. 120G/256) and 
> then override those configurations with the desired values on the queues, 
> since a queue configuration can't be larger than the cluster configuration. 
> There are many queues in the system, and if we forget to configure such 
> values when adding a new queue, then that queue gets the default 120G/256, 
> which typically is not what we want.  
> We can come up with a queue-default configuration (set to a normal queue 
> configuration like 16G/8), so the leaf queues get such values by default.
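
A minimal sketch (assumed names, not the CapacityScheduler implementation) of the fallback order described above: explicit per-queue value, then the proposed queue default, then the global maximum, with every result capped at the cluster-wide limit:
{code:java}
import java.util.Map;

class MaxAllocationLookupSketch {
  static int maxAllocationMb(String queue,
      Map<String, Integer> perQueueMb,   // explicit per-queue overrides
      Integer queueDefaultMb,            // proposed default, e.g. 16384 (16G)
      int globalMaxMb) {                 // cluster ceiling, e.g. 122880 (120G)
    Integer explicit = perQueueMb.get(queue);
    if (explicit != null) {
      return Math.min(explicit, globalMaxMb); // queue max <= cluster max
    }
    if (queueDefaultMb != null) {
      return Math.min(queueDefaultMb, globalMaxMb);
    }
    return globalMaxMb; // today's behaviour: new queues get the 120G ceiling
  }
}
{code}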



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9116) Capacity Scheduler: implements queue level maximum-allocation inheritance

2019-01-24 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751359#comment-16751359
 ] 

Aihua Xu commented on YARN-9116:


The post-commit build is failing randomly with the following exception. There 
are a couple of infra JIRAs for this already: INFRA-13506, INFRA-17015. 

Failed to execute goal 
org.apache.hadoop:hadoop-maven-plugins:3.3.0-SNAPSHOT:protoc (compile-protoc) 
on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: 
protoc version is 'libprotoc 2.6.1', expected version is '2.5.0' -> [Help 1]

> Capacity Scheduler: implements queue level maximum-allocation inheritance
> -
>
> Key: YARN-9116
> URL: https://issues.apache.org/jira/browse/YARN-9116
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 2.7.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9116.1.patch, YARN-9116.2.patch, YARN-9116.3.patch, 
> YARN-9116.4.patch, YARN-9116.5.patch
>
>
> YARN-1582 adds support for a per-queue maximum-allocation-mb configuration, 
> targeting larger-container features on dedicated queues (larger 
> maximum-allocation-mb/maximum-allocation-vcores for such queues). 
> To achieve a larger container configuration, we need to increase the 
> global maximum-allocation-mb/maximum-allocation-vcores (e.g. 120G/256) and 
> then override those configurations with the desired values on the queues, 
> since a queue configuration can't be larger than the cluster configuration. 
> There are many queues in the system, and if we forget to configure such 
> values when adding a new queue, then that queue gets the default 120G/256, 
> which typically is not what we want.  
> We can come up with a queue-default configuration (set to a normal queue 
> configuration like 16G/8), so the leaf queues get such values by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7761) [UI2] Clicking 'master container log' or 'Link' next to 'log' under application's appAttempt goes to Old UI's Log link

2019-01-24 Thread Akhil PB (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751324#comment-16751324
 ] 

Akhil PB commented on YARN-7761:


Attached the v3 patch, which adds a log link that redirects to the UI2 logs 
page, pre-populating the attemptId and containerId.

> [UI2] Clicking 'master container log' or 'Link' next to 'log' under 
> application's appAttempt goes to Old UI's Log link
> --
>
> Key: YARN-7761
> URL: https://issues.apache.org/jira/browse/YARN-7761
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Sumana Sathish
>Assignee: Akhil PB
>Priority: Major
> Attachments: YARN-7761.001.patch, YARN-7761.002.patch, 
> YARN-7761.003.patch
>
>
> Clicking 'master container log' or 'Link' next to 'Log' under application's 
> appAttempt goes to Old UI's Log link



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7761) [UI2] Clicking 'master container log' or 'Link' next to 'log' under application's appAttempt goes to Old UI's Log link

2019-01-24 Thread Akhil PB (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-7761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB updated YARN-7761:
---
Attachment: YARN-7761.003.patch

> [UI2] Clicking 'master container log' or 'Link' next to 'log' under 
> application's appAttempt goes to Old UI's Log link
> --
>
> Key: YARN-7761
> URL: https://issues.apache.org/jira/browse/YARN-7761
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Sumana Sathish
>Assignee: Akhil PB
>Priority: Major
> Attachments: YARN-7761.001.patch, YARN-7761.002.patch, 
> YARN-7761.003.patch
>
>
> Clicking 'master container log' or 'Link' next to 'Log' under application's 
> appAttempt goes to Old UI's Log link



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9234) NPE Exception Occurred on Resourcemanager

2019-01-24 Thread Amithsha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751266#comment-16751266
 ] 

Amithsha commented on YARN-9234:


[~sunilg] Thanks for the comments.

> NPE Exception Occurred on Resourcemanager
> -
>
> Key: YARN-9234
> URL: https://issues.apache.org/jira/browse/YARN-9234
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.9.0
>Reporter: Amithsha
>Priority: Major
>
> 2019-01-24 14:52:17,893 FATAL event.EventDispatcher (?:?(?)) - Error in 
> handling event type NODE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:814)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:857)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:55)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:868)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1346)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1341)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1430)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1205)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1067)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1472)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:151)
>  at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>  at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9206) RMServerUtils does not count SHUTDOWN as an accepted state

2019-01-24 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751227#comment-16751227
 ] 

Jim Brennan commented on YARN-9206:
---

[~sunilg], [~kshukla] While I agree that [~sunilg]'s version looks a little 
cleaner, the addAll() calls are not cheap, especially for the active list, 
which could contain thousands of nodes. I was attempting to do this as 
optimally as possible using the isActive/isInActive approach. That said, I am 
OK with using this version if that is the consensus. (I'm more of a C/C++ guy, 
so the ifs and booleans don't offend me so much. :))
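
For illustration, a minimal sketch (assumed names, not the exact RMServerUtils code) of the check being discussed: SHUTDOWN counted as an inactive state, and the inactive-node map walked only when an inactive state was actually requested, without copying whole node lists via addAll():
{code:java}
import java.util.EnumSet;

class InactiveNodeCheckSketch {
  enum NodeState { RUNNING, DECOMMISSIONED, LOST, REBOOTED, SHUTDOWN }

  // States kept in the RM's inactive node map; SHUTDOWN is now included.
  private static final EnumSet<NodeState> INACTIVE_STATES =
      EnumSet.of(NodeState.DECOMMISSIONED, NodeState.LOST,
          NodeState.REBOOTED, NodeState.SHUTDOWN);

  // Replaces the chain of contains() calls quoted in the description below.
  static boolean wantsInactiveNodes(EnumSet<NodeState> acceptedStates) {
    for (NodeState state : acceptedStates) {
      if (INACTIVE_STATES.contains(state)) {
        return true;
      }
    }
    return false;
  }
}
{code}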

 

 

> RMServerUtils does not count SHUTDOWN as an accepted state
> --
>
> Key: YARN-9206
> URL: https://issues.apache.org/jira/browse/YARN-9206
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.3
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Major
> Attachments: YARN-9206.001.patch, YARN-9206.002.patch, 
> YARN-9206.003.patch
>
>
> {code}
> if (acceptedStates.contains(NodeState.DECOMMISSIONED) ||
> acceptedStates.contains(NodeState.LOST) ||
> acceptedStates.contains(NodeState.REBOOTED)) {
>   for (RMNode rmNode : context.getInactiveRMNodes().values()) {
> if ((rmNode != null) && acceptedStates.contains(rmNode.getState())) {
>   results.add(rmNode);
> }
>   }
> }
> return results;
>   }
> {code}
> This should include the SHUTDOWN state, as such nodes are inactive too. This 
> method is used for node reports and the like, so it might be useful to 
> account for them as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751176#comment-16751176
 ] 

Hadoop QA commented on YARN-9060:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
15s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
16s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 15s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 5 new + 5 unchanged - 0 fixed = 10 total (was 5) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}130m 41s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
40s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}233m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9060 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956123/YARN-9060-trunk.009.patch
 |
| Optional Tests |  dupname  asflicense  compile  cc  

[jira] [Commented] (YARN-9213) RM Web UI does not show custom resource allocations for containers page

2019-01-24 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751133#comment-16751133
 ] 

Peter Bacsko commented on YARN-9213:


Just one question regarding this code:
{noformat}
switch (resourceName) {
  case ResourceInformation.MEMORY_URI:
    translatedResourceName = "Memory";
    break;
  case ResourceInformation.VCORES_URI:
    translatedResourceName = "VCores";
    break;
  default:
    translatedResourceName = resourceName;
    break;
}
{noformat}
This is just a safety net for vcores/memory, right? Because in theory, this 
method should only be invoked for resources other than memory and vcores (an 
if condition checks that above).

If that cannot happen, just remove this method and replace 
{{sb.append(getResourceAsString(key, value))}} with 
{{sb.append(key).append(" ").append(value)}}.

> RM Web UI does not show custom resource allocations for containers page
> ---
>
> Key: YARN-9213
> URL: https://issues.apache.org/jira/browse/YARN-9213
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-9213.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9098) Separate mtab file reader code and cgroups file system hierarchy parser code from CGroupsHandlerImpl and ResourceHandlerModule

2019-01-24 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751123#comment-16751123
 ] 

Peter Bacsko commented on YARN-9098:


Minor things:

1. Delete the {{tmpDir}} in {{testSelectCgroup}} after the test finishes (a 
sketch follows below)
2. Same applies to {{testMtabParsing}}
3. Use assert methods consistently, e.g. either static import them or use 
{{Assert.assert...()}} (this was probably not introduced by your change, but 
it's still a nice thing to do)
4. Make {{controllerPaths}} and {{cGroupsMountConfig}} final in 
{{CGroupsHandlerImpl}} (if possible)
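
As referenced in points 1 and 2, a minimal JUnit 4 sketch (hypothetical test class) showing per-test temp-dir cleanup via {{TemporaryFolder}}:
{code:java}
import java.io.File;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

public class TestCGroupsCleanupSketch {
  // TemporaryFolder deletes everything it created after each test,
  // covering points 1 and 2 without hand-written cleanup code.
  @Rule
  public TemporaryFolder tmp = new TemporaryFolder();

  @Test
  public void testSelectCgroup() throws Exception {
    File tmpDir = tmp.newFolder("cgroups");
    // ... exercise the cgroups path-selection logic against tmpDir ...
  }
}
{code}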


> Separate mtab file reader code and cgroups file system hierarchy parser code 
> from CGroupsHandlerImpl and ResourceHandlerModule
> --
>
> Key: YARN-9098
> URL: https://issues.apache.org/jira/browse/YARN-9098
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-9098.002.patch, YARN-9098.003.patch, 
> YARN-9098.004.patch
>
>
> Separate mtab file reader code and cgroups file system hierarchy parser code 
> from CGroupsHandlerImpl and ResourceHandlerModule
> CGroupsHandlerImpl has a method parseMtab that parses an mtab file and stores 
> cgroups data.
> CGroupsLCEResourcesHandler also has a method with the same name, with 
> identical code.
> The parser code should be extracted from these places and be added in a new 
> class as this is a separate responsibility.
> As the output of the file parser is a Map>, it's better 
> to encapsulate it in a domain object, named 'CGroupsMountConfig' for instance.
> ResourceHandlerModule has a method named parseConfiguredCGroupPath, that is 
> responsible for producing the same results (Map>) to 
> store cgroups data, it does not operate on mtab file, but looking at the 
> filesystem for cgroup settings. As the output is the same, CGroupsMountConfig 
> should be used here, too.
> Again, this code should not be part of ResourceHandlerModule, as it is a 
> different responsibility.
> One more thing which is strongly related to the methods above is 
> CGroupsHandlerImpl.initializeFromMountConfig: This method processes the 
> result of a parsed mtab file or a parsed cgroups filesystem data and stores 
> file system paths for all available controllers. This method invokes 
> findControllerPathInMountConfig, which is a duplicated in CGroupsHandlerImpl 
> and CGroupsLCEResourcesHandler, so it should be moved to a single place. To 
> store filesystem path and controller mappings, a new domain object could be 
> introduced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9099) GpuResourceAllocator.getReleasingGpus calculates number of GPUs in a wrong way

2019-01-24 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751126#comment-16751126
 ] 

Peter Bacsko commented on YARN-9099:


[~snemeth] as [~tangzhankun] pointed out, is it possible to add a unit test for 
this?

> GpuResourceAllocator.getReleasingGpus calculates number of GPUs in a wrong way
> --
>
> Key: YARN-9099
> URL: https://issues.apache.org/jira/browse/YARN-9099
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Attachments: YARN-9099.001.patch, YARN-9099.002.patch
>
>
> getReleasingGpus plays an important role in the calculation that happens 
> when GpuAllocator assigns GPUs to a container; see 
> GpuResourceAllocator#internalAssignGpus.
> If multiple GPUs are assigned to the same container, getReleasingGpus will 
> return an invalid number.
> The iterator goes over the mappings of (GPU device, container ID) and 
> retrieves the container by its ID as many times as the container ID is 
> mapped to any device.
> Then, for every container, the value of the GPU resource is added to 
> a running sum.
> Obviously, if a container is mapped to 2 or more devices, the 
> container's GPU resource counter is added to the running sum as many times as 
> the number of GPU devices the container has.
> Example: 
> Let's suppose {{usedDevices}} contains these mappings: 
> - (GPU1, container1)
> - (GPU2, container1)
> - (GPU3, container2)
> GPU resource value is 2 for container1 and 
> GPU resource value is 1 for container2.
> Then, if container1 is in a running state, getReleasingGpus will return 4 
> instead of 2.
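
A runnable sketch (simplified types, not the real GpuResourceAllocator) reproducing the arithmetic from the example above: the per-mapping sum returns 4 for container1, while deduplicating container IDs first returns the correct 2.
{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ReleasingGpusSketch {
  public static void main(String[] args) {
    Map<String, String> usedDevices = new HashMap<>(); // device -> container
    usedDevices.put("GPU1", "container1");
    usedDevices.put("GPU2", "container1");
    usedDevices.put("GPU3", "container2");
    Map<String, Integer> gpusRequested =
        Map.of("container1", 2, "container2", 1);
    Set<String> releasing = Set.of("container1"); // container1 is running

    int buggy = 0; // one addition per (device, container) mapping
    for (String id : usedDevices.values()) {
      if (releasing.contains(id)) {
        buggy += gpusRequested.get(id);
      }
    }

    int fixed = 0; // deduplicate container IDs before summing
    for (String id : new HashSet<>(usedDevices.values())) {
      if (releasing.contains(id)) {
        fixed += gpusRequested.get(id);
      }
    }
    System.out.println(buggy + " vs " + fixed); // prints "4 vs 2"
  }
}
{code}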



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9092) Create an object for cgroups mount enable and cgroups mount path as they belong together

2019-01-24 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1675#comment-1675
 ] 

Peter Bacsko commented on YARN-9092:


+1 LGTM (non-binding)

> Create an object for cgroups mount enable and cgroups mount path as they 
> belong together
> 
>
> Key: YARN-9092
> URL: https://issues.apache.org/jira/browse/YARN-9092
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-9092.001.patch, YARN-9092.002.patch, 
> YARN-9092.003.patch
>
>
> YarnConfiguration.NM_LINUX_CONTAINER_CGROUPS_MOUNT and 
> YarnConfiguration.NM_LINUX_CONTAINER_CGROUPS_MOUNT_PATH are used in 
> conjunction in many places in the code, so for the sake of readability and 
> simplicity, it is better to wrap the values of these configs into an object 
> and use it, instead of having 2 fields in both 
> CGroupsHandlerImpl and CgroupsLCEResourcesHandler.
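
A minimal sketch of the proposed wrapper; the name follows the CGroupsMountConfig object mentioned in the related YARN-9098 discussion, and the exact shape here is an assumption:
{code:java}
// Wraps the two settings that are always read together, replacing the pair
// of loose fields in CGroupsHandlerImpl and CgroupsLCEResourcesHandler.
final class CGroupsMountConfig {
  private final boolean mountEnabled; // NM_LINUX_CONTAINER_CGROUPS_MOUNT
  private final String mountPath;     // NM_LINUX_CONTAINER_CGROUPS_MOUNT_PATH

  CGroupsMountConfig(boolean mountEnabled, String mountPath) {
    this.mountEnabled = mountEnabled;
    this.mountPath = mountPath;
  }

  boolean isMountEnabled() {
    return mountEnabled;
  }

  // Only meaningful when mounting is enabled; callers should check first.
  String getMountPath() {
    return mountPath;
  }
}
{code}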



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9234) NPE Exception Occurred on Resourcemanager

2019-01-24 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751089#comment-16751089
 ] 

Sunil Govindan commented on YARN-9234:
--

YARN-8193 should land on branch-2.9 to fix this issue.

[~leftnoteasy], do you know why YARN-8193 has not landed in branch-2.9?

> NPE Exception Occurred on Resourcemanager
> -
>
> Key: YARN-9234
> URL: https://issues.apache.org/jira/browse/YARN-9234
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.9.0
>Reporter: Amithsha
>Priority: Major
>
> 2019-01-24 14:52:17,893 FATAL event.EventDispatcher (?:?(?)) - Error in 
> handling event type NODE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:814)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:857)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:55)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:868)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1346)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1341)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1430)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1205)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1067)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1472)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:151)
>  at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>  at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown

2019-01-24 Thread JIRA
Antal Bálint Steinbach created YARN-9235:


 Summary: If linux container executor is not set for a GPU cluster 
GpuResourceHandlerImpl is not initialized and NPE is thrown
 Key: YARN-9235
 URL: https://issues.apache.org/jira/browse/YARN-9235
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.1.0, 3.0.0
Reporter: Antal Bálint Steinbach
Assignee: Antal Bálint Steinbach


If the GPU plugin is set for the NodeManager, it is possible to run jobs with 
GPUs.

But if the LinuxContainerExecutor is not configured, an NPE is thrown when 
calling GpuResourcePlugin.getNMResourceInfo.

Also, there are no warnings in the log if GPU is misconfigured like this. 
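
A minimal sketch (hypothetical names) of the kind of startup guard this report asks for, logging a clear warning instead of deferring to an NPE:
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class GpuConfigSanityCheck {
  private static final Logger LOG =
      LoggerFactory.getLogger(GpuConfigSanityCheck.class);

  // Hypothetical guard, called once at NM startup with the effective config.
  static void verify(boolean gpuPluginEnabled, boolean linuxContainerExecutor) {
    if (gpuPluginEnabled && !linuxContainerExecutor) {
      LOG.warn("GPU plugin is enabled but LinuxContainerExecutor is not "
          + "configured; GPU isolation will not work and calls such as "
          + "GpuResourcePlugin.getNMResourceInfo may fail.");
    }
  }
}
{code}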



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9234) NPE Exception Occurred on Resourcemanager

2019-01-24 Thread Amithsha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751076#comment-16751076
 ] 

Amithsha commented on YARN-9234:


Added the stack trace of the exceptions.

Also, there is no other error in the RM log.

> NPE Exception Occurred on Resourcemanager
> -
>
> Key: YARN-9234
> URL: https://issues.apache.org/jira/browse/YARN-9234
> Project: Hadoop YARN
>  Issue Type: New Feature
>Affects Versions: 2.9.0
>Reporter: Amithsha
>Priority: Major
>
> 2019-01-24 14:52:17,893 FATAL event.EventDispatcher (?:?(?)) - Error in 
> handling event type NODE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:814)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:857)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:55)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:868)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1121)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1346)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1341)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1430)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1205)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1067)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1472)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:151)
>  at 
> org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>  at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9234) NPE Exception Occurred on Resourcemanager

2019-01-24 Thread Amithsha (JIRA)
Amithsha created YARN-9234:
--

 Summary: NPE Exception Occurred on Resourcemanager
 Key: YARN-9234
 URL: https://issues.apache.org/jira/browse/YARN-9234
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.9.0
Reporter: Amithsha


2019-01-24 14:52:17,893 FATAL event.EventDispatcher (?:?(?)) - Error in 
handling event type NODE_UPDATE to the Event Dispatcher
java.lang.NullPointerException
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:814)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:857)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:55)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:868)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:1121)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:734)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:558)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1346)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1341)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1430)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1205)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.nodeUpdate(CapacityScheduler.java:1067)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1472)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:151)
 at 
org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
 at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9227) DistributedShell RelativePath is not removed at end

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751022#comment-16751022
 ] 

Hadoop QA commented on YARN-9227:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 22s{color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-applications-distributedshell
 generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell:
 The patch generated 4 new + 205 unchanged - 0 fixed = 209 total (was 205) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
46s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m  4s{color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell
 |
|  |  Possible null pointer dereference of appMaster in 
org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(String[])
 on exception path  Dereferenced at ApplicationMaster.java:appMaster in 
org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(String[])
 on exception path  Dereferenced at ApplicationMaster.java:[line 405] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9227 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956116/0001-YARN-9227.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  

[jira] [Commented] (YARN-9086) [CSI] Run csi-driver-adaptor as aux service

2019-01-24 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751012#comment-16751012
 ] 

Hadoop QA commented on YARN-9086:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
37s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
0s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 30s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 3 new + 213 unchanged - 0 fixed = 216 total (was 213) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-csi 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
28s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
42s{color} | {color:green} hadoop-yarn-csi in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9086 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956110/YARN-9086.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux a07674d4bdb3 

[jira] [Comment Edited] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2019-01-24 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751008#comment-16751008
 ] 

Zhankun Tang edited comment on YARN-9060 at 1/24/19 10:56 AM:
--

[~cheersyang], [~sunilg], the patch consists of the following key things:

1. The native isolation module. It has a different c-e.cfg format from the 
GPU/FPGA modules due to the bug they have. See the 
[above|https://issues.apache.org/jira/browse/YARN-9060?focusedCommentId=16707359&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16707359]
 comments for a detailed explanation. The key change in the config is that we 
use "devices.denied-numbers" instead of "devices.allowed-numbers":
[devices]
  module.enabled=true
  devices.allowed-numbers=8:32 # this will be removed.
  devices.denied-numbers=8:48,8:16 # comma-separated major:minor. Empty means 
allow the default devices reported by the device plugin.
And the interface of this c-e module for the Java layer to invoke is:
c-e --module-devices \
  --excluded_devices b-8:32-rwm,c-195:0 \
  --allowed_devices 8:16,8:48,195:1 \
  --container_id container_x_y
2. DeviceResourceDockerRuntimePluginImpl.java, which bridges the 
DockerLinuxContainerRuntime and the vendor device plugin. The DeviceRuntimeSpec 
generated by the vendor device plugin's onDeviceAllocated will be used in this 
class and translated to internal YARN Docker volume or run commands.

3. A sample Nvidia GPU plugin which uses our new DevicePlugin interface.

I did an end-to-end test on an AWS EC2 instance with 1 GPU card. Please help 
review. Thanks!


was (Author: tangzhankun):
[~cheersyang], [~sunilg], the patch consists of the following key things:

1. The native isolation module. It has a different c-e.cfg format from the 
GPU/FPGA modules due to the bug they have. See the above comments for a 
detailed explanation. The key change in the config is that we use 
"devices.denied-numbers" instead of "devices.allowed-numbers":
[devices]
  module.enabled=true
  devices.allowed-numbers=8:32 # this will be removed.
  devices.denied-numbers=8:48,8:16 # comma-separated major:minor. Empty means 
allow the default devices reported by the device plugin.
And the interface of this c-e module for the Java layer to invoke is:
c-e --module-devices \
  --excluded_devices b-8:32-rwm,c-195:0 \
  --allowed_devices 8:16,8:48,195:1 \
  --container_id container_x_y
2. DeviceResourceDockerRuntimePluginImpl.java, which bridges the 
DockerLinuxContainerRuntime and the vendor device plugin. The DeviceRuntimeSpec 
generated by the vendor device plugin's onDeviceAllocated will be used in this 
class and translated to internal YARN Docker volume or run commands.

3. A sample Nvidia GPU plugin which uses our new DevicePlugin interface.

I did an end-to-end test on an AWS EC2 instance with 1 GPU card. Please help 
review. Thanks!

> [YARN-8851] Phase 1 - Support device isolation in native container-executor
> ---
>
> Key: YARN-9060
> URL: https://issues.apache.org/jira/browse/YARN-9060
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-9060-trunk.001.patch, YARN-9060-trunk.002.patch, 
> YARN-9060-trunk.003.patch, YARN-9060-trunk.004.patch, 
> YARN-9060-trunk.005.patch, YARN-9060-trunk.006.patch, 
> YARN-9060-trunk.007.patch, YARN-9060-trunk.008.patch, 
> YARN-9060-trunk.009.patch
>
>
> Due to the cgroups v1 implementation policy in the Linux kernel, we cannot 
> update the value of the devices cgroup controller unless we have root 
> permission 
> ([here|https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/security/device_cgroup.c#L604]).
> So we need to support this in container-executor for the Java layer to invoke.
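
For background, the devices cgroup (v1) is controlled by writing entries to 
its control files, and that write is exactly the operation that requires root. 
A hedged illustration (the cgroup path below is hypothetical; the entry format 
is the kernel's "type major:minor access" syntax):
{code}
# Illustrative only: deny block device 8:32 (read/write/mknod) for one
# container's cgroup. Writing devices.deny requires root, which is why the
# setuid container-executor must do it on behalf of the NodeManager.
echo 'b 8:32 rwm' > /sys/fs/cgroup/devices/hadoop-yarn/container_x_y/devices.deny
{code}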



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2019-01-24 Thread Zhankun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751008#comment-16751008
 ] 

Zhankun Tang commented on YARN-9060:


[~cheersyang], [~sunilg], the patch consists of the following key things:

1. The native isolation module. It has a different c-e.cfg format from the 
GPU/FPGA modules due to the bug they have. See the above comments for a 
detailed explanation. The key change in the config is that we use 
"devices.denied-numbers" instead of "devices.allowed-numbers":
[devices]
  module.enabled=true
  devices.allowed-numbers=8:32 # this will be removed.
  devices.denied-numbers=8:48,8:16 # comma-separated major:minor. Empty means 
allow the default devices reported by the device plugin.
And the interface of this c-e module for the Java layer to invoke is:
c-e --module-devices \
  --excluded_devices b-8:32-rwm,c-195:0 \
  --allowed_devices 8:16,8:48,195:1 \
  --container_id container_x_y
2. DeviceResourceDockerRuntimePluginImpl.java, which bridges the 
DockerLinuxContainerRuntime and the vendor device plugin. The DeviceRuntimeSpec 
generated by the vendor device plugin's onDeviceAllocated will be used in this 
class and translated to internal YARN Docker volume or run commands.

3. A sample Nvidia GPU plugin which uses our new DevicePlugin interface.

I did an end-to-end test on an AWS EC2 instance with 1 GPU card. Please help 
review. Thanks!

> [YARN-8851] Phase 1 - Support device isolation in native container-executor
> ---
>
> Key: YARN-9060
> URL: https://issues.apache.org/jira/browse/YARN-9060
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-9060-trunk.001.patch, YARN-9060-trunk.002.patch, 
> YARN-9060-trunk.003.patch, YARN-9060-trunk.004.patch, 
> YARN-9060-trunk.005.patch, YARN-9060-trunk.006.patch, 
> YARN-9060-trunk.007.patch, YARN-9060-trunk.008.patch, 
> YARN-9060-trunk.009.patch
>
>
> Due to the cgroups v1 implementation policy in the Linux kernel, we cannot 
> update the value of the devices cgroup controller unless we have root 
> permission 
> ([here|https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/security/device_cgroup.c#L604]).
> So we need to support this in container-executor for the Java layer to invoke.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8498) Yarn NodeManager OOM Listener Fails Compilation on Ubuntu 18.04

2019-01-24 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751004#comment-16751004
 ] 

Sunil Govindan commented on YARN-8498:
--

This should be fine. 

Could we get this in?

> Yarn NodeManager OOM Listener Fails Compilation on Ubuntu 18.04
> ---
>
> Key: YARN-8498
> URL: https://issues.apache.org/jira/browse/YARN-8498
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jack Bearden
>Assignee: Ayush Saxena
>Priority: Blocker
> Attachments: YARN-8498-02.patch, YARN-8948-01.patch
>
>
> While building this project, I ran into a few compilation errors here. The 
> first one was in this file:
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/impl/oom_listener_main.c
> At the very end, during the compilation of the OOM test, it fails again:
>  
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc:256:7:
>  error: ‘__WAIT_STATUS’ was not declared in this scope
>  __WAIT_STATUS mem_hog_status = {};
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc:257:30:
>  error: ‘mem_hog_status’ was not declared in this scope
>  __pid_t exited0 = wait(mem_hog_status);
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc:275:21:
>  error: expected ‘;’ before ‘oom_listener_status’
>  __WAIT_STATUS oom_listener_status = {};
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc:276:30:
>  error: ‘oom_listener_status’ was not declared in this scope
>  __pid_t exited1 = wait(oom_listener_status);
>  
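
For reference, __WAIT_STATUS is a legacy glibc typedef that newer glibc 
releases (such as the one shipped with Ubuntu 18.04) no longer expose; the 
portable POSIX form passes a plain int* to wait(2). A minimal sketch of the 
portable pattern (illustrative, not necessarily the attached patch):
{code}
#include <sys/types.h>
#include <sys/wait.h>

/* Portable replacement for the removed __WAIT_STATUS typedef:
   per POSIX, wait(2) takes a plain int*, not a union pointer. */
static void wait_for_children(void) {
  int mem_hog_status = 0;
  pid_t exited0 = wait(&mem_hog_status);

  int oom_listener_status = 0;
  pid_t exited1 = wait(&oom_listener_status);

  (void) exited0;
  (void) exited1;
}
{code}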



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9227) DistributedShell RelativePath is not removed at end

2019-01-24 Thread Feng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750991#comment-16750991
 ] 

Feng Yuan commented on YARN-9227:
-

Thanks [~Prabhu Joseph] for raising this issue; I can reproduce it.
We should fix this leak, [~jlowe].

> DistributedShell RelativePath is not removed at end
> ---
>
> Key: YARN-9227
> URL: https://issues.apache.org/jira/browse/YARN-9227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: 0001-YARN-9227.patch
>
>
> The DistributedShell job does not remove the relative path, which contains 
> the jars and localized files.
> {code}
> [ambari-qa@ash hadoop-yarn]$ hadoop fs -ls 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017
> Found 2 items
> -rw-r--r--   3 ambari-qa hdfs  46636 2019-01-23 13:37 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017/AppMaster.jar
> -rwx--x---   3 ambari-qa hdfs  4 2019-01-23 13:37 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017/shellCommands
> {code}
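
A minimal sketch of the kind of cleanup this implies, assuming only the 
standard Hadoop FileSystem API (appName and appId stand in for the client's 
actual values; this is illustrative, not the attached patch):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical cleanup (illustrative, not the attached patch): remove the
// per-application staging directory once the job finishes.
void cleanupAppDir(Configuration conf, String appName, String appId)
    throws java.io.IOException {
  FileSystem fs = FileSystem.get(conf);
  Path appDir = new Path(fs.getHomeDirectory(), appName + "/" + appId);
  if (fs.exists(appDir)) {
    fs.delete(appDir, true); // recursive: removes the jars and localized files
  }
}
{code}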



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (YARN-9227) DistributedShell RelativePath is not removed at end

2019-01-24 Thread Feng Yuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Yuan updated YARN-9227:

Comment: was deleted

(was: Thanks [~Prabhu Joseph] for raising this issue; I can reproduce it.
We should fix this leak, [~jlowe].)

> DistributedShell RelativePath is not removed at end
> ---
>
> Key: YARN-9227
> URL: https://issues.apache.org/jira/browse/YARN-9227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.1.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: 0001-YARN-9227.patch
>
>
> The DistributedShell job does not remove the relative path, which contains 
> the jars and localized files.
> {code}
> [ambari-qa@ash hadoop-yarn]$ hadoop fs -ls 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017
> Found 2 items
> -rw-r--r--   3 ambari-qa hdfs  46636 2019-01-23 13:37 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017/AppMaster.jar
> -rwx--x---   3 ambari-qa hdfs  4 2019-01-23 13:37 
> /user/ambari-qa/DistributedShell/application_1542665708563_0017/shellCommands
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9060) [YARN-8851] Phase 1 - Support device isolation in native container-executor

2019-01-24 Thread Zhankun Tang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhankun Tang updated YARN-9060:
---
Attachment: YARN-9060-trunk.009.patch

> [YARN-8851] Phase 1 - Support device isolation in native container-executor
> ---
>
> Key: YARN-9060
> URL: https://issues.apache.org/jira/browse/YARN-9060
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Zhankun Tang
>Assignee: Zhankun Tang
>Priority: Major
> Attachments: YARN-9060-trunk.001.patch, YARN-9060-trunk.002.patch, 
> YARN-9060-trunk.003.patch, YARN-9060-trunk.004.patch, 
> YARN-9060-trunk.005.patch, YARN-9060-trunk.006.patch, 
> YARN-9060-trunk.007.patch, YARN-9060-trunk.008.patch, 
> YARN-9060-trunk.009.patch
>
>
> Due to the cgroups v1 implementation policy in the Linux kernel, we cannot 
> update the value of the devices cgroup controller unless we have root 
> permission 
> ([here|https://github.com/torvalds/linux/blob/6f0d349d922ba44e4348a17a78ea51b7135965b1/security/device_cgroup.c#L604]).
> So we need to support this in container-executor for the Java layer to invoke.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-01-24 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750977#comment-16750977
 ] 

Bilwa S T commented on YARN-9233:
-

cc [~bibinchundatt] [~cheersyang] 

> RM may report allocated container which is killed (but not acquired by AM ) 
> to AM which can cause spark AM confused
> ---
>
> Key: YARN-9233
> URL: https://issues.apache.org/jira/browse/YARN-9233
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>
> After the RM kills an allocated (ALLOCATED state) container for various 
> reasons, the container goes through the state transition process to the 
> FINISHED state just like containers in other states. Currently the RM doesn't 
> consider whether the container was acquired by the AM, so all containers 
> transitioned to the FINISHED state are added to the justFinishedContainers 
> list. Therefore a container that was never obtained by the AM and was killed 
> by the RM is also returned through the AM heartbeat. As a result, the AM 
> re-applies for more resources than needed, which would eventually cause the 
> number of containers to exceed the maximum limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9233) RM may report allocated container which is killed (but not acquired by AM ) to AM which can cause spark AM confused

2019-01-24 Thread Bilwa S T (JIRA)
Bilwa S T created YARN-9233:
---

 Summary: RM may report allocated container which is killed (but 
not acquired by AM ) to AM which can cause spark AM confused
 Key: YARN-9233
 URL: https://issues.apache.org/jira/browse/YARN-9233
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bilwa S T


After the RM kills an allocated (ALLOCATED state) container for various 
reasons, the container goes through the state transition process to the 
FINISHED state just like containers in other states. Currently the RM doesn't 
consider whether the container was acquired by the AM, so all containers 
transitioned to the FINISHED state are added to the justFinishedContainers 
list. Therefore a container that was never obtained by the AM and was killed 
by the RM is also returned through the AM heartbeat. As a result, the AM 
re-applies for more resources than needed, which would eventually cause the 
number of containers to exceed the maximum limit.
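
A minimal sketch of the kind of guard this suggests (illustrative only; 
wasAcquiredByAM and onContainerFinished are hypothetical names, not the actual 
RM code):
{code}
// Hypothetical guard (illustrative, not actual RM code): only report a
// finished container back to the AM if the AM actually acquired it earlier.
void onContainerFinished(RMContainer container) {
  if (container.wasAcquiredByAM()) {        // hypothetical accessor
    justFinishedContainers.add(container);  // reported on the next AM heartbeat
  }
  // Otherwise the AM never saw this container; reporting it would make the
  // AM re-request resources it never counted as allocated.
}
{code}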



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


