[jira] [Updated] (YARN-9838) Fix resource inconsistency for queues when moving app with reserved container to another queue

2019-11-21 Thread Tao Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-9838:
---
Summary: Fix resource inconsistency for queues when moving app with 
reserved container to another queue  (was: Using the CapacityScheduler,Apply 
"movetoqueue" on the application which CS reserved containers for,will cause 
"Num Container" and "Used Resource" in ResourceUsage metrics error )

> Fix resource inconsistency for queues when moving app with reserved container 
> to another queue
> --
>
> Key: YARN-9838
> URL: https://issues.apache.org/jira/browse/YARN-9838
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 2.7.3
>Reporter: jiulongzhu
>Assignee: jiulongzhu
>Priority: Critical
>  Labels: patch
> Attachments: RM_UI_metric_negative.png, RM_UI_metric_positive.png, 
> YARN-9838.0001.patch, YARN-9838.0002.patch
>
>
>       In some of our clusters, we are seeing "Used Resource", "Used 
> Capacity", "Absolute Used Capacity" and "Num Container" stuck at positive or 
> negative values even when the queue is completely idle (no RUNNING, no NEW 
> apps, ...). In extreme cases, apps can no longer be submitted to a queue that 
> is actually idle because its "Used Resource" is far above zero, much like a 
> "Container Leak".
>       Firstly, I found that "Used Resource", "Used Capacity" and "Absolute 
> Used Capacity" are derived from the "used" value of the ResourceUsage object 
> kept by AbstractCSQueue, while "Num Container" comes from the "numContainer" 
> value kept by LeafQueue, and that AbstractCSQueue#allocateResource and 
> AbstractCSQueue#releaseResource update both. Secondly, by comparing how 
> numContainer, ResourceUsageByLabel and QueueMetrics change (#allocateContainer 
> and #releaseContainer) for applications with and without "movetoqueue", I 
> found that moving an application's reservedContainers from one queue to 
> another updates neither the "numContainer" value in AbstractCSQueue nor the 
> "used" value in ResourceUsage.
>         The table below shows how the metrics change as a reservedContainer 
> is allocated in the $FROM queue, moved from the $FROM queue to the $TO queue, 
> and released. The increases and decreases do not balance out: the resource is 
> allocated in the $FROM queue but released in the $TO queue.
> ||move reservedContainer||allocate||movetoqueue||release||
> |numContainer|increase in $FROM queue|{color:#FF0000}$FROM queue stays the 
> same, $TO queue stays the same{color}|decrease in $TO queue|
> |ResourceUsageByLabel(USED)|increase in $FROM queue|{color:#FF0000}$FROM 
> queue stays the same, $TO queue stays the same{color}|decrease in $TO queue|
> |QueueMetrics|increase in $FROM queue|decrease in $FROM queue, increase in 
> $TO queue|decrease in $TO queue|
>       By contrast, the metric changes for allocatedContainers (allocated, 
> acquired, running) across allocate, movetoqueue and release are fully 
> balanced.
>    
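> The following is a minimal, self-contained sketch of the bookkeeping the move 
> path is missing. The class and field names are illustrative stand-ins for 
> AbstractCSQueue/LeafQueue state, not the actual CapacityScheduler code:
> {code:java}
> // "Queue" stands in for the per-queue state kept by AbstractCSQueue/LeafQueue.
> class Queue {
>   long usedMemoryMb;   // stand-in for the ResourceUsage "used" value
>   int numContainers;   // stand-in for LeafQueue's numContainer value
> }
> 
> class MoveReservedSketch {
>   // On movetoqueue, reserved containers must be detached from the source
>   // queue and attached to the target queue, mirroring what
>   // allocateResource()/releaseResource() do on allocate and release.
>   static void moveReserved(Queue from, Queue to, long reservedMb) {
>     from.usedMemoryMb -= reservedMb;
>     from.numContainers -= 1;
>     to.usedMemoryMb += reservedMb;
>     to.numContainers += 1;
>   }
> }
> {code}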



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7589) TestPBImplRecords fails with NullPointerException

2019-11-21 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979582#comment-16979582
 ] 

Jonathan Hung edited comment on YARN-7589 at 11/21/19 11:40 PM:


Pushed to branch-2 / branch-2.10


was (Author: jhung):
Pushed to branch-2

> TestPBImplRecords fails with NullPointerException
> -
>
> Key: YARN-7589
> URL: https://issues.apache.org/jira/browse/YARN-7589
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0, 3.1.0, 3.0.1
>Reporter: Jason Darrell Lowe
>Assignee: Daniel Templeton
>Priority: Major
> Fix For: 3.0.0, 3.1.0, 3.0.1, 2.10.1, 2.11.0
>
> Attachments: YARN-7589.001.patch
>
>
> TestPBImplRecords is failing consistently in trunk:
> {noformat}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.413 
> s <<< FAILURE! - in org.apache.hadoop.yarn.api.TestPBImplRecords
> [ERROR] org.apache.hadoop.yarn.api.TestPBImplRecords  Time elapsed: 0.413 s  
> <<< ERROR!
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.yarn.api.BasePBImplRecordsTest.generateByNewInstance(BasePBImplRecordsTest.java:151)
>   at 
> org.apache.hadoop.yarn.api.TestPBImplRecords.setup(TestPBImplRecords.java:371)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.util.resource.ResourceUtils.createResourceTypesArray(ResourceUtils.java:644)
>   at 
> org.apache.hadoop.yarn.api.records.Resource.newInstance(Resource.java:105)
>   ... 23 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7589) TestPBImplRecords fails with NullPointerException

2019-11-21 Thread Jonathan Hung (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-7589:

Fix Version/s: 2.10.1

> TestPBImplRecords fails with NullPointerException
> -
>
> Key: YARN-7589
> URL: https://issues.apache.org/jira/browse/YARN-7589
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0, 3.1.0, 3.0.1
>Reporter: Jason Darrell Lowe
>Assignee: Daniel Templeton
>Priority: Major
> Fix For: 3.0.0, 3.1.0, 3.0.1, 2.10.1, 2.11.0
>
> Attachments: YARN-7589.001.patch
>
>
> TestPBImplRecords is failing consistently in trunk:
> {noformat}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.413 
> s <<< FAILURE! - in org.apache.hadoop.yarn.api.TestPBImplRecords
> [ERROR] org.apache.hadoop.yarn.api.TestPBImplRecords  Time elapsed: 0.413 s  
> <<< ERROR!
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.yarn.api.BasePBImplRecordsTest.generateByNewInstance(BasePBImplRecordsTest.java:151)
>   at 
> org.apache.hadoop.yarn.api.TestPBImplRecords.setup(TestPBImplRecords.java:371)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.util.resource.ResourceUtils.createResourceTypesArray(ResourceUtils.java:644)
>   at 
> org.apache.hadoop.yarn.api.records.Resource.newInstance(Resource.java:105)
>   ... 23 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8202) DefaultAMSProcessor should properly check units of requested custom resource types against minimum/maximum allocation

2019-11-21 Thread Eric Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-8202:
-
Fix Version/s: 2.10.1

> DefaultAMSProcessor should properly check units of requested custom resource 
> types against minimum/maximum allocation
> -
>
> Key: YARN-8202
> URL: https://issues.apache.org/jira/browse/YARN-8202
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Blocker
> Fix For: 3.1.1, 2.10.1, 2.11.0
>
> Attachments: YARN-8202-001.patch, YARN-8202-002.patch, 
> YARN-8202-003.patch, YARN-8202-004.patch, YARN-8202-005.patch, 
> YARN-8202-006.patch, YARN-8202-007.patch, YARN-8202-008.patch, 
> YARN-8202-009.patch, YARN-8202-010.patch
>
>
>  
> When I execute a pi job with arguments: 
> {code:java}
> -Dmapreduce.map.resource.memory-mb=200 
> -Dmapreduce.map.resource.resource1=500M 1 1000{code}
> and I have one node with 5GB of resource1, I get the following exception 
> every second and the job hangs:
> {code:java}
> 2018-04-24 08:42:03,694 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 20 on 8030, call Call#386 Retry#0 
> org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from 
> 172.31.119.172:58138
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid 
> resource request, requested resource type=[resource1] < 0 or greater than 
> maximum allowed allocation. Requested resource=<..., resource1: 500M>, maximum 
> allowed allocation=<..., resource1: 5G>, please note that maximum allowed 
> allocation is calculated by scheduler based on maximum resource of registered 
> NodeManagers, which might be less than configured maximum 
> allocation=<..., resource1: 9223372036854775807G>
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:286)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:242)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:258)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:249)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:230)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:433)
>         at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
>         at 
> org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
> {code}
> *This is because 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils#validateResourceRequest
>  does not take resource units into account.*
>  
> However, if I start a job with arguments: 
> {code:java}
> -Dmapreduce.map.resource.memory-mb=200 -Dmapreduce.map.resource.resource1=1G 
> 1 1000{code}
> and I still have 5GB of resource1 on one node then the job runs successfully.
>  
> I also tried a third run: I requested 1GB of resource1 while no node had any 
> amount of resource1, then restarted the node that has 5GB of resource1. The 
> job ultimately completed, but only after the node with enough resources 
> registered with the RM, which is the desired behaviour.
>  
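> A self-contained sketch of the unit problem (the real fix would use 
> org.apache.hadoop.yarn.util.UnitsConversionUtil; the converter below is a 
> simplified stand-in that only knows the SI units "M" and "G"):
> {code:java}
> class UnitSketch {
>   // Normalize a value to mega units before comparing. Hadoop's
>   // UnitsConversionUtil distinguishes SI units ("M", "G") from binary
>   // units ("Mi", "Gi"); this stand-in handles only the SI pair.
>   static long toMega(long value, String unit) {
>     switch (unit) {
>       case "M": return value;
>       case "G": return value * 1000;
>       default: throw new IllegalArgumentException("unknown unit " + unit);
>     }
>   }
> 
>   public static void main(String[] args) {
>     // Comparing raw numbers concludes 500 > 5 and rejects the request;
>     // comparing in a common unit shows 500M is well below 5G.
>     System.out.println(500 > 5);                           // true (wrong check)
>     System.out.println(toMega(500, "M") > toMega(5, "G")); // false (correct)
>   }
> }
> {code}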



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Updated] (YARN-7541) Node updates don't update the maximum cluster capability for resources other than CPU and memory

2019-11-21 Thread Eric Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-7541:
-
Fix Version/s: 2.10.1

> Node updates don't update the maximum cluster capability for resources other 
> than CPU and memory
> 
>
> Key: YARN-7541
> URL: https://issues.apache.org/jira/browse/YARN-7541
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 3.0.0-beta1, 3.1.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
> Fix For: 3.0.0, 3.1.0, 2.10.1, 2.11.0
>
> Attachments: YARN-7541.001.patch, YARN-7541.002.patch, 
> YARN-7541.003.patch, YARN-7541.004.patch, YARN-7541.005.patch, 
> YARN-7541.006.patch, YARN-7541.branch-3.0.001.patch
>
>
> When I submit an MR job that asks for too much memory or CPU for the map or 
> reduce, the AM will fail because it recognizes that the request is too large. 
>  With any other resources, however, the resource requests will instead be 
> made and remain pending forever.  Looks like we forgot to update the code 
> that tracks the maximum container allocation in {{ClusterNodeTracker}}.
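> A minimal sketch of the intended bookkeeping (illustrative only; in YARN this 
> logic lives in ClusterNodeTracker and operates on Resource records rather 
> than arrays): the maximum container allocation must be the component-wise 
> maximum over all resource types, not just memory and CPU.
> {code:java}
> import java.util.List;
> 
> class MaxAllocationSketch {
>   // nodeCapacities: one long[] per node, indexed by resource type
>   // (memory, vcores, then any custom types).
>   static long[] maxAllocation(List<long[]> nodeCapacities, int numTypes) {
>     long[] max = new long[numTypes];
>     for (long[] node : nodeCapacities) {
>       for (int i = 0; i < numTypes; i++) {  // every type, incl. custom ones
>         max[i] = Math.max(max[i], node[i]);
>       }
>     }
>     return max;
>   }
> }
> {code}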



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7739) DefaultAMSProcessor should properly check customized resource types against minimum/maximum allocation

2019-11-21 Thread Eric Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-7739:
-
Fix Version/s: 2.10.1

> DefaultAMSProcessor should properly check customized resource types against 
> minimum/maximum allocation
> --
>
> Key: YARN-7739
> URL: https://issues.apache.org/jira/browse/YARN-7739
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Fix For: 3.1.0, 2.10.1, 2.11.0
>
> Attachments: YARN-7339.002.patch, YARN-7739.001.patch
>
>
> Currently, the YARN RM rejects a requested resource if its memory or vcores 
> are less than 0 or greater than the maximum allocation. We should run the 
> same check for customized resource types as well.
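> A hedged sketch of the intended check (simplified; the real validation lives 
> in SchedulerUtils#validateResourceRequest, and unit conversion is omitted 
> here; see YARN-8202 for that):
> {code:java}
> import org.apache.hadoop.yarn.api.records.Resource;
> import org.apache.hadoop.yarn.api.records.ResourceInformation;
> 
> class ValidateSketch {
>   // Every resource type in the request, not just memory and vcores,
>   // must fall within [0, maximumAllocation].
>   static boolean isValid(Resource requested, Resource maximum) {
>     for (ResourceInformation ri : requested.getResources()) {
>       long max = maximum.getResourceInformation(ri.getName()).getValue();
>       if (ri.getValue() < 0 || ri.getValue() > max) {
>         return false;  // this resource type is out of range
>       }
>     }
>     return true;
>   }
> }
> {code}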



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7589) TestPBImplRecords fails with NullPointerException

2019-11-21 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979582#comment-16979582
 ] 

Jonathan Hung commented on YARN-7589:
-

Pushed to branch-2

> TestPBImplRecords fails with NullPointerException
> -
>
> Key: YARN-7589
> URL: https://issues.apache.org/jira/browse/YARN-7589
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0, 3.1.0, 3.0.1
>Reporter: Jason Darrell Lowe
>Assignee: Daniel Templeton
>Priority: Major
> Fix For: 3.0.0, 3.1.0, 3.0.1, 2.11.0
>
> Attachments: YARN-7589.001.patch
>
>
> TestPBImplRecords is failing consistently in trunk:
> {noformat}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.413 
> s <<< FAILURE! - in org.apache.hadoop.yarn.api.TestPBImplRecords
> [ERROR] org.apache.hadoop.yarn.api.TestPBImplRecords  Time elapsed: 0.413 s  
> <<< ERROR!
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.yarn.api.BasePBImplRecordsTest.generateByNewInstance(BasePBImplRecordsTest.java:151)
>   at 
> org.apache.hadoop.yarn.api.TestPBImplRecords.setup(TestPBImplRecords.java:371)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.util.resource.ResourceUtils.createResourceTypesArray(ResourceUtils.java:644)
>   at 
> org.apache.hadoop.yarn.api.records.Resource.newInstance(Resource.java:105)
>   ... 23 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7589) TestPBImplRecords fails with NullPointerException

2019-11-21 Thread Jonathan Hung (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-7589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-7589:

Fix Version/s: 2.11.0

> TestPBImplRecords fails with NullPointerException
> -
>
> Key: YARN-7589
> URL: https://issues.apache.org/jira/browse/YARN-7589
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0, 3.1.0, 3.0.1
>Reporter: Jason Darrell Lowe
>Assignee: Daniel Templeton
>Priority: Major
> Fix For: 3.0.0, 3.1.0, 3.0.1, 2.11.0
>
> Attachments: YARN-7589.001.patch
>
>
> TestPBImplRecords is failing consistently in trunk:
> {noformat}
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.413 
> s <<< FAILURE! - in org.apache.hadoop.yarn.api.TestPBImplRecords
> [ERROR] org.apache.hadoop.yarn.api.TestPBImplRecords  Time elapsed: 0.413 s  
> <<< ERROR!
> java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.yarn.api.BasePBImplRecordsTest.generateByNewInstance(BasePBImplRecordsTest.java:151)
>   at 
> org.apache.hadoop.yarn.api.TestPBImplRecords.setup(TestPBImplRecords.java:371)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:160)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:373)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:334)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:119)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:407)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.util.resource.ResourceUtils.createResourceTypesArray(ResourceUtils.java:644)
>   at 
> org.apache.hadoop.yarn.api.records.Resource.newInstance(Resource.java:105)
>   ... 23 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8842) Expose metrics for custom resource types in QueueMetrics

2019-11-21 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979576#comment-16979576
 ] 

Jonathan Hung commented on YARN-8842:
-

Thanks [~epayne], no objections from me!

> Expose metrics for custom resource types in QueueMetrics
> 
>
> Key: YARN-8842
> URL: https://issues.apache.org/jira/browse/YARN-8842
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4, 2.11.0
>
> Attachments: YARN-8842-branch-2.001.patch, 
> YARN-8842-branch-2.002.patch, YARN-8842-branch-2.003.patch, 
> YARN-8842.001.patch, YARN-8842.002.patch, YARN-8842.003.patch, 
> YARN-8842.004.patch, YARN-8842.005.patch, YARN-8842.006.patch, 
> YARN-8842.007.patch, YARN-8842.008.patch, YARN-8842.009.patch, 
> YARN-8842.010.patch, YARN-8842.011.patch, YARN-8842.012.patch
>
>
> This is the 2nd dependent jira of YARN-8059.
> As updating the metrics is an independent step from handling preemption, this 
> jira only deals with the queue metrics update of custom resources.
> The following metrics should be updated: 
> * allocated resources
> * available resources
> * pending resources
> * reserved resources
> * aggregate seconds preempted
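> A rough sketch of how per-resource-type gauges could be kept (hedged; this is 
> not the patch itself, and gauge names such as "allocated_<type>" are 
> illustrative), using Hadoop's metrics2 library:
> {code:java}
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import org.apache.hadoop.metrics2.lib.MetricsRegistry;
> import org.apache.hadoop.metrics2.lib.MutableGaugeLong;
> 
> class CustomResourceMetricsSketch {
>   private final MetricsRegistry registry = new MetricsRegistry("QueueMetrics");
>   private final Map<String, MutableGaugeLong> allocated =
>       new ConcurrentHashMap<>();
> 
>   // Lazily register one gauge per resource type (e.g. yarn.io/gpu).
>   private MutableGaugeLong allocatedGauge(String type) {
>     return allocated.computeIfAbsent(type,
>         t -> registry.newGauge("allocated_" + t, "Allocated " + t, 0L));
>   }
> 
>   void onAllocate(Map<String, Long> allocation) {
>     allocation.forEach((type, value) -> allocatedGauge(type).incr(value));
>   }
> }
> {code}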



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8842) Expose metrics for custom resource types in QueueMetrics

2019-11-21 Thread Eric Payne (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-8842:
-
Fix Version/s: 2.11.0

> Expose metrics for custom resource types in QueueMetrics
> 
>
> Key: YARN-8842
> URL: https://issues.apache.org/jira/browse/YARN-8842
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4, 2.11.0
>
> Attachments: YARN-8842-branch-2.001.patch, 
> YARN-8842-branch-2.002.patch, YARN-8842-branch-2.003.patch, 
> YARN-8842.001.patch, YARN-8842.002.patch, YARN-8842.003.patch, 
> YARN-8842.004.patch, YARN-8842.005.patch, YARN-8842.006.patch, 
> YARN-8842.007.patch, YARN-8842.008.patch, YARN-8842.009.patch, 
> YARN-8842.010.patch, YARN-8842.011.patch, YARN-8842.012.patch
>
>
> This is the 2nd dependent jira of YARN-8059.
> As updating the metrics is an independent step from handling preemption, this 
> jira only deals with the queue metrics update of custom resources.
> The following metrics should be updated: 
> * allocated resources
> * available resources
> * pending resources
> * reserved resources
> * aggregate seconds preempted



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8842) Expose metrics for custom resource types in QueueMetrics

2019-11-21 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979573#comment-16979573
 ] 

Eric Payne commented on YARN-8842:
--

Thanks [~jhung] and [~snemeth] for all of your work on this JIRA. I committed 
YARN-8842-branch-2.003.patch to branch-2.

I would also like to port this back to branch-2.10. [~jhung], if you don't have 
any objections, I will cherry-pick YARN-7739, YARN-7541, and YARN-8202, which 
are prerequisites.

> Expose metrics for custom resource types in QueueMetrics
> 
>
> Key: YARN-8842
> URL: https://issues.apache.org/jira/browse/YARN-8842
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-8842-branch-2.001.patch, 
> YARN-8842-branch-2.002.patch, YARN-8842-branch-2.003.patch, 
> YARN-8842.001.patch, YARN-8842.002.patch, YARN-8842.003.patch, 
> YARN-8842.004.patch, YARN-8842.005.patch, YARN-8842.006.patch, 
> YARN-8842.007.patch, YARN-8842.008.patch, YARN-8842.009.patch, 
> YARN-8842.010.patch, YARN-8842.011.patch, YARN-8842.012.patch
>
>
> This is the 2nd dependent jira of YARN-8059.
> As updating the metrics is an independent step from handling preemption, this 
> jira only deals with the queue metrics update of custom resources.
> The following metrics should be updated: 
> * allocated resources
> * available resources
> * pending resources
> * reserved resources
> * aggregate seconds preempted



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2019-11-21 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979553#comment-16979553
 ] 

Hadoop QA commented on YARN-6492:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 34s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 56 new + 176 unchanged - 4 fixed = 232 total (was 180) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 11 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 37s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
19s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 11s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
44s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}147m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Dead store to metrics in 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.getPartitionQueueMetrics(String)
  At 
QueueMetrics.java:org.apache.hadoop.yarn.server.resourcemanager.scheduler.QueueMetrics.getPartitionQueueMetrics(String)
  At QueueMetrics.java:[line 313] |
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.TestQueueMetricsForCustomResources 
|
|   | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-6492 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12986440/YARN-6492.007.WIP.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs 

[jira] [Commented] (YARN-9444) YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource

2019-11-21 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979534#comment-16979534
 ] 

Hadoop QA commented on YARN-9444:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
50s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 
35s{color} | {color:red} hadoop-yarn in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 
31s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 31s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  8s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 15 unchanged - 0 fixed = 17 total (was 15) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  0s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
44s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
38s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 72m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9444 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12986437/YARN-9444.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 276cad7ac9d6 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 98d249d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| compile | 

[jira] [Commented] (YARN-8842) Expose metrics for custom resource types in QueueMetrics

2019-11-21 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979457#comment-16979457
 ] 

Eric Payne commented on YARN-8842:
--

Thanks for the patch, [~jhung].

+1

I will run some more tests and commit to branch-2 and branch-2.10 if no 
objections.

> Expose metrics for custom resource types in QueueMetrics
> 
>
> Key: YARN-8842
> URL: https://issues.apache.org/jira/browse/YARN-8842
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-8842-branch-2.001.patch, 
> YARN-8842-branch-2.002.patch, YARN-8842-branch-2.003.patch, 
> YARN-8842.001.patch, YARN-8842.002.patch, YARN-8842.003.patch, 
> YARN-8842.004.patch, YARN-8842.005.patch, YARN-8842.006.patch, 
> YARN-8842.007.patch, YARN-8842.008.patch, YARN-8842.009.patch, 
> YARN-8842.010.patch, YARN-8842.011.patch, YARN-8842.012.patch
>
>
> This is the 2nd dependent jira of YARN-8059.
> As updating the metrics is an independent step from handling preemption, this 
> jira only deals with the queue metrics update of custom resources.
> The following metrics should be updated: 
> * allocated resources
> * available resources
> * pending resources
> * reserved resources
> * aggregate seconds preempted



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9444) YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource

2019-11-21 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979445#comment-16979445
 ] 

Hadoop QA commented on YARN-9444:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
 9s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 
33s{color} | {color:red} hadoop-yarn in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 53s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 
29s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 29s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  8s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 15 unchanged - 0 fixed = 17 total (was 15) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 53s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
45s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
59s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 74m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9444 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12986437/YARN-9444.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7dc84359cb8e 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 98d249d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| compile | 

[jira] [Updated] (YARN-6492) Generate queue metrics for each partition

2019-11-21 Thread Manikandan R (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-6492:
---
Attachment: YARN-6492.007.WIP.patch

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, 
> YARN-6492.007.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in default 
> partition or across all partitions. (After YARN-6467 it will be in default 
> partition)
> But having the partition metrics would be very useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2019-11-21 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979418#comment-16979418
 ] 

Manikandan R commented on YARN-6492:


{quote}in pSourceName, how come we split partition by Q_SPLITTER? I think we 
don't need to do any splitting here (there should only be one partition){quote}
Made changes.
{quote}Do we need a separate getPartitionMetrics? Can we track a partition's 
metrics via that partition + root queue?{quote}
I tried to incorporate the getPartitionMetrics functionality inside 
getPartitionQueueMetrics, but that ended up with many if-else blocks and was 
neither clean nor elegant, even though a single method would reduce changes on 
the caller side. I then took a different approach: an overridden 
getPartitionQueueMetrics inside the PartitionQueueMetrics class, analogous to 
getUserMetrics, which covers the same functionality in a much better organised 
fashion. Based on the concrete metrics object, the appropriate method is chosen 
at runtime because the methods are overridden. I found this approach better 
than the earlier one and have incorporated it in the .007.WIP.patch. Thoughts?
{quote}For setAvailableResourcesToUser - how come we add this bit?{quote}
In addition to my earlier comment: yes, it is not correct behaviour, and the 
same has been addressed in the YARN-9767 WIP patch.
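A minimal sketch of the overriding approach described above (class and method 
names are illustrative):
{code:java}
// The base class supplies the default-partition behaviour; the partition
// subclass overrides it, so callers need no if-else on the concrete type:
// the right method is chosen at runtime through overriding.
class QueueMetricsSketch {
  QueueMetricsSketch getPartitionQueueMetrics(String partition) {
    return this;  // default-partition behaviour
  }
}

class PartitionQueueMetricsSketch extends QueueMetricsSketch {
  @Override
  QueueMetricsSketch getPartitionQueueMetrics(String partition) {
    // Partition-specific lookup/creation would go here.
    return this;
  }
}
{code}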

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, 
> YARN-6492.007.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object which captures metrics either in default 
> partition or across all partitions. (After YARN-6467 it will be in default 
> partition)
> But having the partition metrics would be very useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9444) YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource

2019-11-21 Thread Gergely Pollak (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979404#comment-16979404
 ] 

Gergely Pollak commented on YARN-9444:
--

Thank you [~snemeth] for the feedback. I've changed the testcase according to 
your suggestions. However, the hamcrest-library wasn't available in the 
project, and I didn't want to add it just for this case; I also don't think it 
would have been helpful, since the elements of the returned list are not 
simple Strings but ResourceInformation objects, so it would have been quite a 
hassle to create a ResourceInformation list manually just to use the 
containsInAnyOrder method. Instead, I implemented this check with a regular 
loop and a HashSet.

> YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize 
> yarn.io/gpu as a valid resource
> --
>
> Key: YARN-9444
> URL: https://issues.apache.org/jira/browse/YARN-9444
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Minor
> Attachments: YARN-9444.001.patch, YARN-9444.002.patch
>
>
> The original issue was that the jobclient test did not send the requested 
> resource type when it was specified on the command line, e.g.:
> {code:java}
> hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep 
> -Dmapreduce.reduce.resource.yarn.io/gpu=1  -m 10 -r 1 -mt 9
> {code}
> After some investigation, it turned out that this only affects resource types 
> whose names contain '.' characters, and that the root cause is the regexp in 
> the getRequestedResourcesFromConfig method.
> {code:java}
> "^" + Pattern.quote(prefix) + "[^.]+$"
> {code}
> This regexp explicitly forbids any dots in the resource type name, which is 
> inconsistent with the default resource type for gpu and fpga, which are 
> yarn.io/gpu and yarn.io/fpga respectively.
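> A self-contained demo of the regexp problem, plus one possible relaxation 
> (not necessarily the committed fix):
> {code:java}
> import java.util.regex.Pattern;
> 
> class RegexDemo {
>   public static void main(String[] args) {
>     String prefix = "mapreduce.reduce.resource.";
>     String key = prefix + "yarn.io/gpu";
>     // "[^.]+" rejects any resource name containing a dot:
>     Pattern strict = Pattern.compile("^" + Pattern.quote(prefix) + "[^.]+$");
>     System.out.println(strict.matcher(key).matches());   // false
>     // Allowing dots after the quoted prefix matches yarn.io/gpu again:
>     Pattern relaxed = Pattern.compile("^" + Pattern.quote(prefix) + ".+$");
>     System.out.println(relaxed.matcher(key).matches());  // true
>   }
> }
> {code}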



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9444) YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize yarn.io/gpu as a valid resource

2019-11-21 Thread Gergely Pollak (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Pollak updated YARN-9444:
-
Attachment: YARN-9444.002.patch

> YARN API ResourceUtils's getRequestedResourcesFromConfig doesn't recognize 
> yarn.io/gpu as a valid resource
> --
>
> Key: YARN-9444
> URL: https://issues.apache.org/jira/browse/YARN-9444
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Minor
> Attachments: YARN-9444.001.patch, YARN-9444.002.patch
>
>
> The original issue was that the jobclient test did not send the requested 
> resource type when it was specified on the command line, e.g.:
> {code:java}
> hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep 
> -Dmapreduce.reduce.resource.yarn.io/gpu=1  -m 10 -r 1 -mt 9
> {code}
> After some investigation, it turned out that this only affects resource types 
> whose names contain '.' characters, and that the root cause is the regexp in 
> the getRequestedResourcesFromConfig method.
> {code:java}
> "^" + Pattern.quote(prefix) + "[^.]+$"
> {code}
> This regexp explicitly forbids any dots in the resource type name, which is 
> inconsistent with the default resource type for gpu and fpga, which are 
> yarn.io/gpu and yarn.io/fpga respectively.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9879) Allow multiple leaf queues with the same name in CS

2019-11-21 Thread Gergely Pollak (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Pollak updated YARN-9879:
-
Attachment: DesignDoc_v1.pdf

> Allow multiple leaf queues with the same name in CS
> ---
>
> Key: YARN-9879
> URL: https://issues.apache.org/jira/browse/YARN-9879
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: DesignDoc_v1.pdf
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> A design doc and first proposal are being written; I'll attach them as soon 
> as they're done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9968) Public Localizer is exiting in NodeManager due to NullPointerException

2019-11-21 Thread Tarun Parimi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979324#comment-16979324
 ] 

Tarun Parimi commented on YARN-9968:


[~snemeth], please review this when you get time. 

> Public Localizer is exiting in NodeManager due to NullPointerException
> --
>
> Key: YARN-9968
> URL: https://issues.apache.org/jira/browse/YARN-9968
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
> Attachments: YARN-9968.001.patch
>
>
> The Public Localizer is encountering a NullPointerException and exiting.
> {code:java}
> ERROR localizer.ResourceLocalizationService 
> (ResourceLocalizationService.java:run(995)) - Error: Shutting down
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:981)
> INFO  localizer.ResourceLocalizationService 
> (ResourceLocalizationService.java:run(997)) - Public cache exiting
> {code}
> The NodeManager itself keeps running, but subsequent localization events for
> containers keep encountering the error below, resulting in failed
> localization of all new containers.
> {code:java}
> ERROR localizer.ResourceLocalizationService 
> (ResourceLocalizationService.java:addResource(920)) - Failed to submit rsrc { 
> { hdfs://namespace/raw/user/.staging/job/conf.xml 1572071824603, FILE, null 
> },pending,[(container_e30_1571858463080_48304_01_000134)],12513553420029113,FAILED}
>  for download. Either queue is full or threadpool is shutdown.
> java.util.concurrent.RejectedExecutionException: Task 
> java.util.concurrent.ExecutorCompletionService$QueueingFuture@55c7fa21 
> rejected from 
> org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@46067edd[Terminated,
>  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 
> 382286]
> at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
> at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
> at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
> at 
> java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:181)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:899)
> {code}
> When this happens, the NodeManager becomes usable only after a restart.
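> 
> For illustration, a minimal, self-contained sketch (not the NM code) of why
> every later submission is rejected once the localizer's thread pool has
> terminated:
> {code:java}
> import java.util.concurrent.ExecutorCompletionService;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.RejectedExecutionException;
> 
> public class RejectedSubmitDemo {
>   public static void main(String[] args) {
>     ExecutorService pool = Executors.newFixedThreadPool(2);
>     ExecutorCompletionService<Void> queue = new ExecutorCompletionService<>(pool);
>     // Analogous to the localizer pool dying after the NullPointerException:
>     pool.shutdownNow();
>     try {
>       queue.submit(() -> null); // every submission after shutdown is rejected
>     } catch (RejectedExecutionException e) {
>       // Mirrors the "Either queue is full or threadpool is shutdown" error above.
>       System.err.println("Task rejected: pool already shut down");
>     }
>   }
> }
> {code}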



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-21 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979322#comment-16979322
 ] 

Szilard Nemeth edited comment on YARN-9052 at 11/21/19 2:32 PM:


Hi [~sunilg]!
You previously commented that you would help review this. Now that [~shuzirra]
has given a +1, it's time.
I would really appreciate it if we could merge this one! Thanks in advance!

cc Wangda Tan, Weiwei Yang, Rohith Sharma K S


was (Author: snemeth):
Hi [~sunilg]!
You previously commented that you would help review this. Now that [~shuzirra]
has given a +1, it's time.
I would really appreciate it if we could merge this one! Thanks in advance!

cc Wangda Tan, Weiwei Yang, Rohith Sharma K S

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: 
> YARN-9052-004withlogs-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs003-justfailed.txt, 
> YARN-9052-testlogs003-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs004-justfailed.txt, YARN-9052.001.patch, 
> YARN-9052.002.patch, YARN-9052.003.patch, YARN-9052.004.patch, 
> YARN-9052.004.withlogs.patch, YARN-9052.005.patch, YARN-9052.006.patch, 
> YARN-9052.007.patch, YARN-9052.testlogs.002.patch, 
> YARN-9052.testlogs.002.patch, YARN-9052.testlogs.003.patch, 
> YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them taking more than an
> acceptable number of parameters, ranging from 2 to as many as 22, which
> makes the code completely unreadable.
> On top of the unreadability, it's very hard to follow which RmApp will be
> produced for a test, as tests often pass a lot of empty / null values as
> parameters.
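> 
> For illustration, a minimal sketch of the builder approach (all names here
> are hypothetical, not the actual MockRM API):
> {code:java}
> // Hypothetical sketch: defaults replace the long tails of null/empty parameters.
> public final class AppSubmissionParams {
>   final String name;
>   final String queue;
>   final int memoryMb;
> 
>   private AppSubmissionParams(Builder b) {
>     name = b.name;
>     queue = b.queue;
>     memoryMb = b.memoryMb;
>   }
> 
>   public static final class Builder {
>     private String name = "app";
>     private String queue = "default";
>     private int memoryMb = 1024;
> 
>     public Builder name(String name) { this.name = name; return this; }
>     public Builder queue(String queue) { this.queue = queue; return this; }
>     public Builder memoryMb(int mb) { this.memoryMb = mb; return this; }
>     public AppSubmissionParams build() { return new AppSubmissionParams(this); }
>   }
> }
> 
> // Usage: a test spells out only the values it cares about, e.g.:
> //   new AppSubmissionParams.Builder().queue("root.test").memoryMb(2048).build();
> {code}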



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-21 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979322#comment-16979322
 ] 

Szilard Nemeth commented on YARN-9052:
--

Hi [~sunilg]!
You previously commented that you would help review this. Now that [~shuzirra]
has given a +1, it's time.
I would really appreciate it if we could merge this one! Thanks in advance!

cc Wangda Tan, Weiwei Yang, Rohith Sharma K S

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: 
> YARN-9052-004withlogs-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs003-justfailed.txt, 
> YARN-9052-testlogs003-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs004-justfailed.txt, YARN-9052.001.patch, 
> YARN-9052.002.patch, YARN-9052.003.patch, YARN-9052.004.patch, 
> YARN-9052.004.withlogs.patch, YARN-9052.005.patch, YARN-9052.006.patch, 
> YARN-9052.007.patch, YARN-9052.testlogs.002.patch, 
> YARN-9052.testlogs.002.patch, YARN-9052.testlogs.003.patch, 
> YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them taking more than an
> acceptable number of parameters, ranging from 2 to as many as 22, which
> makes the code completely unreadable.
> On top of the unreadability, it's very hard to follow which RmApp will be
> produced for a test, as tests often pass a lot of empty / null values as
> parameters.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-21 Thread Gergely Pollak (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979319#comment-16979319
 ] 

Gergely Pollak commented on YARN-9052:
--

Hi [~snemeth], thank you for the changes. I like your inner-class solution much
better than my original "create a new method for PrivilegedExceptionAction"
suggestion.

LGTM, +1 (non-binding).

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: 
> YARN-9052-004withlogs-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs003-justfailed.txt, 
> YARN-9052-testlogs003-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs004-justfailed.txt, YARN-9052.001.patch, 
> YARN-9052.002.patch, YARN-9052.003.patch, YARN-9052.004.patch, 
> YARN-9052.004.withlogs.patch, YARN-9052.005.patch, YARN-9052.006.patch, 
> YARN-9052.007.patch, YARN-9052.testlogs.002.patch, 
> YARN-9052.testlogs.002.patch, YARN-9052.testlogs.003.patch, 
> YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them taking more than an
> acceptable number of parameters, ranging from 2 to as many as 22, which
> makes the code completely unreadable.
> On top of the unreadability, it's very hard to follow which RmApp will be
> produced for a test, as tests often pass a lot of empty / null values as
> parameters.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4766) NM should not aggregate logs older than the retention policy

2019-11-21 Thread Hu Ziqian (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979115#comment-16979115
 ] 

Hu Ziqian commented on YARN-4766:
-

Hi [~haibochen], could you tell me what problem this issue tries to solve? I
don't understand why the APPLICATION_LOG_HANDLING_INITED timestamp should
decide whether a log is uploaded. That timestamp is the time an app starts on
a NM, but I think whether a log should be uploaded depends on the finish time
of the app.

In our cluster, we find that if an app runs longer than
yarn.log-aggregation.retain-seconds and we restart the NM while the app is
running, its logs are not uploaded.

Our problem may also be related to YARN-5094, which sets the event's timestamp
to System.currentTimeMillis.

> NM should not aggregate logs older than the retention policy
> 
>
> Key: YARN-4766
> URL: https://issues.apache.org/jira/browse/YARN-4766
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha1
>
> Attachments: yarn4766.001.patch, yarn4766.002.patch, 
> yarn4766.003.patch, yarn4766.004.patch, yarn4766.005.patch, yarn4766.006.patch
>
>
> When log aggregation fails on the NM, the information for the attempt is kept
> in the recovery DB. Log aggregation can fail for multiple reasons, often
> related to HDFS space or permissions.
> On restart the recovery DB is read, and if an application attempt needs its
> logs aggregated, the files are scheduled for aggregation without any checks.
> The log files could be older than the retention limit, in which case we
> should not aggregate them but immediately mark them for deletion from the
> local file system.
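> 
> For illustration, a minimal sketch of the check described above (the helper
> names are hypothetical, not the NM implementation):
> {code:java}
> import java.io.File;
> import java.util.Arrays;
> import java.util.List;
> import java.util.concurrent.TimeUnit;
> 
> public class RetentionCheckDemo {
>   // Compare each recovered log against the retention cutoff before scheduling it.
>   static void scheduleRecoveredLogs(List<File> logs, long retainSeconds) {
>     long cutoff = System.currentTimeMillis() - TimeUnit.SECONDS.toMillis(retainSeconds);
>     for (File log : logs) {
>       if (log.lastModified() < cutoff) {
>         System.out.println("past retention, delete locally: " + log);
>       } else {
>         System.out.println("schedule for aggregation: " + log);
>       }
>     }
>   }
> 
>   public static void main(String[] args) {
>     scheduleRecoveredLogs(Arrays.asList(new File("/tmp/app.log")),
>         TimeUnit.DAYS.toSeconds(7));
>   }
> }
> {code}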



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9899) Migration tool that help to generate CS config based on FS config [Phase 2]

2019-11-21 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16979065#comment-16979065
 ] 

Peter Bacsko commented on YARN-9899:


[~snemeth] I just tested the patch on a real Hadoop cluster. Seems to be 
working fine.

One thing I'd like to change is this:
{code:java}
private void logAndStdErr(Exception e, String msg) {
  LOG.error(msg, e);
  System.err.println(msg);
}
{code}

We log the stack trace in every case. As we discussed, errors that we can 
detect and handle don't need it. I'd replace it with this:

{code:java}
private void logAndStdErr(Exception e, String msg) {
  LOG.debug("Stack trace", e);
  LOG.error(msg);
  System.err.println(msg);
}
{code}

So we only print the trace if debug logging is enabled.

> Migration tool that help to generate CS config based on FS config [Phase 2] 
> 
>
> Key: YARN-9899
> URL: https://issues.apache.org/jira/browse/YARN-9899
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9899-001.patch, YARN-9899-002.patch, 
> YARN-9899-003.patch, YARN-9899-004.patch, YARN-9899-005.patch, 
> YARN-9899-006.patch
>
>
> YARN-9699 laid down the groundwork for a converter from FS to CS config.
> During the development of the converter, we came up with the following things
> to fix.
> 1. If we don't specify a mandatory option, we get this stacktrace, for
> example:
>  
> {code:java}
> org.apache.commons.cli.MissingOptionException: Missing required option: o
>  at org.apache.commons.cli.Parser.checkRequiredOptions(Parser.java:299)
>  at org.apache.commons.cli.Parser.parse(Parser.java:231)
>  at org.apache.commons.cli.Parser.parse(Parser.java:85)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigArgumentHandler.parseAndConvert(FSConfigToCSConfigArgumentHandler.java:100)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1572){code}
>  
> We should provide a more concise and meaningful error message (no stacktrace
> on the CLI; the exception with its stacktrace should go to the RM log).
> An explanation of the missing option is also required.
> 2. We may think about how to handle exceptions from commons CLI: 
> MissingArgumentException vs. MissingOptionException
> 3. We need to provide a -h / --help option for the CLI that prints all the
> possible options / arguments (see the sketch after this list).
> 4. Last but not least: We should move the CLI command to a more reasonable 
> place:
> As YARN-9699 implemented it, the command can be invoked like: 
> {code:java}
> /opt/hadoop/bin/yarn resourcemanager -convert-fs-configuration -y 
> /opt/hadoop/etc/hadoop/yarn-site.xml -f 
> /opt/hadoop/etc/hadoop/fair-scheduler.xml -r 
> ~systest/sample-rules-config.properties -o /tmp/fs-cs-output
> {code}
> This is problematic: if the YARN RM is already running, we need to stop it in
> order to start the RM again with the conversion switch.
> 5. Add unit test coverage for {{QueuePlacementConverter}}
> 6. Close some feature gaps.
>  
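> For illustration, a minimal commons-cli sketch of the friendlier error
> handling and -h option described in items 1-3 (the option names are
> hypothetical, not the converter's actual flags):
> {code:java}
> import org.apache.commons.cli.*;
> 
> public class CliErrorHandlingDemo {
>   public static void main(String[] args) {
>     Options options = new Options();
>     options.addOption(Option.builder("o").longOpt("output").hasArg()
>         .required().desc("output directory").build());
>     options.addOption("h", "help", false, "print this help");
>     HelpFormatter help = new HelpFormatter();
>     try {
>       CommandLine cmd = new DefaultParser().parse(options, args);
>       System.out.println("output = " + cmd.getOptionValue("o"));
>     } catch (MissingOptionException e) {
>       // Concise message instead of a raw stacktrace on the CLI.
>       System.err.println("Missing required option(s): " + e.getMissingOptions());
>       help.printHelp("converter", options);
>     } catch (ParseException e) {
>       System.err.println(e.getMessage());
>       help.printHelp("converter", options);
>     }
>   }
> }
> {code}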



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org