[jira] [Commented] (YARN-9440) Improve diagnostics for scheduler and app activities

2019-04-09 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814127#comment-16814127
 ] 

Tao Yang commented on YARN-9440:


Attached v1 patch for review.

Key changes in this patch:
 * basic
 ** Add the interface DiagnosticsCollector and its implementations 
ResourceDiagnosticsCollector/PlacementConstraintDiagnosticsCollector to collect 
resource and placement-constraint (PC) diagnostics (a rough sketch follows 
after this list).
 ** Add overloaded methods computeAvailableContainers/fitsIn to the interface 
ResourceCalculator and its implementations 
DefaultResourceCalculator/DominantResourceCalculator, and an overloaded fitsIn 
to the util class Resources, to support collecting resource diagnostics.
 ** Add AppRequestAllocationInfo and related updates in 
ActivitiesLogger/ActivityNode/AllocationActivity/AppAllocation/NodeAllocation/ActivitiesInfo/ActivityNodeInfo/AppActivitiesInfo/AppAllocationInfo
 to adjust the data structure of activities, such as (1) adding a request level 
in scheduler/app activities and (2) showing property fields at the 
app/request/container level.
 * for the scheduling process
 ** Add the static class DiagnosticsCollectorManager and related logic in 
ActivitiesManager to manage collectors and enable them only when necessary, to 
avoid adding unnecessary overhead to the scheduler.
 ** Update 
ActivityDiagnosticConstant/LeafQueue/AppSchedulingInfo/RegularContainerAllocator/AppPlacementAllocator/LocalityAppPlacementAllocator/SingleConstraintAppPlacementAllocator/PlacementConstraintsUtil
 to support collecting resource/PC diagnostics and to improve the diagnostics 
in the scheduling process.
 * UT
 ** Add ActivitiesTestUtils to maintain common check functions for testing 
activities.
 ** Add UTs in TestResourceCalculator to test collecting resource diagnostics 
for the resource calculators.
 ** Add UTs in 
TestRMWebServicesSchedulerActivities/TestRMWebServicesSchedulerActivitiesWithMultiNodesEnabled
 to verify the changes to scheduler/app activities.
 ** Update UTs in TestActivitiesManager to adapt to these changes.
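
For reviewers who want a feel for the shape of the change, a rough sketch of 
the collector idea is shown below. Everything beyond the class names already 
listed (method names, signatures, the Optional parameter) is an illustrative 
assumption, not the content of YARN-9440.001.patch.

{code:java}
// Hypothetical sketch only: the collector contract plus an overloaded fitsIn
// that reports which resource dimension did not fit. Names and signatures are
// assumptions for illustration.
import java.util.Optional;
import org.apache.hadoop.yarn.api.records.Resource;

interface DiagnosticsCollector {
  void collect(String diagnostics, String details);
}

class ResourceDiagnosticsCollector implements DiagnosticsCollector {
  private final StringBuilder buffer = new StringBuilder();

  @Override
  public void collect(String diagnostics, String details) {
    buffer.append(diagnostics).append(" (").append(details).append("); ");
  }

  public String getDiagnostics() {
    return buffer.toString();
  }
}

final class ResourceChecks {
  private ResourceChecks() {
  }

  // Behaves like a plain fitsIn check, but when a collector is supplied it
  // records which resource dimension did not fit and by how much.
  static boolean fitsIn(Resource smaller, Resource bigger,
      Optional<DiagnosticsCollector> collector) {
    boolean fits = true;
    if (smaller.getMemorySize() > bigger.getMemorySize()) {
      collector.ifPresent(c -> c.collect("insufficient memory",
          smaller.getMemorySize() + " > " + bigger.getMemorySize()));
      fits = false;
    }
    if (smaller.getVirtualCores() > bigger.getVirtualCores()) {
      collector.ifPresent(c -> c.collect("insufficient vcores",
          smaller.getVirtualCores() + " > " + bigger.getVirtualCores()));
      fits = false;
    }
    return fits;
  }
}
{code}

When no collector is supplied the check behaves as before, which matches the 
goal of enabling diagnostics collection only when necessary.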

> Improve diagnostics for scheduler and app activities
> 
>
> Key: YARN-9440
> URL: https://issues.apache.org/jira/browse/YARN-9440
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9440.001.patch
>
>
> [Design 
> doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.cyw6zeehzqmx]
>  






[jira] [Updated] (YARN-9440) Improve diagnostics for scheduler and app activities

2019-04-09 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-9440:
---
Attachment: YARN-9440.001.patch

> Improve diagnostics for scheduler and app activities
> 
>
> Key: YARN-9440
> URL: https://issues.apache.org/jira/browse/YARN-9440
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9440.001.patch
>
>
> [Design 
> doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.cyw6zeehzqmx]
>  






[jira] [Updated] (YARN-9468) Fix inaccurate documentations in Placement Constraints

2019-04-09 Thread Weiwei Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-9468:
--
Summary: Fix inaccurate documentations in Placement Constraints  (was: 
Document Placement Constraints)

> Fix inaccurate documentations in Placement Constraints
> --
>
> Key: YARN-9468
> URL: https://issues.apache.org/jira/browse/YARN-9468
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: hunshenshi
>Priority: Major
>
> Document Placement Constraints
> *First* 
> {code:java}
> zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3{code}
>  * place 5 containers with tag “hbase” with affinity to a rack on which 
> containers with tag “zk” are running (i.e., an “hbase” container 
> should{color:#ff} not{color} be placed at a rack where an “zk” container 
> is running, given that “zk” is the TargetTag of the second constraint);
> The word _*not*_ in the brackets should be deleted.
>  
> *Second*
> {code:java}
> PlacementSpec => "" | KeyVal;PlacementSpec
> {code}
> The semicolon should be replaced by a colon.
>  






[jira] [Commented] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable

2019-04-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814114#comment-16814114
 ] 

Hadoop QA commented on YARN-6929:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
2s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 416 unchanged - 12 fixed = 416 total (was 428) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 35s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
54s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
57s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m  
0s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}106m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-6929 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965411/YARN-6929-009.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  

[jira] [Created] (YARN-9468) Document Placement Constraints

2019-04-09 Thread hunshenshi (JIRA)
hunshenshi created YARN-9468:


 Summary: Document Placement Constraints
 Key: YARN-9468
 URL: https://issues.apache.org/jira/browse/YARN-9468
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 3.2.0
Reporter: hunshenshi


Document Placement Constraints

*First* 

 
{code:java}
zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3{code}
 
 * place 5 containers with tag “hbase” with affinity to a rack on which 
containers with tag “zk” are running (i.e., an “hbase” container 
should{color:#FF} not{color} be placed at a rack where an “zk” container is 
running, given that “zk” is the TargetTag of the second constraint);

 

The word _*not*_ in the brackets should be deleted.

 

*Second*
{code:java}
PlacementSpec => "" | KeyVal;PlacementSpec
{code}
The semicolon should be replaced by a colon, as shown below.
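
For reference, the corrected production, consistent with the example spec above 
in which constraints are separated by colons, would read:
{code:java}
PlacementSpec => "" | KeyVal:PlacementSpec
{code}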

 






[jira] [Updated] (YARN-9468) Document Placement Constraints

2019-04-09 Thread hunshenshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hunshenshi updated YARN-9468:
-
Description: 
Document Placement Constraints

*First* 
{code:java}
zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3{code}
 
 * place 5 containers with tag “hbase” with affinity to a rack on which 
containers with tag “zk” are running (i.e., an “hbase” container 
should{color:#ff} not{color} be placed at a rack where an “zk” container is 
running, given that “zk” is the TargetTag of the second constraint);

The word _*not*_ in the brackets should be deleted.

 

*Second*
{code:java}
PlacementSpec => "" | KeyVal;PlacementSpec
{code}
The semicolon should be replaced by a colon.

 

  was:
Document Placement Constraints

*First* 

 
{code:java}
zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3{code}
 
 * place 5 containers with tag “hbase” with affinity to a rack on which 
containers with tag “zk” are running (i.e., an “hbase” container 
should{color:#FF} not{color} be placed at a rack where an “zk” container is 
running, given that “zk” is the TargetTag of the second constraint);

 

The word _*not*_ in the brackets should be deleted.

 

*Second*
{code:java}
PlacementSpec => "" | KeyVal;PlacementSpec
{code}
The semicolon should be replaced by a colon.

 


> Document Placement Constraints
> --
>
> Key: YARN-9468
> URL: https://issues.apache.org/jira/browse/YARN-9468
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: hunshenshi
>Priority: Major
>
> Document Placement Constraints
> *First* 
> {code:java}
> zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3{code}
>  
>  * place 5 containers with tag “hbase” with affinity to a rack on which 
> containers with tag “zk” are running (i.e., an “hbase” container 
> should{color:#ff} not{color} be placed at a rack where an “zk” container 
> is running, given that “zk” is the TargetTag of the second constraint);
> The word _*not*_ in the brackets should be deleted.
>  
> *Second*
> {code:java}
> PlacementSpec => "" | KeyVal;PlacementSpec
> {code}
> The semicolon should be replaced by a colon.
>  






[jira] [Updated] (YARN-9468) Document Placement Constraints

2019-04-09 Thread hunshenshi (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hunshenshi updated YARN-9468:
-
Description: 
Document Placement Constraints

*First* 
{code:java}
zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3{code}
 * place 5 containers with tag “hbase” with affinity to a rack on which 
containers with tag “zk” are running (i.e., an “hbase” container 
should{color:#ff} not{color} be placed at a rack where an “zk” container is 
running, given that “zk” is the TargetTag of the second constraint);

The word _*not*_ in the brackets should be deleted.

 

*Second*
{code:java}
PlacementSpec => "" | KeyVal;PlacementSpec
{code}
The semicolon should be replaced by a colon.

 

  was:
Document Placement Constraints

*First* 
{code:java}
zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3{code}
 
 * place 5 containers with tag “hbase” with affinity to a rack on which 
containers with tag “zk” are running (i.e., an “hbase” container 
should{color:#ff} not{color} be placed at a rack where an “zk” container is 
running, given that “zk” is the TargetTag of the second constraint);

The word _*not*_ in the brackets should be deleted.

 

*Second*
{code:java}
PlacementSpec => "" | KeyVal;PlacementSpec
{code}
The semicolon should be replaced by a colon.

 


> Document Placement Constraints
> --
>
> Key: YARN-9468
> URL: https://issues.apache.org/jira/browse/YARN-9468
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: hunshenshi
>Priority: Major
>
> Document Placement Constraints
> *First* 
> {code:java}
> zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3{code}
>  * place 5 containers with tag “hbase” with affinity to a rack on which 
> containers with tag “zk” are running (i.e., an “hbase” container 
> should{color:#ff} not{color} be placed at a rack where an “zk” container 
> is running, given that “zk” is the TargetTag of the second constraint);
> The word _*not*_ in the brackets should be deleted.
>  
> *Second*
> {code:java}
> PlacementSpec => "" | KeyVal;PlacementSpec
> {code}
> The semicolon should be replaced by a colon.
>  






[jira] [Updated] (YARN-9440) Improve diagnostics for scheduler and app activities

2019-04-09 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-9440:
---
Attachment: (was: YARN-9440.001.patch)

> Improve diagnostics for scheduler and app activities
> 
>
> Key: YARN-9440
> URL: https://issues.apache.org/jira/browse/YARN-9440
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
>
> [Design 
> doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.cyw6zeehzqmx]
>  






[jira] [Updated] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable

2019-04-09 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-6929:

Attachment: YARN-6929-009.patch

> yarn.nodemanager.remote-app-log-dir structure is not scalable
> -
>
> Key: YARN-6929
> URL: https://issues.apache.org/jira/browse/YARN-6929
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-6929-007.patch, YARN-6929-008.patch, 
> YARN-6929-009.patch, YARN-6929.1.patch, YARN-6929.2.patch, YARN-6929.2.patch, 
> YARN-6929.3.patch, YARN-6929.4.patch, YARN-6929.5.patch, YARN-6929.6.patch, 
> YARN-6929.patch
>
>
> The current directory structure for yarn.nodemanager.remote-app-log-dir is 
> not scalable. The maximum subdirectory limit is 1048576 by default (HDFS-6102). 
> With a yarn.log-aggregation.retain-seconds retention of 7 days, there is a good 
> chance that LogAggregationService fails to create a new directory with 
> FSLimitException$MaxDirectoryItemsExceededException.
> The current structure is <remote-app-log-dir>/<user>/logs/<applicationId>. This 
> can be improved by adding the date as a subdirectory, like 
> <remote-app-log-dir>/<user>/logs/<date>/<applicationId>.
> {code}
> WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService:
>  Application failed to init aggregation 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
>  
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
>  
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>  
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>  
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
>  
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>  
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) 
> at java.lang.Thread.run(Thread.java:745) 
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072)
>  
> at 
> org.apache.ha

[jira] [Commented] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable

2019-04-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814033#comment-16814033
 ] 

Prabhu Joseph commented on YARN-6929:
-

The build failure is due to YARN-999 and is fixed by YARN-999.addendum.patch. 
Will resubmit the patch.

> yarn.nodemanager.remote-app-log-dir structure is not scalable
> -
>
> Key: YARN-6929
> URL: https://issues.apache.org/jira/browse/YARN-6929
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-6929-007.patch, YARN-6929-008.patch, 
> YARN-6929.1.patch, YARN-6929.2.patch, YARN-6929.2.patch, YARN-6929.3.patch, 
> YARN-6929.4.patch, YARN-6929.5.patch, YARN-6929.6.patch, YARN-6929.patch
>
>
> The current directory structure for yarn.nodemanager.remote-app-log-dir is 
> not scalable. The maximum subdirectory limit is 1048576 by default (HDFS-6102). 
> With a yarn.log-aggregation.retain-seconds retention of 7 days, there is a good 
> chance that LogAggregationService fails to create a new directory with 
> FSLimitException$MaxDirectoryItemsExceededException.
> The current structure is <remote-app-log-dir>/<user>/logs/<applicationId>. This 
> can be improved by adding the date as a subdirectory, like 
> <remote-app-log-dir>/<user>/logs/<date>/<applicationId>.
> {code}
> WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService:
>  Application failed to init aggregation 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
>  
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
>  
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>  
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>  
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
>  
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>  
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) 
> at java.lang.Thread.run(Thread.java:745) 
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apach

[jira] [Commented] (YARN-9464) Support "Pending Resource" metrics in RM's RESTful API

2019-04-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814031#comment-16814031
 ] 

Prabhu Joseph commented on YARN-9464:
-

[~tangzhankun] The failed test cases are not related; I have reported YARN-9467 
and YARN-9325 to fix the intermittent test case failures. The Findbugs warnings 
are also unrelated. I have fixed the checkstyle issues.


> Support "Pending Resource" metrics in RM's RESTful API
> --
>
> Key: YARN-9464
> URL: https://issues.apache.org/jira/browse/YARN-9464
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9464-001.patch, YARN-9464-002.patch
>
>
> Knowing only the "available" and "used" resources is not enough for YARN 
> management tools like an auto-scaler. It would be helpful for diagnosing 
> cluster resource utilization if they could get "Pending Resource" from the RM 
> RESTful APIs. To a certain extent, it represents how starved the applications 
> are.
> Initially, we can add "pending resource" information in the two RM REST APIs 
> below:
> {code:java}
> RMnode:port/ws/v1/cluster/metrics
> RMnode:port/ws/v1/cluster/nodes
> {code}
>  
>  
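
As a rough illustration of how a management tool might consume this, here is a 
minimal sketch that polls the existing cluster metrics endpoint; the RM address 
and the name of the eventual pending-resource field are assumptions, since the 
actual field names are defined by the patch.

{code:java}
// Sketch only: polls the existing cluster metrics REST API. The RM address is
// hypothetical, and the exact JSON field added for pending resources is an
// assumption -- today the payload exposes fields such as availableMB and
// allocatedMB.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PendingResourceProbe {
  public static void main(String[] args) throws Exception {
    String rm = args.length > 0 ? args[0] : "http://rm-host:8088";
    HttpRequest request = HttpRequest.newBuilder(
        URI.create(rm + "/ws/v1/cluster/metrics")).GET().build();
    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());
    // An auto-scaler would parse the JSON body here and read the pending
    // figure alongside the existing available/used numbers.
    System.out.println(response.body());
  }
}
{code}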






[jira] [Updated] (YARN-9464) Support "Pending Resource" metrics in RM's RESTful API

2019-04-09 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9464:

Attachment: YARN-9464-002.patch

> Support "Pending Resource" metrics in RM's RESTful API
> --
>
> Key: YARN-9464
> URL: https://issues.apache.org/jira/browse/YARN-9464
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9464-001.patch, YARN-9464-002.patch
>
>
> Knowing only the "available" and "used" resources is not enough for YARN 
> management tools like an auto-scaler. It would be helpful for diagnosing 
> cluster resource utilization if they could get "Pending Resource" from the RM 
> RESTful APIs. To a certain extent, it represents how starved the applications 
> are.
> Initially, we can add "pending resource" information in the two RM REST APIs 
> below:
> {code:java}
> RMnode:port/ws/v1/cluster/metrics
> RMnode:port/ws/v1/cluster/nodes
> {code}
>  
>  






[jira] [Created] (YARN-9467) TestCapacitySchedulerNodeLabelUpdate.testResourceUsageWhenNodeUpdatesPartition fails intermittent

2019-04-09 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9467:
---

 Summary: 
TestCapacitySchedulerNodeLabelUpdate.testResourceUsageWhenNodeUpdatesPartition 
fails intermittent
 Key: YARN-9467
 URL: https://issues.apache.org/jira/browse/YARN-9467
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler, test
Affects Versions: 3.2.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


TestCapacitySchedulerNodeLabelUpdate.testResourceUsageWhenNodeUpdatesPartition 
fails intermittently (observed in YARN-9464).

{code:java}
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate.checkUserUsedResource(TestCapacitySchedulerNodeLabelUpdate.java:191)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate.testResourceUsageWhenNodeUpdatesPartition(TestCapacitySchedulerNodeLabelUpdate.java:410)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418){code}






[jira] [Updated] (YARN-9440) Improve diagnostics for scheduler and app activities

2019-04-09 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-9440:
---
Attachment: YARN-9440.001.patch

> Improve diagnostics for scheduler and app activities
> 
>
> Key: YARN-9440
> URL: https://issues.apache.org/jira/browse/YARN-9440
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9440.001.patch
>
>
> [Design 
> doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.cyw6zeehzqmx]
>  






[jira] [Commented] (YARN-9379) Can't specify docker runtime through environment

2019-04-09 Thread caozhiqiang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814000#comment-16814000
 ] 

caozhiqiang commented on YARN-9379:
---

OK, your suggestion is very good, and I will follow it. Thank you!

> Can't specify docker runtime through environment
> 
>
> Key: YARN-9379
> URL: https://issues.apache.org/jira/browse/YARN-9379
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.3.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Minor
> Attachments: YARN-9379-branch-3.2.0.001.patch, YARN-9379.002.patch, 
> YARN-9379.003.patch, YARN-9379.004.patch
>
>
> When using Docker to run YARN containers, even though docker.allowed.runtimes 
> exists in container-executor.cfg, there is no parameter to specify the Docker 
> runtime, such as gVisor, lxc or Kata. With this patch, the client can add a 
> parameter such as 
> -Dyarn.app.mapreduce.am.env.YARN_CONTAINER_RUNTIME_DOCKER_RUNTIME=runsc 
> to specify the Docker runtime.






[jira] [Commented] (YARN-9439) Support asynchronized scheduling mode and multi-node lookup mechanism for app activities

2019-04-09 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813997#comment-16813997
 ] 

Tao Yang commented on YARN-9439:


Hi, [~cheersyang].
The Findbugs warnings are not related to this patch. Findbugs does not seem to 
expect a null input for SettableFuture#set even though the parameter is declared 
as Nullable; perhaps Findbugs supports javax.annotation.Nullable but not 
org.checkerframework.checker.nullness.qual.Nullable. I think we can exclude 
these warnings in findbugs-exclude.xml. Can you share your thoughts on this? A 
minimal illustration of the flagged pattern is shown below.
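
For context, here is a rough sketch of the call pattern in question, assuming a 
Guava SettableFuture<Void>; the class and names here are illustrative only, not 
code from the patch.

{code:java}
// Illustrative only: a Void future is completed with an explicit null. Guava
// permits this because SettableFuture#set takes a @Nullable value, yet Findbugs
// flags it as passing null to a parameter it considers non-null.
import com.google.common.util.concurrent.SettableFuture;

public class NullableSetIllustration {
  public static void main(String[] args) {
    SettableFuture<Void> done = SettableFuture.create();
    done.set(null);
    System.out.println("completed: " + done.isDone());
  }
}
{code}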

> Support asynchronized scheduling mode and multi-node lookup mechanism for app 
> activities
> 
>
> Key: YARN-9439
> URL: https://issues.apache.org/jira/browse/YARN-9439
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9439.001.patch, YARN-9439.002.patch, 
> YARN-9439.003.patch
>
>
> [Design 
> doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m051gyiikx7c]
>  






[jira] [Commented] (YARN-9466) App catalog navigation stylesheet does not display correctly in Safari

2019-04-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813920#comment-16813920
 ] 

Hadoop QA commented on YARN-9466:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
28m 59s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
56s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 52m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9466 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965375/YARN-9466.001.patch |
| Optional Tests |  dupname  asflicense  shadedclient  |
| uname | Linux 85182118401c 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 25c421b |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 443 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-catalog/hadoop-yarn-applications-catalog-webapp
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23925/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> App catalog navigation stylesheet does not display correctly in Safari
> --
>
> Key: YARN-9466
> URL: https://issues.apache.org/jira/browse/YARN-9466
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9466.001.patch, catalog-chrome.png, 
> catalog-safari.png
>
>
> When the navigation sidebar has less content than the table on the right, the 
> navigation bar will shrink to a smaller size in Safari. See the attached 
> screenshots for the problem and the desired look.






[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813900#comment-16813900
 ] 

Hudson commented on YARN-999:
-

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16369 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16369/])
YARN-999. In case of long running tasks, reduce node resource should (gifuma: 
rev 358e9286223029ba28f5afe0f7433d95a735b78f)
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java


> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch, 
> YARN-999.addendum.patch
>
>
> In the current design and implementation, when we decrease a node's resources 
> to less than the resource consumption of the currently running tasks, the 
> tasks can still run until the end; no new task gets assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is good for most cases, but for a 
> long-running task it could be too slow for the resource change to actually 
> take effect, so preemption could be used here.






[jira] [Commented] (YARN-9463) Add queueName info when failing with queue capacity sanity check

2019-04-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813843#comment-16813843
 ] 

Hadoop QA commented on YARN-9463:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 17m 
42s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
16s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 52s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}126m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestResourceTrackerService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9463 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965360/YARN-9463.1.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0f9c7f7d44ac 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cfec455 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/23923/artifact/out/branch-mvninstall-root.txt
 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/23923/artifact/out/branch-findbugs-hadoop-yarn-project

[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813826#comment-16813826
 ] 

Giovanni Matteo Fumarola commented on YARN-999:
---

Committed the addendum to trunk. Thanks [~ste...@apache.org] for raising the 
issue and [~elgoiri] for fixing it.

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch, 
> YARN-999.addendum.patch
>
>
> In the current design and implementation, when we decrease a node's resources 
> to less than the resource consumption of the currently running tasks, the 
> tasks can still run until the end; no new task gets assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is good for most cases, but for a 
> long-running task it could be too slow for the resource change to actually 
> take effect, so preemption could be used here.






[jira] [Commented] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable

2019-04-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813822#comment-16813822
 ] 

Hadoop QA commented on YARN-6929:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 16m 
24s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
10s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 416 unchanged - 12 fixed = 416 total (was 428) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
52s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 
44s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
58s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-6929 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965359/YARN-6929-008.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle 

[jira] [Updated] (YARN-9466) App catalog navigation stylesheet does not display correctly in Safari

2019-04-09 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-9466:

Attachment: YARN-9466.001.patch

> App catalog navigation stylesheet does not display correctly in Safari
> --
>
> Key: YARN-9466
> URL: https://issues.apache.org/jira/browse/YARN-9466
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9466.001.patch, catalog-chrome.png, 
> catalog-safari.png
>
>
> When the navigation sidebar has less content than the table on the right, the 
> navigation bar will shrink to a smaller size in Safari. See the attached 
> screenshots for the problem and the desired look.






[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813811#comment-16813811
 ] 

Íñigo Goiri commented on YARN-999:
--

Compiled locally.
Let's go with [^YARN-999.addendum.patch].

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch, 
> YARN-999.addendum.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9466) App catalog navigation stylesheet does not display correctly in Safari

2019-04-09 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-9466:

Attachment: catalog-safari.png

> App catalog navigation stylesheet does not display correctly in Safari
> --
>
> Key: YARN-9466
> URL: https://issues.apache.org/jira/browse/YARN-9466
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: catalog-chrome.png, catalog-safari.png
>
>
> When the navigation sidebar has less content than the table on the right, the 
> navigation bar shrinks to a smaller size in Safari. See the attached 
> screenshots for the problem and the desired look.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9466) App catalog navigation stylesheet does not display correctly in Safari

2019-04-09 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-9466:

Attachment: catalog-chrome.png

> App catalog navigation stylesheet does not display correctly in Safari
> --
>
> Key: YARN-9466
> URL: https://issues.apache.org/jira/browse/YARN-9466
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: catalog-chrome.png, catalog-safari.png
>
>
> When the navigation sidebar has less content than the table on the right, the 
> navigation bar shrinks to a smaller size in Safari. See the attached 
> screenshots for the problem and the desired look.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9379) Can't specify docker runtime through environment

2019-04-09 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813797#comment-16813797
 ] 

Eric Badger commented on YARN-9379:
---

Thanks for the update, [~caozhiqiang]. The findbugs warning is indeed unrelated. Don't 
worry about the version either. 

As for the patch, it looks pretty good. However, I have a few comments. 

All of the other environment variables that are used in 
{{DockerLinuxContainerRuntime.launchContainer()}} have a validation step against 
a yarn-site.xml config. This lets us fail fast, without invoking the container 
executor, when we already know we won't launch with the given environment 
variable configuration. The container-executor.cfg ({{docker.allowed.runtimes}}) 
is the ultimate source of truth, but the yarn-site.xml configs act as a 
fail-fast first line of defense. So it would be good to add that config and the 
accompanying validation for the allowed runtimes.
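
For reference, a minimal sketch of what such a fail-fast check could look like (the config key and helper class below are assumptions for illustration, not the actual patch or the final property name):

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.runtime.ContainerExecutionException;

/** Hypothetical helper mirroring how other docker env vars are validated. */
final class DockerRuntimeValidator {
  // Assumed yarn-site.xml key; container-executor.cfg stays the source of truth.
  static final String ALLOWED_RUNTIMES_KEY =
      "yarn.nodemanager.runtime.linux.docker.allowed-container-runtimes";

  static void validateRuntime(Configuration conf, Map<String, String> env)
      throws ContainerExecutionException {
    String requested = env.get("YARN_CONTAINER_RUNTIME_DOCKER_RUNTIME");
    if (requested == null || requested.isEmpty()) {
      return; // nothing requested, docker's default runtime applies
    }
    List<String> allowed =
        Arrays.asList(conf.getTrimmedStrings(ALLOWED_RUNTIMES_KEY));
    if (!allowed.contains(requested)) {
      throw new ContainerExecutionException("Docker runtime '" + requested
          + "' is not in the allowed runtimes list " + allowed);
    }
  }
}
{code}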

On the unit test, it looks like you're checking the size of the created docker 
command but not whether the runtime was actually set correctly. Could you add a 
check that the runtime was set to the expected value? You will also want to add 
a test case to verify that the allowed list works. 

> Can't specify docker runtime through environment
> 
>
> Key: YARN-9379
> URL: https://issues.apache.org/jira/browse/YARN-9379
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.3.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Minor
> Attachments: YARN-9379-branch-3.2.0.001.patch, YARN-9379.002.patch, 
> YARN-9379.003.patch, YARN-9379.004.patch
>
>
> When using docker to run YARN containers, even though there is 
> docker.allowed.runtimes in container-executor.cfg, there is no parameter to 
> specify the docker runtime, such as gvisor, lxc or kata. With this patch, the 
> client can add a parameter such as 
> -Dyarn.app.mapreduce.am.env.YARN_CONTAINER_RUNTIME_DOCKER_RUNTIME=runsc 
> to specify the docker runtime.
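
For illustration only, a minimal sketch of setting this from a MapReduce client in code instead of on the command line (the property name and the "runsc" value are taken from the description above; everything else is a generic example):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class DockerRuntimeJobExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Ask the MR AM to pass the docker runtime env var to its containers.
    conf.set("yarn.app.mapreduce.am.env.YARN_CONTAINER_RUNTIME_DOCKER_RUNTIME",
        "runsc");
    Job job = Job.getInstance(conf, "docker-runtime-example");
    // ... configure mapper/reducer/input/output as usual, then job.submit().
  }
}
{code}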



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813795#comment-16813795
 ] 

Giovanni Matteo Fumarola commented on YARN-999:
---

The addendum looks OK.
Should we commit to unblock the branch, or just wait for the Yetus result?

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch, 
> YARN-999.addendum.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-999:
-
Attachment: YARN-999.addendum.patch

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch, 
> YARN-999.addendum.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9466) App catalog navigation stylesheet does not display correctly in Safari

2019-04-09 Thread Eric Yang (JIRA)
Eric Yang created YARN-9466:
---

 Summary: App catalog navigation stylesheet does not display 
correctly in Safari
 Key: YARN-9466
 URL: https://issues.apache.org/jira/browse/YARN-9466
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Eric Yang
Assignee: Eric Yang


When the navigation sidebar has less content than the table on the right, the navigation 
bar shrinks to a smaller size in Safari. See the attached screenshots for the problem 
and the desired look.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813752#comment-16813752
 ] 

Íñigo Goiri commented on YARN-999:
--

Do you prefer an addendum or a new JIRA?

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813745#comment-16813745
 ] 

Steve Loughran commented on YARN-999:
-

No worries, I've gone back one commit locally.

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Reopened] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Steve Loughran (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reopened YARN-999:
-

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813740#comment-16813740
 ] 

Giovanni Matteo Fumarola commented on YARN-999:
---

Thanks [~ste...@apache.org], we are fixing it right now.

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9464) Support "Pending Resource" metrics in RM's RESTful API

2019-04-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813737#comment-16813737
 ] 

Hadoop QA commented on YARN-9464:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
17s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 10 unchanged - 2 fixed = 11 total (was 12) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 45s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 27s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 11s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9464 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965342/YARN-9464-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b11c162342ff 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8ef3bc8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/23921/artifact/out

[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813733#comment-16813733
 ] 

Steve Loughran commented on YARN-999:
-

I think this has broken the build
{code}
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on 
project hadoop-sls: Compilation failure: Compilation failure: 
[ERROR] 
/Users/stevel/Projects/hadoop-trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java:[60,18]
 org.apache.hadoop.yarn.sls.nodemanager.NodeInfo.FakeRMNodeImpl is not abstract 
and does not override abstract method resetUpdatedCapability() in 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNode
[ERROR] 
/Users/stevel/Projects/hadoop-trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java:[47,8]
 org.apache.hadoop.yarn.sls.scheduler.RMNodeWrapper is not abstract and does 
not override abstract method resetUpdatedCapability() in 
org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeI
{code}

Can you fix this ASAP so we don't have to roll things back, or change the 
modified interface to have some default methods. 
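
For reference, a minimal sketch of the "default methods" option (the method name comes from the compiler error above; whether a no-op default is acceptable for the RMNode contract is a design decision for the patch authors):

{code:java}
// Hypothetical excerpt: giving the new RMNode method a default body keeps
// implementations outside the resourcemanager module (e.g. the SLS NodeInfo
// and RMNodeWrapper classes) compiling without modification.
public interface RMNode {
  // ... existing methods elided ...

  /** Clear the "capability was updated" flag; no-op by default. */
  default void resetUpdatedCapability() {
    // intentionally empty
  }
}
{code}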


> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813683#comment-16813683
 ] 

Hudson commented on YARN-999:
-

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16367 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16367/])
YARN-999. In case of long running tasks, reduce node resource should (gifuma: 
rev cfec455c452d85229ef2f9d83e6f7fc827946b59)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerOvercommit.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestAbstractYarnScheduler.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerOvercommit.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/hadoop-metrics2-resourcemanager.properties
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ResourceOption.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerOvercommit.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/hadoop-metrics2.properties
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java


> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9435) Add Opportunistic Scheduler metrics in ResourceManager.

2019-04-09 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813679#comment-16813679
 ] 

Giovanni Matteo Fumarola commented on YARN-9435:


Thanks [~abmodi] for the patch.

Why do we need a Thread.sleep(1000); in the unit test?
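
For illustration, one common alternative to a fixed sleep is to poll the metric until it reaches the expected value or a timeout expires; a minimal, self-contained sketch (the metric getter passed in would be whatever the patch's opportunistic scheduler metrics class exposes, which is an assumption here):

{code:java}
import static org.junit.Assert.assertEquals;

import java.util.function.IntSupplier;

final class MetricWaiter {
  /** Poll a metric until it matches the expected value or the timeout expires. */
  static void waitForMetric(IntSupplier metric, int expected, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (metric.getAsInt() != expected
        && System.currentTimeMillis() < deadline) {
      Thread.sleep(100); // short poll interval instead of one long sleep
    }
    assertEquals(expected, metric.getAsInt());
  }
}
{code}

A test could then call something like {{MetricWaiter.waitForMetric(() -> metrics.getAllocatedContainers(), 2, 10000)}} (getter name illustrative) rather than sleeping for a fixed second.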

> Add Opportunistic Scheduler metrics in ResourceManager.
> ---
>
> Key: YARN-9435
> URL: https://issues.apache.org/jira/browse/YARN-9435
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Abhishek Modi
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-9435.001.patch, YARN-9435.002.patch, 
> YARN-9435.003.patch
>
>
> Right now there are no metrics available for Opportunistic Scheduler at 
> ResourceManager. As part of this jira, we will add metrics like number of 
> allocated opportunistic containers, released opportunistic containers, node 
> level allocations, rack level allocations etc. for Opportunistic Scheduler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread JIRA


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813676#comment-16813676
 ] 

Íñigo Goiri commented on YARN-999:
--

Thank you very much [~giovanni.fumarola]!

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813674#comment-16813674
 ] 

Giovanni Matteo Fumarola commented on YARN-999:
---

Committed to trunk.

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.

2019-04-09 Thread Giovanni Matteo Fumarola (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-999:
--
Fix Version/s: 3.3.0

> In case of long running tasks, reduce node resource should balloon out 
> resource quickly by calling preemption API and suspending running task. 
> ---
>
> Key: YARN-999
> URL: https://issues.apache.org/jira/browse/YARN-999
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, nodemanager, scheduler
>Reporter: Junping Du
>Assignee: Íñigo Goiri
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-291.000.patch, YARN-999.001.patch, 
> YARN-999.002.patch, YARN-999.003.patch, YARN-999.004.patch, 
> YARN-999.005.patch, YARN-999.006.patch, YARN-999.007.patch, 
> YARN-999.008.patch, YARN-999.009.patch, YARN-999.010.patch
>
>
> In the current design and implementation, when we decrease the resource on a node 
> to less than the resource consumption of the currently running tasks, those tasks 
> can keep running until they finish, but no new tasks get assigned to this node 
> (because AvailableResource < 0) until some tasks finish and 
> AvailableResource > 0 again. This is fine for most cases, but for long-running 
> tasks it can be too slow for the resource setting to actually take effect, so 
> preemption could be used here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable

2019-04-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813661#comment-16813661
 ] 

Prabhu Joseph commented on YARN-6929:
-

[~pbacsko] Thanks for reviewing. I have attached the 008 patch addressing the above 
comments. Please review it when you get time.

Below are the details:

1. During an upgrade the job will have some log files in the old app log dir and some 
in the new app log dir. The reader logic has to return node log files from both 
places (see the sketch after this list).

The logic returns an iterator of node files only from the new app log dir if 
{{yarn.nodemanager.remote-app-log-dir-include-older}} is false; otherwise it returns a 
combined iterator that traverses both old and new log files. {{nodeFilesPrev}} 
and {{nodeFilesCur}} are the iterators of the old and new app log dirs respectively.

I have added comments and a few changes in the code to make it more readable.

2. I have used {{diagnosticsMsg}} in all places.

3. {{nodeFilesCur}} can be null only if there is an {{IOException}} (the new app 
log dir does not exist, or there is an error when reading). In this case the captured 
{{diagnosticsMsg}} is thrown; otherwise {{nodeFilesCur}} is returned.

4. {{diagnosticsMsg}} is appended at most twice, and only with 
{{IOException#getMessage()}}, which is a short message without the stack trace.

5. I have addressed this one.
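
For reference, a minimal sketch of the combined-iterator idea from point 1 above (the element type and field names are illustrative assumptions, not the patch's actual code):

{code:java}
import java.util.Iterator;
import java.util.NoSuchElementException;

import org.apache.hadoop.fs.FileStatus;

/** Traverse the old app log dir first, then the new app log dir. */
class CombinedNodeFilesIterator implements Iterator<FileStatus> {
  private final Iterator<FileStatus> nodeFilesPrev;
  private final Iterator<FileStatus> nodeFilesCur;

  CombinedNodeFilesIterator(Iterator<FileStatus> nodeFilesPrev,
      Iterator<FileStatus> nodeFilesCur) {
    this.nodeFilesPrev = nodeFilesPrev;
    this.nodeFilesCur = nodeFilesCur;
  }

  @Override
  public boolean hasNext() {
    return nodeFilesPrev.hasNext() || nodeFilesCur.hasNext();
  }

  @Override
  public FileStatus next() {
    if (nodeFilesPrev.hasNext()) {
      return nodeFilesPrev.next();
    }
    if (nodeFilesCur.hasNext()) {
      return nodeFilesCur.next();
    }
    throw new NoSuchElementException();
  }
}
{code}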

 

> yarn.nodemanager.remote-app-log-dir structure is not scalable
> -
>
> Key: YARN-6929
> URL: https://issues.apache.org/jira/browse/YARN-6929
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-6929-007.patch, YARN-6929-008.patch, 
> YARN-6929.1.patch, YARN-6929.2.patch, YARN-6929.2.patch, YARN-6929.3.patch, 
> YARN-6929.4.patch, YARN-6929.5.patch, YARN-6929.6.patch, YARN-6929.patch
>
>
> The current directory structure for yarn.nodemanager.remote-app-log-dir is 
> not scalable. The default maximum subdirectory limit is 1048576 (HDFS-6102). 
> With a yarn.log-aggregation.retain-seconds retention of 7 days, there is a higher 
> chance that LogAggregationService fails to create a new directory with 
> FSLimitException$MaxDirectoryItemsExceededException.
> The current structure is 
> //logs/. This can be 
> improved by adding the date as a subdirectory, like 
> //logs// 
> {code}
> WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService:
>  Application failed to init aggregation 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
>  
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
>  
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>  
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>  
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366)
>  
> at 
> org

[jira] [Commented] (YARN-9463) Add queueName info when failing with queue capacity sanity check

2019-04-09 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813645#comment-16813645
 ] 

Aihua Xu commented on YARN-9463:


Simple fix: the error now prints out the queue name as well. 
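
For illustration, a hedged sketch of the kind of change (the helper below is hypothetical, not the exact CSQueueUtils code; only the message format from the description is taken as given):

{code:java}
/** Hypothetical capacity sanity check that names the offending queue. */
final class QueueCapacitySanityCheck {
  static void check(String queueName, String label,
      float absCapacity, float absMaxCapacity) {
    if (absCapacity > absMaxCapacity) {
      throw new IllegalArgumentException("Illegal queue capacity setting,"
          + " (abs-capacity=" + absCapacity + ") > (abs-maximum-capacity="
          + absMaxCapacity + ") for queue=" + queueName
          + ". When label=" + label);
    }
  }
}
{code}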

> Add queueName info when failing with queue capacity sanity check
> 
>
> Key: YARN-9463
> URL: https://issues.apache.org/jira/browse/YARN-9463
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 2.9.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Trivial
> Attachments: YARN-9463.1.patch
>
>
> In the queue sanity check of CSQueueUtils.java, we are throwing "Illegal queue 
> capacity setting, (abs-capacity=0.00160782) > 
> (abs-maximum-capacity=0.0016027201). When label=[]". It would be better to add the 
> queue name so an admin can identify the problematic queue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9463) Add queueName info when failing with queue capacity sanity check

2019-04-09 Thread Aihua Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated YARN-9463:
---
Attachment: YARN-9463.1.patch

> Add queueName info when failing with queue capacity sanity check
> 
>
> Key: YARN-9463
> URL: https://issues.apache.org/jira/browse/YARN-9463
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 2.9.1
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Trivial
> Attachments: YARN-9463.1.patch
>
>
> In the queue sanity check of CSQueueUtils.java, we are throwing "Illegal queue 
> capacity setting, (abs-capacity=0.00160782) > 
> (abs-maximum-capacity=0.0016027201). When label=[]". It would be better to add the 
> queue name so an admin can identify the problematic queue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable

2019-04-09 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-6929:

Attachment: YARN-6929-008.patch

> yarn.nodemanager.remote-app-log-dir structure is not scalable
> -
>
> Key: YARN-6929
> URL: https://issues.apache.org/jira/browse/YARN-6929
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-6929-007.patch, YARN-6929-008.patch, 
> YARN-6929.1.patch, YARN-6929.2.patch, YARN-6929.2.patch, YARN-6929.3.patch, 
> YARN-6929.4.patch, YARN-6929.5.patch, YARN-6929.6.patch, YARN-6929.patch
>
>
> The current directory structure for yarn.nodemanager.remote-app-log-dir is 
> not scalable. The default maximum subdirectory limit is 1048576 (HDFS-6102). 
> With a yarn.log-aggregation.retain-seconds retention of 7 days, there is a higher 
> chance that LogAggregationService fails to create a new directory with 
> FSLimitException$MaxDirectoryItemsExceededException.
> The current structure is 
> //logs/. This can be 
> improved by adding the date as a subdirectory, like 
> //logs// 
> {code}
> WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService:
>  Application failed to init aggregation 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
>  
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
>  
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>  
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>  
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
>  
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>  
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) 
> at java.lang.Thread.run(Thread.java:745) 
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072)
>  
> at 
> org.apache.hadoop.hdfs.server.namenod

[jira] [Commented] (YARN-7848) Force removal of docker containers that do not get removed on first try

2019-04-09 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813627#comment-16813627
 ] 

Eric Yang commented on YARN-7848:
-

[~ebadger] [~Jim_Brennan] The failed unit test is not related to patch 003. 
Please review. Thanks.

> Force removal of docker containers that do not get removed on first try
> ---
>
> Key: YARN-7848
> URL: https://issues.apache.org/jira/browse/YARN-7848
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-7848.001.patch, YARN-7848.002.patch, 
> YARN-7848.003.patch
>
>
> After the addition of YARN-5366, containers will get removed after a certain 
> debug delay. However, this is a one-time effort. If the removal fails for 
> whatever reason, the container will persist. We need to add a mechanism for a 
> forced removal of those containers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9439) Support asynchronized scheduling mode and multi-node lookup mechanism for app activities

2019-04-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813531#comment-16813531
 ] 

Hadoop QA commented on YARN-9439:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
15s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 57s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 76m 
51s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}126m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9439 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965321/YARN-9439.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0f0017efe62a 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 73f43ac |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/23919/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-warnings.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23919/testReport/ |
| Max. process+thread count | 886 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn

[jira] [Created] (YARN-9465) Support to alter table properties in HBaseTimelineSchemaCreator

2019-04-09 Thread Tarun Parimi (JIRA)
Tarun Parimi created YARN-9465:
--

 Summary: Support to alter table properties in 
HBaseTimelineSchemaCreator
 Key: YARN-9465
 URL: https://issues.apache.org/jira/browse/YARN-9465
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelinereader
Affects Versions: 3.2.0
Reporter: Tarun Parimi
Assignee: Tarun Parimi


HBaseTimelineSchemaCreator currently only creates tables if they don't exist. 
Creating HBase tables without altering them is the desired behavior for most of 
the use cases.

However, in certain scenarios we might need to alter tables.
For example, after an upgrade we might need to point to a new coprocessor jar in 
yarn.timeline-service.hbase.coprocessor.jar.hdfs.location.
A user might also want to change the TTL of the tables afterwards to control 
the data retention. Currently a user has to manually find the tables related to 
ATSv2 and alter them with the HBase shell, which is not straightforward.

To support such scenarios, it would be useful to have an option in 
HBaseTimelineSchemaCreator to alter tables if required.
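
For illustration, a minimal sketch of what an "alter if required" step could do, assuming the HBase 2.x Admin API (the table name and TTL value are placeholders):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

public class AlterTimelineTableTtl {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Placeholder table name; the schema creator would iterate its own tables.
      TableName table = TableName.valueOf("prod.timelineservice.entity");
      TableDescriptor current = admin.getDescriptor(table);
      TableDescriptorBuilder builder = TableDescriptorBuilder.newBuilder(current);
      // Set the TTL of every column family to 30 days (placeholder value).
      for (ColumnFamilyDescriptor cf : current.getColumnFamilies()) {
        builder.modifyColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(cf)
            .setTimeToLive(30 * 24 * 60 * 60)
            .build());
      }
      admin.modifyTable(builder.build());
    }
  }
}
{code}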



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9464) Support "Pending Resource" metrics in RM's RESTful API

2019-04-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813460#comment-16813460
 ] 

Prabhu Joseph commented on YARN-9464:
-

[~tangzhankun] I have updated the pending resource for the Cluster Metrics. I 
think it is not applicable to node-level metrics.
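
For illustration only, a minimal sketch of how an auto-scaler could read such a 
field from the cluster metrics endpoint once it is exposed; the 
pendingMB/pendingVirtualCores field names below are assumptions, not the final API:
{code:java}
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class PendingResourceProbe {
  public static void main(String[] args) throws Exception {
    // The RM web address is a placeholder for the example.
    String rm = args.length > 0 ? args[0] : "http://rmnode:8088";
    HttpURLConnection conn =
        (HttpURLConnection) new URL(rm + "/ws/v1/cluster/metrics").openConnection();
    conn.setRequestProperty("Accept", "application/json");
    try (InputStream in = conn.getInputStream()) {
      JsonNode metrics = new ObjectMapper().readTree(in).path("clusterMetrics");
      // "availableMB" is already exposed today; the pending fields are what this
      // JIRA proposes to add.
      System.out.println("availableMB = " + metrics.path("availableMB").asLong());
      System.out.println("pendingMB = " + metrics.path("pendingMB").asLong());
      System.out.println("pendingVirtualCores = "
          + metrics.path("pendingVirtualCores").asLong());
    } finally {
      conn.disconnect();
    }
  }
}
{code}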

> Support "Pending Resource" metrics in RM's RESTful API
> --
>
> Key: YARN-9464
> URL: https://issues.apache.org/jira/browse/YARN-9464
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9464-001.patch
>
>
> Knowing only the "available" and "used" resources is not enough for YARN 
> management tools like an auto-scaler. It would be helpful for diagnosing 
> cluster resource utilization if they could get the "Pending Resource" from the 
> RM RESTful APIs. To a certain extent, it represents how starved the 
> applications are.
> Initially, we can add "pending resource" information in the two RM REST APIs 
> below:
> {code:java}
> RMnode:port/ws/v1/cluster/metrics
> RMnode:port/ws/v1/cluster/nodes
> {code}
>  
>  






[jira] [Commented] (YARN-7282) Shared Cache Phase 2

2019-04-09 Thread Laurenceau Julien (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813446#comment-16813446
 ] 

Laurenceau Julien commented on YARN-7282:
-

Hi, is there any chance of extending (in the long term) this YARN cache to 
support data caching like Apache Ignite?
Regards

 

> Shared Cache Phase 2
> 
>
> Key: YARN-7282
> URL: https://issues.apache.org/jira/browse/YARN-7282
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Chris Trezzo
>Priority: Major
>
> Phase 2 will address more features that need to be built as part of the 
> shared cache project. See YARN-1492 for the first release of the shared cache.






[jira] [Updated] (YARN-9464) Support "Pending Resource" metrics in RM's RESTful API

2019-04-09 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9464:

Attachment: YARN-9464-001.patch

> Support "Pending Resource" metrics in RM's RESTful API
> --
>
> Key: YARN-9464
> URL: https://issues.apache.org/jira/browse/YARN-9464
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9464-001.patch
>
>
> Knowing only the "available" and "used" resources is not enough for YARN 
> management tools like an auto-scaler. It would be helpful for diagnosing 
> cluster resource utilization if they could get the "Pending Resource" from the 
> RM RESTful APIs. To a certain extent, it represents how starved the 
> applications are.
> Initially, we can add "pending resource" information in the two RM REST APIs 
> below:
> {code:java}
> RMnode:port/ws/v1/cluster/metrics
> RMnode:port/ws/v1/cluster/nodes
> {code}
>  
>  






[jira] [Updated] (YARN-9464) Support "Pending Resource" metrics in RM's RESTful API

2019-04-09 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9464:

Description: 
Knowing only the "available" and "used" resources is not enough for YARN 
management tools like an auto-scaler. It would be helpful for diagnosing cluster 
resource utilization if they could get the "Pending Resource" from the RM 
RESTful APIs. To a certain extent, it represents how starved the applications are.

Initially, we can add "pending resource" information in the two RM REST APIs below:
{code:java}
RMnode:port/ws/v1/cluster/metrics
RMnode:port/ws/v1/cluster/nodes
{code}
 

 

  was:
Knowing only the "available" and "used" resources is not enough for YARN 
management tools like an auto-scaler. It would be helpful for diagnosing cluster 
resource utilization if they could get the "Pending Resource" from the RM 
RESTful APIs. To a certain extent, it represents how starved the applications are.

Initially, we can add "pending resource" information in the two RM REST APIs below:
{code:java}
RMnode:port/ws/v1/cluster/metrics
RMnode:port/ws/v1/cluster/metrics/nodes
{code}
 

 


> Support "Pending Resource" metrics in RM's RESTful API
> --
>
> Key: YARN-9464
> URL: https://issues.apache.org/jira/browse/YARN-9464
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Zhankun Tang
>Assignee: Prabhu Joseph
>Priority: Major
>
> Knowing only the "available" and "used" resources is not enough for YARN 
> management tools like an auto-scaler. It would be helpful for diagnosing 
> cluster resource utilization if they could get the "Pending Resource" from the 
> RM RESTful APIs. To a certain extent, it represents how starved the 
> applications are.
> Initially, we can add "pending resource" information in the two RM REST APIs 
> below:
> {code:java}
> RMnode:port/ws/v1/cluster/metrics
> RMnode:port/ws/v1/cluster/nodes
> {code}
>  
>  






[jira] [Commented] (YARN-8920) LogAggregation should be configurable to allow writing to underlying storage as appOwner or yarn user

2019-04-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813406#comment-16813406
 ] 

Hadoop QA commented on YARN-8920:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} YARN-8920 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8920 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12945250/YARN-8920.6.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/23920/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> LogAggregation should be configurable to allow writing to underlying storage 
> as appOwner or yarn user
> -
>
> Key: YARN-8920
> URL: https://issues.apache.org/jira/browse/YARN-8920
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, yarn
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-8920.1.patch, YARN-8920.2.patch, YARN-8920.3.patch, 
> YARN-8920.4.patch, YARN-8920.5.patch, YARN-8920.6.patch
>
>
> Currently NM Log Aggregation does not support writing to the underlying storage 
> as the "yarn" user. This would be needed when writing to storages like S3, 
> which do not support POSIX-compliant ACLs: a single access key would be used 
> for writes, and app owners would be allowed to read the logs with their own 
> access keys.






[jira] [Commented] (YARN-9433) Remove unused constants from RMAuditLogger

2019-04-09 Thread Igor Rudenko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813355#comment-16813355
 ] 

Igor Rudenko commented on YARN-9433:


Regarding the hadoop-yetus 
[analysis|https://github.com/apache/hadoop/pull/706#issuecomment-480943190]:

{{test4tests}}: only unused code was deleted, so there is no reason to add or 
modify any unit tests
{{findbugs}}: the 2 warnings aren't related to the code changes
{{unit}}: the failures aren't caused by the fix
{{asflicense}}: the license warning is about a missing license header, but the 
modified file does have the license header

> Remove unused constants from RMAuditLogger
> --
>
> Key: YARN-9433
> URL: https://issues.apache.org/jira/browse/YARN-9433
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Priority: Minor
>  Labels: newbie
>
> There are some unused constants in RMAuditLogger that IntelliJ warns you 
> about.
> Currently what I'm seeing is that the following {{public static final 
> String}} constants are unused:
>  * AM_ALLOCATE
>  * CHANGE_CONTAINER_RESOURCE
>  * CREATE_NEW_RESERVATION_REQUEST
> Probably they are no longer needed. This task aims to remove those unused 
> constants.






[jira] [Commented] (YARN-9102) Log Aggregation is failing with S3A FileSystem for IFile Format

2019-04-09 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813330#comment-16813330
 ] 

Adam Antal commented on YARN-9102:
--

Could you please provide us with a full stack trace [~vamshikrishna.t]?

> Log Aggregation is failing with S3A FileSystem for IFile Format
> ---
>
> Key: YARN-9102
> URL: https://issues.apache.org/jira/browse/YARN-9102
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation, nodemanager, resourcemanager, yarn
>Affects Versions: 3.1.1
>Reporter: VAMSHI KRISHNA
>Priority: Major
>
> Log aggregation for applications is failing in Hadoop when we configure the 
> indexed (IFile) file format with S3A as the filesystem. The NodeManager logs 
> show a FileNotFoundException.






[jira] [Commented] (YARN-8920) LogAggregation should be configurable to allow writing to underlying storage as appOwner or yarn user

2019-04-09 Thread Adam Antal (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813320#comment-16813320
 ] 

Adam Antal commented on YARN-8920:
--

Hi [~suma.shivaprasad],
Are you still working on this issue?

> LogAggregation should be configurable to allow writing to underlying storage 
> as appOwner or yarn user
> -
>
> Key: YARN-8920
> URL: https://issues.apache.org/jira/browse/YARN-8920
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, yarn
>Reporter: Suma Shivaprasad
>Assignee: Suma Shivaprasad
>Priority: Major
> Attachments: YARN-8920.1.patch, YARN-8920.2.patch, YARN-8920.3.patch, 
> YARN-8920.4.patch, YARN-8920.5.patch, YARN-8920.6.patch
>
>
> Currently NM Log Aggregation does not support writing to the underlying storage 
> as the "yarn" user. This would be needed when writing to storages like S3, 
> which do not support POSIX-compliant ACLs: a single access key would be used 
> for writes, and app owners would be allowed to read the logs with their own 
> access keys.






[jira] [Commented] (YARN-9439) Support asynchronized scheduling mode and multi-node lookup mechanism for app activities

2019-04-09 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813306#comment-16813306
 ] 

Tao Yang commented on YARN-9439:


Attached v3 patch to fix UT errors.

> Support asynchronized scheduling mode and multi-node lookup mechanism for app 
> activities
> 
>
> Key: YARN-9439
> URL: https://issues.apache.org/jira/browse/YARN-9439
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9439.001.patch, YARN-9439.002.patch, 
> YARN-9439.003.patch
>
>
> [Design 
> doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m051gyiikx7c]
>  






[jira] [Updated] (YARN-9439) Support asynchronized scheduling mode and multi-node lookup mechanism for app activities

2019-04-09 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-9439:
---
Attachment: YARN-9439.003.patch

> Support asynchronized scheduling mode and multi-node lookup mechanism for app 
> activities
> 
>
> Key: YARN-9439
> URL: https://issues.apache.org/jira/browse/YARN-9439
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9439.001.patch, YARN-9439.002.patch, 
> YARN-9439.003.patch
>
>
> [Design 
> doc|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m051gyiikx7c]
>  






[jira] [Assigned] (YARN-6356) Allow different values of yarn.log-aggregation.retain-seconds for succeeded and failed jobs

2019-04-09 Thread Adam Antal (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal reassigned YARN-6356:


Assignee: Adam Antal

> Allow different values of yarn.log-aggregation.retain-seconds for succeeded 
> and failed jobs
> ---
>
> Key: YARN-6356
> URL: https://issues.apache.org/jira/browse/YARN-6356
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation
>Reporter: Robert Kanter
>Assignee: Adam Antal
>Priority: Major
>
> It would be useful to have a value of {{yarn.log-aggregation.retain-seconds}} 
> for succeeded jobs and a different value for failed/killed jobs.  For jobs 
> that succeeded, you typically don't care about the logs, so a shorter 
> retention time is fine (and saves space/blocks in HDFS).  For jobs that 
> failed or were killed, the logs are much more important, and you'll likely 
> want to keep them around for longer so you have time to look at them.
> For instance, you could set it to keep logs for succeeded jobs for 1 day and 
> logs for failed/killed jobs for 1 week.






[jira] [Commented] (YARN-9443) Fast RM Failover using Ratis (Raft protocol)

2019-04-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813127#comment-16813127
 ] 

Prabhu Joseph commented on YARN-9443:
-

[~adam.antal] Thanks for checking this one. I plan to prepare a design doc and 
will update.

> Fast RM Failover using Ratis (Raft protocol)
> 
>
> Key: YARN-9443
> URL: https://issues.apache.org/jira/browse/YARN-9443
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> During failover, the standby RM will have a lag as it has to recover from the 
> ZooKeeper / FileSystem StateStore. RM HA using Ratis (the Raft protocol) can 
> achieve fast failover as all RMs are already in sync. This approach is used by 
> Ozone - HDDS-505.
>  
> cc [~nandakumar131]






[jira] [Assigned] (YARN-950) Ability to limit or avoid aggregating logs beyond a certain size

2019-04-09 Thread Adam Antal (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal reassigned YARN-950:
---

Assignee: Adam Antal

> Ability to limit or avoid aggregating logs beyond a certain size
> 
>
> Key: YARN-950
> URL: https://issues.apache.org/jira/browse/YARN-950
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: log-aggregation, nodemanager
>Affects Versions: 0.23.9, 2.6.0
>Reporter: Jason Lowe
>Assignee: Adam Antal
>Priority: Major
>
> It would be nice if ops could configure a cluster such that any container log 
> beyond a configured size would either have only a portion of the log 
> aggregated or not be aggregated at all. This would help speed up the recovery 
> path for cases where a container creates an enormous log and fills a disk, as 
> currently it tries to aggregate the entire, enormous log rather than only 
> aggregating a small portion or simply deleting it.






[jira] [Commented] (YARN-9379) Can't specify docker runtime through environment

2019-04-09 Thread caozhiqiang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813113#comment-16813113
 ] 

caozhiqiang commented on YARN-9379:
---

Hello [~ebadger], Hadoop QA shows findbugs warnings in 
NodeHealthCheckerService.java, but I didn't make any changes to this file. Also, 
the findbugs version shown is v3.1.0-RC1. Could you give me some suggestions?

> Can't specify docker runtime through environment
> 
>
> Key: YARN-9379
> URL: https://issues.apache.org/jira/browse/YARN-9379
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.3.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Minor
> Attachments: YARN-9379-branch-3.2.0.001.patch, YARN-9379.002.patch, 
> YARN-9379.003.patch, YARN-9379.004.patch
>
>
> When using docker to run YARN containers, even though docker.allowed.runtimes 
> exists in container-executor.cfg, there is no parameter to specify the docker 
> runtime, such as gvisor, lxc or kata. With this patch, a client can add a 
> parameter such as 
> -Dyarn.app.mapreduce.am.env.YARN_CONTAINER_RUNTIME_DOCKER_RUNTIME=runsc 
> to specify the docker runtime.
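
For illustration only (not part of the patch), a sketch of how a YARN client 
could pass the proposed variable through the container environment, alongside 
the existing YARN_CONTAINER_RUNTIME_TYPE / YARN_CONTAINER_RUNTIME_DOCKER_IMAGE 
variables; the image name and command are placeholders:
{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

public class DockerRuntimeEnvSketch {
  public static ContainerLaunchContext buildContext() {
    Map<String, String> env = new HashMap<>();
    env.put("YARN_CONTAINER_RUNTIME_TYPE", "docker");
    env.put("YARN_CONTAINER_RUNTIME_DOCKER_IMAGE", "library/centos:7");
    // Proposed by this JIRA: select an alternative OCI runtime such as runsc (gVisor).
    env.put("YARN_CONTAINER_RUNTIME_DOCKER_RUNTIME", "runsc");
    List<String> commands = new ArrayList<>(Collections.singletonList("sleep 60"));
    return ContainerLaunchContext.newInstance(
        Collections.emptyMap(), // no local resources for this sketch
        env,
        commands,
        null, null, null);      // service data, tokens, ACLs not needed here
  }
}
{code}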






[jira] [Updated] (YARN-9379) Can't specify docker runtime through environment

2019-04-09 Thread caozhiqiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

caozhiqiang updated YARN-9379:
--
Target Version/s:   (was: 3.2.1)

> Can't specify docker runtime through environment
> 
>
> Key: YARN-9379
> URL: https://issues.apache.org/jira/browse/YARN-9379
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.3.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Minor
> Attachments: YARN-9379-branch-3.2.0.001.patch, YARN-9379.002.patch, 
> YARN-9379.003.patch, YARN-9379.004.patch
>
>
> When using docker to run YARN containers, even though docker.allowed.runtimes 
> exists in container-executor.cfg, there is no parameter to specify the docker 
> runtime, such as gvisor, lxc or kata. With this patch, a client can add a 
> parameter such as 
> -Dyarn.app.mapreduce.am.env.YARN_CONTAINER_RUNTIME_DOCKER_RUNTIME=runsc 
> to specify the docker runtime.






[jira] [Commented] (YARN-9379) Can't specify docker runtime through environment

2019-04-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813063#comment-16813063
 ] 

Hadoop QA commented on YARN-9379:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
58s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 in trunk has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 54s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
56s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9379 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12965271/YARN-9379.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 45cc0766f0d8 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2d4f6b6 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-YARN-Build/23918/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-warnings.html
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/23918/testReport/ |
| Max. process+thread count | 420 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-