[jira] [Commented] (YARN-3792) Test case failures in TestDistributedShell and some issue fixes related to ATSV2
[ https://issues.apache.org/jira/browse/YARN-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594455#comment-14594455 ] Naganarasimha G R commented on YARN-3792: - Hi [~sjlee0], I missed checking your comment yesterday; will get this done ASAP. And oops, I also missed your second comment to fix earlier... Test case failures in TestDistributedShell and some issue fixes related to ATSV2 Key: YARN-3792 URL: https://issues.apache.org/jira/browse/YARN-3792 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Naganarasimha G R Assignee: Naganarasimha G R Attachments: YARN-3792-YARN-2928.001.patch, YARN-3792-YARN-2928.002.patch # Encountered [test case failures|https://builds.apache.org/job/PreCommit-YARN-Build/8233/testReport/] which were happening even without the patch modifications in YARN-3044: TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow, TestDistributedShell.testDSShellWithoutDomainV2DefaultFlow, TestDistributedShellWithNodeLabels.testDSShellWithNodeLabelExpression # Remove unused {{enableATSV1}} in TestDistributedShell # Container metrics need to be published only for the v2 test cases of TestDistributedShell # A NullPointerException was thrown in TimelineClientImpl.constructResURI when the aux service was not configured and {{TimelineClient.putObjects}} was getting invoked. # Race condition between the publishing of application events and the test case verification of the RM's ApplicationFinished timeline events # Application tags are converted to lowercase in ApplicationSubmissionContextPBImpl, hence RMTimelineCollector was not able to detect the custom flow details of the app (see the sketch below) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
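To illustrate the last point above, here is a minimal, self-contained Java sketch (not the actual YARN code; the tag prefix name is purely illustrative): if the submission context lowercases application tags while the collector matches the flow tag prefix case-sensitively, the lookup misses, whereas a case-insensitive prefix match tolerates the lowercasing.
{code:java}
// Illustrative sketch only. The prefix "TIMELINE_FLOW_NAME_TAG:" is a
// stand-in name, not a confirmed YARN constant.
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

public class FlowTagLookup {
  private static final String FLOW_NAME_PREFIX = "TIMELINE_FLOW_NAME_TAG:";

  public static String extractFlowName(Set<String> applicationTags) {
    for (String tag : applicationTags) {
      // Compare prefixes ignoring case, since the PB impl may have
      // lowercased the stored tag.
      if (tag.regionMatches(true, 0, FLOW_NAME_PREFIX, 0, FLOW_NAME_PREFIX.length())) {
        return tag.substring(FLOW_NAME_PREFIX.length());
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Set<String> tags = new TreeSet<>();
    // The submission context stored the tag lowercased:
    tags.add("TIMELINE_FLOW_NAME_TAG:my_flow".toLowerCase(Locale.ENGLISH));
    System.out.println(extractFlowName(tags)); // prints "my_flow"
  }
}
{code}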
[jira] [Updated] (YARN-3827) Migrate YARN native build to new CMake framework
[ https://issues.apache.org/jira/browse/YARN-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Burlison updated YARN-3827: Attachment: YARN-3827.001.patch Patch to migrate YARN over to the new CMake infrastructure. Requires HADOOP-12036 Migrate YARN native build to new CMake framework Key: YARN-3827 URL: https://issues.apache.org/jira/browse/YARN-3827 Project: Hadoop YARN Issue Type: Sub-task Components: build Affects Versions: 2.7.0 Reporter: Alan Burlison Assignee: Alan Burlison Attachments: YARN-3827.001.patch As per HADOOP-12036, the CMake infrastructure should be refactored and made common across all Hadoop components. This bug covers the migration of YARN to the new CMake infrastructure. This change will also add support for building YARN Native components on Solaris. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3116) [Collector wireup] We need an assured way to determine if a container is an AM container on NM
[ https://issues.apache.org/jira/browse/YARN-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594538#comment-14594538 ] Hadoop QA commented on YARN-3116: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 42s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 18s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 2m 1s | Tests passed in hadoop-yarn-common. | | {color:red}-1{color} | yarn tests | 50m 50s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 98m 29s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12740712/YARN-3116.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 20c03c9 | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8298/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8298/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8298/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8298/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8298/console | This message was automatically generated. [Collector wireup] We need an assured way to determine if a container is an AM container on NM -- Key: YARN-3116 URL: https://issues.apache.org/jira/browse/YARN-3116 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, timelineserver Reporter: Zhijie Shen Assignee: Giovanni Matteo Fumarola Attachments: YARN-3116.patch In YARN-3030, to start the per-app aggregator only for a started AM container, we need to determine if the container is an AM container or not from the context in NM (we can do it on RM). 
This information is missing, so we worked around it by treating the container with ID ending in _01 as the AM container. Unfortunately, this is neither a necessary nor a sufficient condition. We need a way to determine if a container is an AM container on the NM. We can add a flag to the container object or create an API to make the judgement. Perhaps the distributed AM information may also be useful to YARN-2877. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
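A small illustrative sketch of the two approaches discussed above (hypothetical names, not an actual YARN API): the container-ID heuristic versus an explicit flag carried with the container.
{code:java}
// Illustrative only. ContainerStartContext and isAMContainer() are
// hypothetical names used to show the shape of a flag-based check.
public class AmContainerCheck {

  // Fragile heuristic: assumes the AM is always container number 1 of the
  // attempt, which work-preserving restarts or other launch orders can break.
  static boolean looksLikeAmByIdSuffix(String containerId) {
    return containerId.endsWith("_000001");
  }

  // Hypothetical explicit signal carried in the container start context.
  static final class ContainerStartContext {
    private final boolean amContainer;
    ContainerStartContext(boolean amContainer) { this.amContainer = amContainer; }
    boolean isAMContainer() { return amContainer; }
  }

  static boolean isAm(ContainerStartContext ctx) {
    return ctx.isAMContainer();
  }

  public static void main(String[] args) {
    System.out.println(looksLikeAmByIdSuffix("container_1431957472783_0001_01_000001")); // true
    System.out.println(isAm(new ContainerStartContext(true)));                            // true
  }
}
{code}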
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594468#comment-14594468 ] Hadoop QA commented on YARN-3051: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 56s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 0s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 0s | The patch does not introduce any new Findbugs (version ) warnings. | | | | 35m 25s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12740776/YARN-3051.Reader_API_4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 20c03c9 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8295/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8295/console | This message was automatically generated. [Storage abstraction] Create backing storage read interface for ATS readers --- Key: YARN-3051 URL: https://issues.apache.org/jira/browse/YARN-3051 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Varun Saxena Attachments: YARN-3051-YARN-2928.003.patch, YARN-3051-YARN-2928.03.patch, YARN-3051-YARN-2928.04.patch, YARN-3051.Reader_API.patch, YARN-3051.Reader_API_1.patch, YARN-3051.Reader_API_2.patch, YARN-3051.Reader_API_3.patch, YARN-3051.Reader_API_4.patch, YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch Per design in YARN-2928, create backing storage read interface that can be implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3176) In Fair Scheduler, child queue should inherit maxApp from its parent
[ https://issues.apache.org/jira/browse/YARN-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594500#comment-14594500 ] Hadoop QA commented on YARN-3176: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 27s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 43s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 51s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 32s | The applied patch generated 10 new checkstyle issues (total was 0, now 10). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 51m 7s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 89m 43s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12740742/YARN-3176.v2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 20c03c9 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8297/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8297/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8297/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8297/console | This message was automatically generated. In Fair Scheduler, child queue should inherit maxApp from its parent Key: YARN-3176 URL: https://issues.apache.org/jira/browse/YARN-3176 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-3176.v1.patch, YARN-3176.v2.patch if the child queue does not have a maxRunningApp limit, it will use the queueMaxAppsDefault. This behavior is not quite right, since queueMaxAppsDefault is normally a small number, whereas some parent queues do have maxRunningApp set to be more than the default -- This message was sent by Atlassian JIRA (v6.3.4#6332)
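A minimal sketch of the inheritance behavior proposed above (an assumption about intent, not the FairScheduler's actual code): a queue with no explicit maxRunningApps takes the nearest configured ancestor's value, and only falls back to queueMaxAppsDefault when nothing is set on the path.
{code:java}
import java.util.HashMap;
import java.util.Map;

public class MaxAppsResolution {
  static final int QUEUE_MAX_APPS_DEFAULT = 5;

  // queue name -> explicitly configured maxRunningApps (absent = not configured)
  static int resolveMaxApps(String queue, Map<String, Integer> configured) {
    for (String q = queue; q != null; q = parentOf(q)) {
      Integer v = configured.get(q);
      if (v != null) {
        return v;                    // inherit the closest configured ancestor value
      }
    }
    return QUEUE_MAX_APPS_DEFAULT;   // nothing configured anywhere on the path
  }

  static String parentOf(String queue) {
    int idx = queue.lastIndexOf('.');
    return idx < 0 ? null : queue.substring(0, idx);
  }

  public static void main(String[] args) {
    Map<String, Integer> conf = new HashMap<>();
    conf.put("root.marketing", 200);  // parent has a large limit
    System.out.println(resolveMaxApps("root.marketing.adhoc", conf)); // 200, not 5
  }
}
{code}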
[jira] [Updated] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster
[ https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3779: --- Attachment: YARN-3779.03.patch Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster -- Key: YARN-3779 URL: https://issues.apache.org/jira/browse/YARN-3779 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Environment: mrV2, secure mode Reporter: Zhang Wei Assignee: Varun Saxena Priority: Critical Attachments: YARN-3779.01.patch, YARN-3779.02.patch, YARN-3779.03.patch, log_aggr_deletion_on_refresh_error.log, log_aggr_deletion_on_refresh_fix.log {{GSSException}} is thrown everytime log aggregation deletion is attempted after executing bin/mapred hsadmin -refreshLogRetentionSettings in a secure cluster. The problem can be reproduced by following steps: 1. startup historyserver in secure cluster. 2. Log deletion happens as per expectation. 3. execute {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh the configuration value. 4. All the subsequent attempts of log deletion fail with {{GSSException}} Following exception can be found in historyserver's log if log deletion is enabled. {noformat} 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127 java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: vm-31/9.91.12.31; destination host is: vm-33:25000; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) at org.apache.hadoop.ipc.Client.call(Client.java:1414) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy9.getListing(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy10.getListing(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691) at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102) at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753) at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749) at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) Caused by: java.io.IOException: javax.security.sasl.SaslException: 
GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641) at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462) at org.apache.hadoop.ipc.Client.call(Client.java:1381) ... 21 more Caused by: javax.security.sasl.SaslException: GSS
[jira] [Commented] (YARN-3835) hadoop-yarn-server-resourcemanager test package bundles core-site.xml, yarn-site.xml
[ https://issues.apache.org/jira/browse/YARN-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594497#comment-14594497 ] Hadoop QA commented on YARN-3835: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 43s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | yarn tests | 50m 50s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 85m 19s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12740772/YARN-3835.patch | | Optional Tests | javadoc javac unit | | git revision | trunk / 20c03c9 | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8296/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8296/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8296/console | This message was automatically generated. hadoop-yarn-server-resourcemanager test package bundles core-site.xml, yarn-site.xml Key: YARN-3835 URL: https://issues.apache.org/jira/browse/YARN-3835 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Vamsee Yarlagadda Assignee: Vamsee Yarlagadda Priority: Minor Attachments: YARN-3835.patch It looks like by default yarn is bundling core-site.xml, yarn-site.xml in test artifact of hadoop-yarn-server-resourcemanager which means that any downstream project which uses this a dependency can have a problem in picking up the user supplied/environment supplied core-site.xml, yarn-site.xml So we should ideally exclude these .xml files from being bundled into the test-jar. (Similar to YARN-1748) I also proactively looked at other YARN modules where this might be happening. {code} vamsee-MBP:hadoop-yarn-project vamsee$ find . 
-name *-site.xml ./hadoop-yarn/conf/yarn-site.xml ./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/resources/yarn-site.xml ./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/resources/yarn-site.xml ./hadoop-yarn/hadoop-yarn-client/src/test/resources/core-site.xml ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/core-site.xml ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/core-site.xml ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site.xml ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/core-site.xml ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/yarn-site.xml ./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/resources/core-site.xml {code} And out of these only two modules (hadoop-yarn-server-resourcemanager, hadoop-yarn-server-tests) are building test-jars. In future, if we start building test-jar of other modules, we should exclude these xml files from being bundled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster
[ https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594551#comment-14594551 ] Varun Saxena commented on YARN-3779: Added a patch and submitted it, fixing both cases. This JIRA should move to MAPREDUCE. But not moving it because not sure if Jenkins will be able to post results of the submitted patch then Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster -- Key: YARN-3779 URL: https://issues.apache.org/jira/browse/YARN-3779 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Environment: mrV2, secure mode Reporter: Zhang Wei Assignee: Varun Saxena Priority: Critical Attachments: YARN-3779.01.patch, YARN-3779.02.patch, YARN-3779.03.patch, log_aggr_deletion_on_refresh_error.log, log_aggr_deletion_on_refresh_fix.log {{GSSException}} is thrown everytime log aggregation deletion is attempted after executing bin/mapred hsadmin -refreshLogRetentionSettings in a secure cluster. The problem can be reproduced by following steps: 1. startup historyserver in secure cluster. 2. Log deletion happens as per expectation. 3. execute {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh the configuration value. 4. All the subsequent attempts of log deletion fail with {{GSSException}} Following exception can be found in historyserver's log if log deletion is enabled. {noformat} 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127 java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: vm-31/9.91.12.31; destination host is: vm-33:25000; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) at org.apache.hadoop.ipc.Client.call(Client.java:1414) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy9.getListing(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy10.getListing(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691) at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102) at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753) at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749) at 
org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641) at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
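A hedged sketch of one possible direction for the fix described above (an assumption, not the attached patch): run the deletion task's filesystem access as the service's login user and re-login from the keytab, so a settings refresh does not leave the timer task without Kerberos credentials. The remote root log dir below is a placeholder.
{code:java}
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class SecureLogDeletionSketch {
  public static FileStatus[] listRemoteRootLogDir(final Configuration conf,
      final Path remoteRootLogDir) throws Exception {
    UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
    // Re-acquire a TGT if it has expired since the last refresh.
    loginUser.checkTGTAndReloginFromKeytab();
    // Perform the HDFS listing with the login user's Kerberos credentials so a
    // refreshLogRetentionSettings does not leave the timer task without a TGT.
    return loginUser.doAs(new PrivilegedExceptionAction<FileStatus[]>() {
      @Override
      public FileStatus[] run() throws Exception {
        FileSystem fs = remoteRootLogDir.getFileSystem(conf);
        return fs.listStatus(remoteRootLogDir);
      }
    });
  }
}
{code}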
[jira] [Updated] (YARN-3792) Test case failures in TestDistributedShell and some issue fixes related to ATSV2
[ https://issues.apache.org/jira/browse/YARN-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-3792: Attachment: YARN-3792-YARN-2928.003.patch Hi [~sjlee0], fixed the 2 nits you mentioned; it seems the test case failures are not related to this JIRA, will check again in the next run. Test case failures in TestDistributedShell and some issue fixes related to ATSV2 Key: YARN-3792 URL: https://issues.apache.org/jira/browse/YARN-3792 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Naganarasimha G R Assignee: Naganarasimha G R Attachments: YARN-3792-YARN-2928.001.patch, YARN-3792-YARN-2928.002.patch, YARN-3792-YARN-2928.003.patch # Encountered [test case failures|https://builds.apache.org/job/PreCommit-YARN-Build/8233/testReport/] which were happening even without the patch modifications in YARN-3044: TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow, TestDistributedShell.testDSShellWithoutDomainV2DefaultFlow, TestDistributedShellWithNodeLabels.testDSShellWithNodeLabelExpression # Remove unused {{enableATSV1}} in TestDistributedShell # Container metrics need to be published only for the v2 test cases of TestDistributedShell # A NullPointerException was thrown in TimelineClientImpl.constructResURI when the aux service was not configured and {{TimelineClient.putObjects}} was getting invoked. # Race condition between the publishing of application events and the test case verification of the RM's ApplicationFinished timeline events # Application tags are converted to lowercase in ApplicationSubmissionContextPBImpl, hence RMTimelineCollector was not able to detect the custom flow details of the app -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594424#comment-14594424 ] Jun Gong commented on YARN-3809: Attached a new patch to address [~jlowe]'s suggestions. Thanks for the review. Failed to launch new attempts because ApplicationMasterLauncher's threads all hang -- Key: YARN-3809 URL: https://issues.apache.org/jira/browse/YARN-3809 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3809.01.patch, YARN-3809.02.patch, YARN-3809.03.patch ApplicationMasterLauncher creates a thread pool of size 10 to handle AMLauncherEventType events (LAUNCH and CLEANUP). In our cluster, there were many NMs with 10+ AMs running on them, and one shut down for some reason. After the RM marked the NM as LOST, it cleaned up the AMs running on it, and ApplicationMasterLauncher had to handle these 10+ CLEANUP events. ApplicationMasterLauncher's thread pool filled up, and the threads all hung in containerMgrProxy.stopContainers(stopRequest) because the NM was down; the default RPC timeout is 15 mins. This means that for 15 mins ApplicationMasterLauncher could not handle new events such as LAUNCH, so new attempts failed to launch because of the timeout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
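A small illustrative sketch of the failure mode and one mitigation direction (an assumption, not the attached patch): with a fixed pool of 10 threads shared by LAUNCH and CLEANUP, CLEANUP calls against a dead NM can occupy every worker for the full RPC timeout; a configurable pool size keeps LAUNCH events flowing. The property name below is hypothetical.
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LauncherPoolSketch {
  public static void main(String[] args) throws InterruptedException {
    // Hypothetical knob; the point is that the size is no longer hard-coded to 10.
    int poolSize = Integer.getInteger("am.launcher.thread-count", 50);
    ExecutorService launcherPool = Executors.newFixedThreadPool(poolSize);

    // Stand-ins for CLEANUP calls that block on a dead NM until the RPC timeout.
    for (int i = 0; i < 15; i++) {
      launcherPool.submit(() -> {
        try {
          TimeUnit.SECONDS.sleep(1); // stand-in for a slow stopContainers() call
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    // With a larger pool, a LAUNCH event is picked up without waiting 15 minutes.
    launcherPool.submit(() -> System.out.println("LAUNCH handled without waiting"));
    launcherPool.shutdown();
    launcherPool.awaitTermination(1, TimeUnit.MINUTES);
  }
}
{code}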
[jira] [Commented] (YARN-3831) Localization failed when a local disk turns from bad to good without NM initializes it
[ https://issues.apache.org/jira/browse/YARN-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594431#comment-14594431 ] zhihai xu commented on YARN-3831: - Hi [~hex108], thanks for reporting this issue, What version is your code? YARN-3491 fixed a race condition when a local disk turns from bad to good. Localization failed when a local disk turns from bad to good without NM initializes it -- Key: YARN-3831 URL: https://issues.apache.org/jira/browse/YARN-3831 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong A local disk turns from bad to good without NM initializes it(create /path-to-local-dir/usercache and /path-to-local-dir/filecache). When localizing a container, container-executor will try to create directories under /path-to-local-dir/usercache, and it will fail. Then container's localization will fail. Related log is as following: {noformat} 2015-06-19 18:00:01,205 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_1431957472783_38706012_01_000465 2015-06-19 18:00:01,212 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /data8/yarnenv/local/nmPrivate/container_1431957472783_38706012_01_000465.tokens. Credentials list: 2015-06-19 18:00:01,216 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container container_1431957472783_38706012_01_000465 startLocalizer is : 20 org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:205) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:981) 2015-06-19 18:00:01,216 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : command provided 0 2015-06-19 18:00:01,216 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: main : user is tdwadmin 2015-06-19 18:00:01,216 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Failed to create directory /data2/yarnenv/local/usercache/tdwadmin - No such file or directory 2015-06-19 18:00:01,216 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Localizer failed java.io.IOException: Application application_1431957472783_38706012 initialization failed (exitCode=20) with output: main : command provided 0 main : user is tdwadmin Failed to create directory /data2/yarnenv/local/usercache/tdwadmin - No such file or directory at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:214) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:981) Caused by: org.apache.hadoop.util.Shell$ExitCodeException: at org.apache.hadoop.util.Shell.runCommand(Shell.java:464) at org.apache.hadoop.util.Shell.run(Shell.java:379) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589) at 
org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.startLocalizer(LinuxContainerExecutor.java:205) ... 1 more 2015-06-19 18:00:01,216 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1431957472783_38706012_01_000465 transitioned from LOCALIZING to LOCALIZATION_FAILED {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
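A minimal sketch, assuming the directory layout visible in the log above (usercache, filecache, nmPrivate), of what re-initializing a local dir that turned good again could look like; this is illustrative, not the NM's actual recovery code.
{code:java}
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class LocalDirInitSketch {
  // Ensure the per-disk directories exist before localization; otherwise
  // container-executor fails with "No such file or directory".
  static void ensureInitialized(String localDirRoot) throws IOException {
    List<File> required = Arrays.asList(
        new File(localDirRoot, "usercache"),
        new File(localDirRoot, "filecache"),
        new File(localDirRoot, "nmPrivate"));
    for (File dir : required) {
      if (!dir.isDirectory() && !dir.mkdirs()) {
        throw new IOException("Could not create " + dir);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Re-run initialization for every dir that just transitioned from bad to good.
    for (String dir : Arrays.asList("/tmp/yarn-local-demo/data2")) { // placeholder path
      ensureInitialized(dir);
    }
    System.out.println("local dirs initialized");
  }
}
{code}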
[jira] [Updated] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3809: --- Attachment: YARN-3809.03.patch Failed to launch new attempts because ApplicationMasterLauncher's threads all hang -- Key: YARN-3809 URL: https://issues.apache.org/jira/browse/YARN-3809 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3809.01.patch, YARN-3809.02.patch, YARN-3809.03.patch ApplicationMasterLauncher creates a thread pool of size 10 to handle AMLauncherEventType events (LAUNCH and CLEANUP). In our cluster, there were many NMs with 10+ AMs running on them, and one shut down for some reason. After the RM marked the NM as LOST, it cleaned up the AMs running on it, and ApplicationMasterLauncher had to handle these 10+ CLEANUP events. ApplicationMasterLauncher's thread pool filled up, and the threads all hung in containerMgrProxy.stopContainers(stopRequest) because the NM was down; the default RPC timeout is 15 mins. This means that for 15 mins ApplicationMasterLauncher could not handle new events such as LAUNCH, so new attempts failed to launch because of the timeout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster
[ https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594423#comment-14594423 ] Varun Saxena commented on YARN-3779: [~vinodkv], thats correct. So do you want me to raise another JIRA for that ? Or do it as part of this one only ? Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster -- Key: YARN-3779 URL: https://issues.apache.org/jira/browse/YARN-3779 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Environment: mrV2, secure mode Reporter: Zhang Wei Assignee: Varun Saxena Priority: Critical Attachments: YARN-3779.01.patch, YARN-3779.02.patch, log_aggr_deletion_on_refresh_error.log, log_aggr_deletion_on_refresh_fix.log {{GSSException}} is thrown everytime log aggregation deletion is attempted after executing bin/mapred hsadmin -refreshLogRetentionSettings in a secure cluster. The problem can be reproduced by following steps: 1. startup historyserver in secure cluster. 2. Log deletion happens as per expectation. 3. execute {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh the configuration value. 4. All the subsequent attempts of log deletion fail with {{GSSException}} Following exception can be found in historyserver's log if log deletion is enabled. {noformat} 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127 java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: vm-31/9.91.12.31; destination host is: vm-33:25000; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) at org.apache.hadoop.ipc.Client.call(Client.java:1414) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy9.getListing(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy10.getListing(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691) at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102) at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753) at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749) at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68) at java.util.TimerThread.mainLoop(Timer.java:555) at 
java.util.TimerThread.run(Timer.java:505) Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641) at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462) at
[jira] [Created] (YARN-3837) javadocs of TimelineAuthenticationFilterInitializer give wrong prefix for auth options
Steve Loughran created YARN-3837: Summary: javadocs of TimelineAuthenticationFilterInitializer give wrong prefix for auth options Key: YARN-3837 URL: https://issues.apache.org/jira/browse/YARN-3837 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.8.0 Reporter: Steve Loughran Priority: Minor The javadocs for {{TimelineAuthenticationFilterInitializer}} talk about the prefix {{yarn.timeline-service.authentication.}}, but the code uses {{yarn.timeline-service.http-authentication.}} as the prefix. Best to use {{@value}} and let the javadocs sort themselves out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
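A short sketch of the {{@value}} suggestion above (class and constant names are illustrative, not the actual initializer): the javadoc references the constant, so the documented prefix can never drift from the code.
{code:java}
public class TimelineAuthConstantsSketch {
  /**
   * Prefix for timeline HTTP authentication settings.
   * Documented as {@value #PREFIX} so the javadoc always matches the code.
   */
  public static final String PREFIX = "yarn.timeline-service.http-authentication.";
}
{code}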
[jira] [Moved] (YARN-3838) Rest API failing when ip configured in RM address in secure https mode
[ https://issues.apache.org/jira/browse/YARN-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt moved HADOOP-12096 to YARN-3838: - Component/s: (was: net) (was: security) security Key: YARN-3838 (was: HADOOP-12096) Project: Hadoop YARN (was: Hadoop Common) Rest API failing when ip configured in RM address in secure https mode -- Key: YARN-3838 URL: https://issues.apache.org/jira/browse/YARN-3838 Project: Hadoop YARN Issue Type: Bug Components: security Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Critical Attachments: 0001-HADOOP-12096.patch, 0001-YARN-3810.patch, 0002-YARN-3810.patch Steps to reproduce === 1. Configure hadoop.http.authentication.kerberos.principal as below {code:xml} <property> <name>hadoop.http.authentication.kerberos.principal</name> <value>HTTP/_h...@hadoop.com</value> </property> {code} 2. In the RM web address, also configure an IP 3. Start up the RM and call the REST API for the RM: {{curl -i -k --insecure --negotiate -u : https://<IP>/ws/v1/cluster/info}} *Actual* The REST API fails: {code} 2015-06-16 19:03:49,845 DEBUG org.apache.hadoop.security.authentication.server.AuthenticationFilter: Authentication exception: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:348) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:519) at org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
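For context on where the SPNEGO principal comes from, here is a hedged sketch (an assumption about the direction, not the attached patch): the {{_HOST}} placeholder must be expanded to a name the keytab actually contains, so when the web address is configured with an IP, expanding against the canonical host name of the bind address rather than the raw IP keeps the Kerberos lookup working. The realm and bind address below are placeholders.
{code:java}
import java.net.InetAddress;
import org.apache.hadoop.security.SecurityUtil;

public class SpnegoPrincipalSketch {
  public static void main(String[] args) throws Exception {
    String configuredPrincipal = "HTTP/_HOST@EXAMPLE.COM";     // placeholder realm
    InetAddress bindAddr = InetAddress.getByName("127.0.0.1"); // placeholder bind address
    // Expand _HOST using the canonical host name of the bind address,
    // not the raw IP string from the configuration.
    String principal =
        SecurityUtil.getServerPrincipal(configuredPrincipal, bindAddr.getCanonicalHostName());
    System.out.println(principal);
  }
}
{code}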
[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster
[ https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594783#comment-14594783 ] Hadoop QA commented on YARN-3779: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 56s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 46s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 52s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 28s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 55s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | mapreduce tests | 5m 53s | Tests passed in hadoop-mapreduce-client-hs. | | | | 43m 25s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12740836/YARN-3779.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 055cd5a | | hadoop-mapreduce-client-hs test log | https://builds.apache.org/job/PreCommit-YARN-Build/8301/artifact/patchprocess/testrun_hadoop-mapreduce-client-hs.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8301/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8301/console | This message was automatically generated. Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster -- Key: YARN-3779 URL: https://issues.apache.org/jira/browse/YARN-3779 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Environment: mrV2, secure mode Reporter: Zhang Wei Assignee: Varun Saxena Priority: Critical Attachments: YARN-3779.01.patch, YARN-3779.02.patch, YARN-3779.03.patch, log_aggr_deletion_on_refresh_error.log, log_aggr_deletion_on_refresh_fix.log {{GSSException}} is thrown everytime log aggregation deletion is attempted after executing bin/mapred hsadmin -refreshLogRetentionSettings in a secure cluster. The problem can be reproduced by following steps: 1. startup historyserver in secure cluster. 2. Log deletion happens as per expectation. 3. execute {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh the configuration value. 4. All the subsequent attempts of log deletion fail with {{GSSException}} Following exception can be found in historyserver's log if log deletion is enabled. 
{noformat} 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127 java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: vm-31/9.91.12.31; destination host is: vm-33:25000; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) at org.apache.hadoop.ipc.Client.call(Client.java:1414) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy9.getListing(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519) at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at
[jira] [Updated] (YARN-3838) Rest API failing when ip configured in RM address in secure https mode
[ https://issues.apache.org/jira/browse/YARN-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3838: --- Component/s: (was: security) webapp Rest API failing when ip configured in RM address in secure https mode -- Key: YARN-3838 URL: https://issues.apache.org/jira/browse/YARN-3838 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Critical Attachments: 0001-HADOOP-12096.patch, 0001-YARN-3810.patch, 0002-YARN-3810.patch Steps to reproduce === 1. Configure hadoop.http.authentication.kerberos.principal as below {code:xml} <property> <name>hadoop.http.authentication.kerberos.principal</name> <value>HTTP/_h...@hadoop.com</value> </property> {code} 2. In the RM web address, also configure an IP 3. Start up the RM and call the REST API for the RM: {{curl -i -k --insecure --negotiate -u : https://<IP>/ws/v1/cluster/info}} *Actual* The REST API fails: {code} 2015-06-16 19:03:49,845 DEBUG org.apache.hadoop.security.authentication.server.AuthenticationFilter: Authentication exception: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:348) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:519) at org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594746#comment-14594746 ] Hadoop QA commented on YARN-3809: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 48s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 39s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 52s | The applied patch generated 1 new checkstyle issues (total was 211, now 211). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 26s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. | | {color:red}-1{color} | yarn tests | 50m 45s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 98m 47s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12740804/YARN-3809.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / bcb3c40 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8299/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8299/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8299/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8299/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8299/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8299/console | This message was automatically generated. 
Failed to launch new attempts because ApplicationMasterLauncher's threads all hang -- Key: YARN-3809 URL: https://issues.apache.org/jira/browse/YARN-3809 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3809.01.patch, YARN-3809.02.patch, YARN-3809.03.patch ApplicationMasterLauncher creates a thread pool of size 10 to handle AMLauncherEventType events (LAUNCH and CLEANUP). In our cluster, there were many NMs with 10+ AMs running on them, and one shut down for some reason. After the RM marked the NM as LOST, it cleaned up the AMs running on it, and ApplicationMasterLauncher had to handle these 10+ CLEANUP events. ApplicationMasterLauncher's thread pool filled up, and the threads all hung in containerMgrProxy.stopContainers(stopRequest) because the NM was down; the default RPC timeout is 15 mins. This means that for 15 mins ApplicationMasterLauncher could not handle new events such as LAUNCH, so new attempts failed to launch because of the timeout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3837) javadocs of TimelineAuthenticationFilterInitializer give wrong prefix for auth options
[ https://issues.apache.org/jira/browse/YARN-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3837: --- Attachment: 0001-YARN-3837.patch Attaching a patch for the same. Please assign it to me if it's fine. javadocs of TimelineAuthenticationFilterInitializer give wrong prefix for auth options -- Key: YARN-3837 URL: https://issues.apache.org/jira/browse/YARN-3837 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.8.0 Reporter: Steve Loughran Priority: Minor Attachments: 0001-YARN-3837.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The javadocs for {{TimelineAuthenticationFilterInitializer}} talk about the prefix {{yarn.timeline-service.authentication.}}, but the code uses {{yarn.timeline-service.http-authentication.}} as the prefix. Best to use {{@value}} and let the javadocs sort themselves out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3838) Rest API failing when ip configured in RM address in secure https mode
[ https://issues.apache.org/jira/browse/YARN-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594801#comment-14594801 ] Bibin A Chundatt commented on YARN-3838: Typo in my earlier comment: {quote} As per the discussion till now we should handle in HttpServer2 {quote} It should read: as per the discussion till now, we should handle it in HttpServer2.builder. Rest API failing when ip configured in RM address in secure https mode -- Key: YARN-3838 URL: https://issues.apache.org/jira/browse/YARN-3838 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Critical Attachments: 0001-HADOOP-12096.patch, 0001-YARN-3810.patch, 0002-YARN-3810.patch Steps to reproduce === 1. Configure hadoop.http.authentication.kerberos.principal as below {code:xml} <property> <name>hadoop.http.authentication.kerberos.principal</name> <value>HTTP/_h...@hadoop.com</value> </property> {code} 2. In the RM web address, also configure an IP 3. Start up the RM and call the REST API for the RM: {{curl -i -k --insecure --negotiate -u : https://<IP>/ws/v1/cluster/info}} *Actual* The REST API fails: {code} 2015-06-16 19:03:49,845 DEBUG org.apache.hadoop.security.authentication.server.AuthenticationFilter: Authentication exception: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails) at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:348) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:519) at org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3792) Test case failures in TestDistributedShell and some issue fixes related to ATSV2
[ https://issues.apache.org/jira/browse/YARN-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594778#comment-14594778 ] Hadoop QA commented on YARN-3792: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 17m 30s | Findbugs (version ) appears to be broken on YARN-2928. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 56s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 57s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 40s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 46s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 42s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 6m 6s | The patch appears to introduce 8 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 8m 11s | Tests passed in hadoop-yarn-applications-distributedshell. | | {color:green}+1{color} | yarn tests | 2m 3s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 8s | Tests passed in hadoop-yarn-server-nodemanager. | | {color:red}-1{color} | yarn tests | 52m 43s | Tests failed in hadoop-yarn-server-resourcemanager. | | {color:green}+1{color} | yarn tests | 1m 22s | Tests passed in hadoop-yarn-server-timelineservice. 
| | | | 116m 36s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-applications-distributedshell | | FindBugs | module:hadoop-yarn-server-resourcemanager | | Failed unit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12740818/YARN-3792-YARN-2928.003.patch | | Optional Tests | javac unit findbugs checkstyle javadoc | | git revision | YARN-2928 / 8c036a1 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8300/artifact/patchprocess/whitespace.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8300/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-applications-distributedshell.html | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8300/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html | | hadoop-yarn-applications-distributedshell test log | https://builds.apache.org/job/PreCommit-YARN-Build/8300/artifact/patchprocess/testrun_hadoop-yarn-applications-distributedshell.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8300/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8300/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8300/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8300/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8300/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8300/console | This message was automatically generated. Test case failures in TestDistributedShell and some issue fixes related to ATSV2 Key: YARN-3792 URL: https://issues.apache.org/jira/browse/YARN-3792 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Naganarasimha G R Assignee: Naganarasimha G R Attachments: YARN-3792-YARN-2928.001.patch, YARN-3792-YARN-2928.002.patch, YARN-3792-YARN-2928.003.patch # encountered [testcase failures|https://builds.apache.org/job/PreCommit-YARN-Build/8233/testReport/] which was happening even without the patch modifications in YARN-3044
[jira] [Commented] (YARN-3838) Rest API failing when ip configured in RM address in secure https mode
[ https://issues.apache.org/jira/browse/YARN-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594737#comment-14594737 ] Bibin A Chundatt commented on YARN-3838: In the case of the resourcemanager, the HTTP server is started as below and the URL used is just the IP address {{WebApps#start}}
{code}
HttpServer2.Builder builder = new HttpServer2.Builder()
    .setName(name)
    .addEndpoint(
        URI.create(httpScheme + bindAddress + ":" + port))
    .setConf(conf)
    .setFindPort(findPort)
    .setACL(new AccessControlList(conf.get(
        YarnConfiguration.YARN_ADMIN_ACL,
        YarnConfiguration.DEFAULT_YARN_ADMIN_ACL)))
    .setPathSpec(pathList.toArray(new String[0]));
{code}
Comparing the same to the HDFS side, for the NameNode the URL is formed as below {{DFSUtil#httpServerTemplateForNNAndJN}}
{code}
URI uri = URI.create("http://" + NetUtils.getHostPortString(httpAddr));
{code}
Seems like this is the reason why there is a difference between HDFS and YARN in *REST API functionality when an IP is configured in Kerberos mode*. In the case of HDFS it works, but in YARN it doesn't. Can we change the RM HttpServer2.Builder as below?
{code}
HttpServer2.Builder builder = new HttpServer2.Builder()
    .setName(name)
    .addEndpoint(
        URI.create(httpScheme + NetUtils.getHostPortString(
            new InetSocketAddress(bindAddress, port))))
    .setConf(conf)
    .setFindPort(findPort)
    .setACL(
        new AccessControlList(conf.get(
            YarnConfiguration.YARN_ADMIN_ACL,
            YarnConfiguration.DEFAULT_YARN_ADMIN_ACL)))
    .setPathSpec(pathList.toArray(new String[0]));
{code}
Please do correct me if I am wrong. Rest API failing when ip configured in RM address in secure https mode -- Key: YARN-3838 URL: https://issues.apache.org/jira/browse/YARN-3838 Project: Hadoop YARN Issue Type: Bug Components: security Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Priority: Critical Attachments: 0001-HADOOP-12096.patch, 0001-YARN-3810.patch, 0002-YARN-3810.patch Steps to reproduce === 1. Configure hadoop.http.authentication.kerberos.principal as below
{code:xml}
<property>
  <name>hadoop.http.authentication.kerberos.principal</name>
  <value>HTTP/_h...@hadoop.com</value>
</property>
{code}
2. In the RM web address also configure the IP 3. Start up the RM and call the REST API for the RM: {{curl -i -k --insecure --negotiate -u : https://<IP>/ws/v1/cluster/info}} *Actual* Rest API failing
{code}
2015-06-16 19:03:49,845 DEBUG org.apache.hadoop.security.authentication.server.AuthenticationFilter: Authentication exception: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails)
org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos credentails)
  at org.apache.hadoop.security.authentication.server.KerberosAuthenticationHandler.authenticate(KerberosAuthenticationHandler.java:399)
  at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:348)
  at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:519)
  at org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
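To see why the two ways of building the endpoint behave differently in a Kerberized setup, here is a plain-JDK sketch that approximates the proposed change: it uses java.net.InetSocketAddress#getHostName (a reverse lookup when the configured address is an IP) in place of Hadoop's NetUtils.getHostPortString, and hard-codes sample values for illustration:
{code}
import java.net.InetSocketAddress;

public class EndpointHostSketch {
  public static void main(String[] args) {
    String bindAddress = "127.0.0.1";   // IP as configured for the RM web address
    int port = 8088;                    // sample port, illustration only

    // Current behaviour described above: the raw IP goes straight into the URI.
    String rawEndpoint = "https://" + bindAddress + ":" + port;

    // Proposed behaviour: resolve the address first, so the URI carries a
    // hostname that can match the Kerberos HTTP/<host> service principal.
    InetSocketAddress addr = new InetSocketAddress(bindAddress, port);
    String resolvedEndpoint = "https://" + addr.getHostName() + ":" + addr.getPort();

    System.out.println("raw:      " + rawEndpoint);
    System.out.println("resolved: " + resolvedEndpoint);
  }
}
{code}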
[jira] [Assigned] (YARN-3837) javadocs of TimelineAuthenticationFilterInitializer give wrong prefix for auth options
[ https://issues.apache.org/jira/browse/YARN-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt reassigned YARN-3837: -- Assignee: Bibin A Chundatt javadocs of TimelineAuthenticationFilterInitializer give wrong prefix for auth options -- Key: YARN-3837 URL: https://issues.apache.org/jira/browse/YARN-3837 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.8.0 Reporter: Steve Loughran Assignee: Bibin A Chundatt Priority: Minor Attachments: 0001-YARN-3837.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The javadocs for {{TimelineAuthenticationFilterInitializer}} talk about the prefix {{yarn.timeline-service.authentication.}}, but the code uses {{yarn.timeline-service.http-authentication.}} as the prefix. Best to use {{@value}} and let the javadocs sort it out for themselves. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3835) hadoop-yarn-server-resourcemanager test package bundles core-site.xml, yarn-site.xml
[ https://issues.apache.org/jira/browse/YARN-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594876#comment-14594876 ] Vamsee Yarlagadda commented on YARN-3835: - Manually verified tests.jar to confirm core-site.xml and yarn-site.xml are missing. hadoop-yarn-server-resourcemanager test package bundles core-site.xml, yarn-site.xml Key: YARN-3835 URL: https://issues.apache.org/jira/browse/YARN-3835 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Vamsee Yarlagadda Assignee: Vamsee Yarlagadda Priority: Minor Attachments: YARN-3835.patch It looks like by default YARN is bundling core-site.xml and yarn-site.xml in the test artifact of hadoop-yarn-server-resourcemanager, which means that any downstream project which uses this as a dependency can have a problem picking up the user-supplied or environment-supplied core-site.xml and yarn-site.xml. So we should ideally exclude these .xml files from being bundled into the test-jar. (Similar to YARN-1748) I also proactively looked at other YARN modules where this might be happening.
{code}
vamsee-MBP:hadoop-yarn-project vamsee$ find . -name *-site.xml
./hadoop-yarn/conf/yarn-site.xml
./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/resources/yarn-site.xml
./hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/resources/yarn-site.xml
./hadoop-yarn/hadoop-yarn-client/src/test/resources/core-site.xml
./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/core-site.xml
./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/core-site.xml
./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site.xml
./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/core-site.xml
./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/target/test-classes/yarn-site.xml
./hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/resources/core-site.xml
{code}
And out of these, only two modules (hadoop-yarn-server-resourcemanager, hadoop-yarn-server-tests) are building test-jars. In future, if we start building test-jars of other modules, we should exclude these xml files from being bundled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
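The downstream shadowing risk is easy to check: the sketch below (plain JDK, no Hadoop APIs) lists every core-site.xml visible on the classpath in resolution order; if the resourcemanager test-jar bundles one, it shows up here alongside, and possibly ahead of, the configuration file the user intended to supply.
{code}
import java.net.URL;
import java.util.Collections;

public class BundledSiteXmlCheck {
  public static void main(String[] args) throws Exception {
    // Every core-site.xml reachable on the classpath, in lookup order.
    for (URL url : Collections.list(
        BundledSiteXmlCheck.class.getClassLoader().getResources("core-site.xml"))) {
      System.out.println(url);
    }
  }
}
{code}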
[jira] [Updated] (YARN-3806) Proposal of Generic Scheduling Framework for YARN
[ https://issues.apache.org/jira/browse/YARN-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Shao updated YARN-3806: --- Attachment: ProposalOfGenericSchedulingFrameworkForYARN-V1.06.pdf Proposal of Generic Scheduling Framework for YARN - Key: YARN-3806 URL: https://issues.apache.org/jira/browse/YARN-3806 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Wei Shao Attachments: ProposalOfGenericSchedulingFrameworkForYARN-V1.05.pdf, ProposalOfGenericSchedulingFrameworkForYARN-V1.06.pdf Currently, a typical YARN cluster runs many different kinds of applications: production applications, ad hoc user applications, long running services and so on. Different YARN scheduling policies may be suitable for different applications. For example, capacity scheduling can manage production applications well since applications can get a guaranteed resource share, and fair scheduling can manage ad hoc user applications well since it can enforce fairness among users. However, the current YARN scheduling framework doesn't have a mechanism for multiple scheduling policies to work hierarchically in one cluster. YARN-3306 discussed many issues of today's YARN scheduling framework and proposed a per-queue policy-driven framework. In detail, it supported different scheduling policies for leaf queues. However, support of different scheduling policies for upper-level queues has not been seriously considered yet. A generic scheduling framework is proposed here to address these limitations. It supports different policies (fair, capacity, fifo and so on) for any queue consistently. The proposal tries to solve many other issues in the current YARN scheduling framework as well. Two newly proposed scheduling policies, YARN-3807 and YARN-3808, are based on the generic scheduling framework brought up in this proposal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3807) Proposal of Guaranteed Capacity Scheduling for YARN
[ https://issues.apache.org/jira/browse/YARN-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Shao updated YARN-3807: --- Attachment: ProposalOfGuaranteedCapacitySchedulingForYARN-V1.05.pdf Proposal of Guaranteed Capacity Scheduling for YARN --- Key: YARN-3807 URL: https://issues.apache.org/jira/browse/YARN-3807 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler Reporter: Wei Shao Attachments: ProposalOfGuaranteedCapacitySchedulingForYARN-V1.04.pdf, ProposalOfGuaranteedCapacitySchedulingForYARN-V1.05.pdf This proposal talks about the limitations of the YARN scheduling policies for SLA applications, and tries to solve them by YARN-3806 and a new scheduling policy called guaranteed capacity scheduling. Guaranteed capacity scheduling guarantees to applications that they can get resources under a specified capacity cap in a totally predictable manner. An application can meet its SLA more easily since it is self-contained in the shared cluster - external uncertainties are eliminated. For example, suppose queue A has an initial capacity of 100G memory, and there are two pending applications 1 and 2, where 1's specified capacity is 70G and 2's specified capacity is 50G. Queue A may accept application 1 to run first and guarantee that 1 can get resources exponentially up to its capacity and won't be preempted (if the allocation of 1 is 5G in scheduling cycle N, its demand is 80G, and the exponential factor is 2, then in N+1 it can get 5G, in N+2 it can get 10G, in N+3 it can get 20G, and in N+4 it can get 30G, reaching its capacity). Later, when the cluster is free, queue A may decide to scale up by increasing its capacity to 120G, so it can accept application 2 and make the same guarantee to it as well. Queue A can scale down to its initial capacity when any application completes. Guaranteed capacity scheduling also has other features that the example doesn't illustrate. See the proposal for more details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
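The ramp-up arithmetic in the example can be checked with a few lines. This is only one reading of the numbers in the description, assuming each figure is the additional grant per cycle and the final grant is capped at the remaining capacity; the actual policy is defined in the attached proposal.
{code}
public class ExponentialRampUpCheck {
  public static void main(String[] args) {
    int allocated = 5;    // 5G already held at cycle N
    int capacity = 70;    // application 1's specified capacity cap, in G
    int grant = 5;        // additional grant at cycle N+1, in G
    int factor = 2;       // exponential factor from the example

    for (int cycle = 1; allocated < capacity; cycle++) {
      int given = Math.min(grant, capacity - allocated);  // cap the last step
      allocated += given;
      System.out.printf("N+%d: +%dG -> %dG total%n", cycle, given, allocated);
      grant *= factor;
    }
    // Prints +5G, +10G, +20G, +30G, reaching the 70G capacity at N+4.
  }
}
{code}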
[jira] [Commented] (YARN-3792) Test case failures in TestDistributedShell and some issue fixes related to ATSV2
[ https://issues.apache.org/jira/browse/YARN-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14594821#comment-14594821 ] Naganarasimha G R commented on YARN-3792: - * The reported test case failure is not caused by this patch; YARN-3790 has already been raised to address it. * The whitespace issue is not caused by this patch. * The findbugs alert is incorrect; the report shows no issues. [~sjlee0], I think it's in a good state now! Test case failures in TestDistributedShell and some issue fixes related to ATSV2 Key: YARN-3792 URL: https://issues.apache.org/jira/browse/YARN-3792 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Naganarasimha G R Assignee: Naganarasimha G R Attachments: YARN-3792-YARN-2928.001.patch, YARN-3792-YARN-2928.002.patch, YARN-3792-YARN-2928.003.patch # encountered [testcase failures|https://builds.apache.org/job/PreCommit-YARN-Build/8233/testReport/] which were happening even without the patch modifications in YARN-3044: TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow TestDistributedShell.testDSShellWithoutDomainV2DefaultFlow TestDistributedShellWithNodeLabels.testDSShellWithNodeLabelExpression # Remove unused {{enableATSV1}} in TestDistributedShell # Container metrics need to be published only for v2 test cases of TestDistributedShell # NullPointerException was thrown in TimelineClientImpl.constructResURI when the aux service was not configured and {{TimelineClient.putObjects}} was getting invoked. # Race condition between the application events being published and the test case verification of the RM's ApplicationFinished timeline events # Application tags were converted to lowercase in ApplicationSubmissionContextPBImpl, hence RMTimelineCollector was not able to detect the custom flow details of the app -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3808) Proposal of Time Extended Fair Scheduling for YARN
[ https://issues.apache.org/jira/browse/YARN-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Shao updated YARN-3808: --- Attachment: ProposalOfTimeExtendedFairSchedulingForYARN-V1.03.pdf Proposal of Time Extended Fair Scheduling for YARN -- Key: YARN-3808 URL: https://issues.apache.org/jira/browse/YARN-3808 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler, scheduler Reporter: Wei Shao Attachments: ProposalOfTimeBasedFairSchedulingForYARN-V1.02.pdf, ProposalOfTimeExtendedFairSchedulingForYARN-V1.03.pdf This proposal talks about the issues of the YARN fair scheduling policy, and tries to solve them by YARN-3806 and a new scheduling policy called time extended fair scheduling. The time extended fair scheduling policy is proposed to enforce fairness over time among users. For example, if two users share the cluster weekly, each user's fair share is half of the cluster per week. In a particular week, if the first user has used the whole cluster for the first half of the week, then in the second half of the week the second user will always have priority to use cluster resources, since the first user has already used up its time extended fair share of the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
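A minimal sketch of the bookkeeping behind the weekly example; the accounting unit (cluster fraction multiplied by days) and the priority rule are assumptions made for illustration, not taken from the attached proposal.
{code}
public class TimeExtendedFairShareSketch {
  public static void main(String[] args) {
    double windowDays = 7.0;                  // weekly accounting window
    double fairShare = 0.5 * windowDays;      // each of two users: half the cluster-week

    double user1Usage = 1.0 * 3.5;            // whole cluster for the first half of the week
    double user2Usage = 0.0;                  // nothing used yet

    double user1Remaining = fairShare - user1Usage;   // 0.0 cluster-days left
    double user2Remaining = fairShare - user2Usage;   // 3.5 cluster-days left

    // Whoever has more time-extended share remaining gets priority next.
    String next = user2Remaining > user1Remaining ? "user 2" : "user 1";
    System.out.printf("remaining: user1=%.1f, user2=%.1f cluster-days; priority: %s%n",
        user1Remaining, user2Remaining, next);
  }
}
{code}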
[jira] [Updated] (YARN-3808) Proposal of Time Extended Fair Scheduling for YARN
[ https://issues.apache.org/jira/browse/YARN-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Shao updated YARN-3808: --- Summary: Proposal of Time Extended Fair Scheduling for YARN (was: Proposal of Time Based Fair Scheduling for YARN) Proposal of Time Extended Fair Scheduling for YARN -- Key: YARN-3808 URL: https://issues.apache.org/jira/browse/YARN-3808 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler, scheduler Reporter: Wei Shao Attachments: ProposalOfTimeBasedFairSchedulingForYARN-V1.02.pdf This proposal talks about the issues of the YARN fair scheduling policy, and tries to solve them by YARN-3806 and a new scheduling policy called time based fair scheduling. The time based fair scheduling policy is proposed to enforce time-based fairness among users. For example, if two users share the cluster weekly, each user's fair share is half of the cluster per week. In a particular week, if the first user has used the whole cluster for the first half of the week, then in the second half of the week the second user will always have priority to use cluster resources, since the first user has already used up its fair share of the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3808) Proposal of Time Extended Fair Scheduling for YARN
[ https://issues.apache.org/jira/browse/YARN-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Shao updated YARN-3808: --- Attachment: (was: ProposalOfTimeBasedFairSchedulingForYARN-V1.02.pdf) Proposal of Time Extended Fair Scheduling for YARN -- Key: YARN-3808 URL: https://issues.apache.org/jira/browse/YARN-3808 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler, scheduler Reporter: Wei Shao Attachments: ProposalOfTimeExtendedFairSchedulingForYARN-V1.03.pdf This proposal talks about the issues of the YARN fair scheduling policy, and tries to solve them by YARN-3806 and a new scheduling policy called time extended fair scheduling. The time extended fair scheduling policy is proposed to enforce fairness over time among users. For example, if two users share the cluster weekly, each user's fair share is half of the cluster per week. In a particular week, if the first user has used the whole cluster for the first half of the week, then in the second half of the week the second user will always have priority to use cluster resources, since the first user has already used up its time extended fair share of the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3808) Proposal of Time Extended Fair Scheduling for YARN
[ https://issues.apache.org/jira/browse/YARN-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Shao updated YARN-3808: --- Description: This proposal talks about the issues of the YARN fair scheduling policy, and tries to solve them by YARN-3806 and a new scheduling policy called time extended fair scheduling. The time extended fair scheduling policy is proposed to enforce fairness over time among users. For example, if two users share the cluster weekly, each user's fair share is half of the cluster per week. In a particular week, if the first user has used the whole cluster for the first half of the week, then in the second half of the week the second user will always have priority to use cluster resources, since the first user has already used up its time extended fair share of the cluster. was: This proposal talks about the issues of the YARN fair scheduling policy, and tries to solve them by YARN-3806 and a new scheduling policy called time based fair scheduling. The time based fair scheduling policy is proposed to enforce time-based fairness among users. For example, if two users share the cluster weekly, each user's fair share is half of the cluster per week. In a particular week, if the first user has used the whole cluster for the first half of the week, then in the second half of the week the second user will always have priority to use cluster resources, since the first user has already used up its fair share of the cluster. Proposal of Time Extended Fair Scheduling for YARN -- Key: YARN-3808 URL: https://issues.apache.org/jira/browse/YARN-3808 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler, scheduler Reporter: Wei Shao Attachments: ProposalOfTimeBasedFairSchedulingForYARN-V1.02.pdf, ProposalOfTimeExtendedFairSchedulingForYARN-V1.03.pdf This proposal talks about the issues of the YARN fair scheduling policy, and tries to solve them by YARN-3806 and a new scheduling policy called time extended fair scheduling. The time extended fair scheduling policy is proposed to enforce fairness over time among users. For example, if two users share the cluster weekly, each user's fair share is half of the cluster per week. In a particular week, if the first user has used the whole cluster for the first half of the week, then in the second half of the week the second user will always have priority to use cluster resources, since the first user has already used up its time extended fair share of the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)