[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594783#comment-14594783
 ] 

Hadoop QA commented on YARN-3779:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 15m 56s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 46s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 52s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 28s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 55s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | mapreduce tests | 5m 53s | Tests passed in hadoop-mapreduce-client-hs. |
| | | 43m 25s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12740836/YARN-3779.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 055cd5a |
| hadoop-mapreduce-client-hs test log | https://builds.apache.org/job/PreCommit-YARN-Build/8301/artifact/patchprocess/testrun_hadoop-mapreduce-client-hs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8301/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8301/console |


This message was automatically generated.

> Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings 
> in secure cluster
> --
>
> Key: YARN-3779
> URL: https://issues.apache.org/jira/browse/YARN-3779
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.0
> Environment: mrV2, secure mode
> Reporter: Zhang Wei
> Assignee: Varun Saxena
> Priority: Critical
> Attachments: YARN-3779.01.patch, YARN-3779.02.patch, YARN-3779.03.patch, log_aggr_deletion_on_refresh_error.log, log_aggr_deletion_on_refresh_fix.log
>
>
> {{GSSException}} is thrown every time aggregated log deletion is attempted 
> after executing {{bin/mapred hsadmin -refreshLogRetentionSettings}} in a secure 
> cluster.
> The problem can be reproduced with the following steps:
> 1. Start the historyserver in a secure cluster.
> 2. Log deletion happens as expected.
> 3. Execute the {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh 
> the configuration value.
> 4. All subsequent log deletion attempts fail with {{GSSException}}.
> The following exception can be found in the historyserver's log if log deletion is 
> enabled.
> {noformat}
> 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127
> java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; destination host is: "vm-33":25000; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1414)
> at org.apache.hadoop.ipc.Client.call(Client.java:1363)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.getListing(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
> at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Met

[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594551#comment-14594551
 ] 

Varun Saxena commented on YARN-3779:


I have attached and submitted a patch that fixes both cases. This JIRA should 
move to MAPREDUCE, but I am not moving it yet because I am not sure whether 
Jenkins will be able to post results for the submitted patch after the move.


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594423#comment-14594423
 ] 

Varun Saxena commented on YARN-3779:


[~vinodkv], that's correct.
Do you want me to raise another JIRA for that, or handle it as part of this 
one?


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-19 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593855#comment-14593855
 ] 

Vinod Kumar Vavilapalli commented on YARN-3779:
---

[~varun_saxena], I agree with Zhijie here. We may just be lucky for now with the 
refreshJobRetentionSettings call, depending on how we spawn threads. To 
future-proof ourselves, I think the right behaviour is to simply depend on 
loginUser in both cases.


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593738#comment-14593738
 ] 

Varun Saxena commented on YARN-3779:


Will update the patch as per suggestions tomorrow morning.

> Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings 
> in secure cluster
> --
>
> Key: YARN-3779
> URL: https://issues.apache.org/jira/browse/YARN-3779
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.0
> Environment: mrV2, secure mode
> Reporter: Zhang Wei
> Assignee: Varun Saxena
> Priority: Critical
> Attachments: YARN-3779.01.patch, YARN-3779.02.patch, log_aggr_deletion_on_refresh_error.log, log_aggr_deletion_on_refresh_fix.log
>
> {{GSSException}} is thrown every time aggregated log deletion is attempted 
> after executing {{bin/mapred hsadmin -refreshLogRetentionSettings}} in a secure 
> cluster.
> The problem can be reproduced with the following steps:
> 1. Start the historyserver in a secure cluster.
> 2. Log deletion happens as expected.
> 3. Execute the {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh 
> the configuration value.
> 4. All subsequent log deletion attempts fail with {{GSSException}}.
> The following exception can be found in the historyserver's log if log deletion is 
> enabled.
> {noformat}
> 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127
> java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; destination host is: "vm-33":25000; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1414)
> at org.apache.hadoop.ipc.Client.call(Client.java:1363)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.getListing(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
> at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy10.getListing(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750)
> at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691)
> at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
> at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753)
> at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749)
> at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
> at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
> at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
> at org.apache.hadoop.ipc.Client.getConnection(Client.jav

[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593386#comment-14593386
 ] 

Varun Saxena commented on YARN-3779:


[~vinodkv], [~zjshen],
I had also checked {{refreshJobRetentionSettings}} when this issue was 
reported, and the problem did not occur there.
The issue occurs with {{refreshLogRetentionSettings}} because a new thread is 
started (after the old {{Timer}} is cancelled), and that thread creates a new 
DFS client to connect to the namenode.

In the case of {{refreshJobRetentionSettings}}, we use a 
{{ScheduledThreadPoolExecutor}} instead, so no new thread is spawned on 
refresh. We simply cancel the {{ScheduledFuture}}, and the issue does not 
occur.
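
The difference in thread behaviour described above can be illustrated with a small standalone sketch. It uses only java.util classes and no Hadoop code, and the names {{RefreshThreadDemo}}, {{runOnNewTimer}} and {{runOnPool}} are hypothetical. It only shows that every new {{Timer}} runs its task on a fresh thread, whereas a {{ScheduledThreadPoolExecutor}} keeps its worker thread across a cancel and reschedule. It does not reproduce the Kerberos failure itself.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class RefreshThreadDemo {

    /** Creates a fresh Timer, runs one task on it, cancels it, and returns the thread used. */
    public static Thread runOnNewTimer() throws InterruptedException {
        final AtomicReference<Thread> seen = new AtomicReference<>();
        final CountDownLatch done = new CountDownLatch(1);
        Timer timer = new Timer(true); // daemon timer thread
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                seen.set(Thread.currentThread());
                done.countDown();
            }
        }, 0L);
        done.await();
        timer.cancel(); // a "refresh" would cancel this Timer and build a new one
        return seen.get();
    }

    /** Schedules a periodic task on an existing pool, cancels it after the first run, returns the thread used. */
    public static Thread runOnPool(ScheduledThreadPoolExecutor pool) throws InterruptedException {
        final AtomicReference<Thread> seen = new AtomicReference<>();
        final CountDownLatch done = new CountDownLatch(1);
        ScheduledFuture<?> future = pool.scheduleAtFixedRate(() -> {
            seen.compareAndSet(null, Thread.currentThread());
            done.countDown();
        }, 0, 1, TimeUnit.HOURS);
        done.await();
        future.cancel(false); // a "refresh" only cancels the future; the pool survives
        return seen.get();
    }

    /** True if two successive Timers ran their tasks on the same thread. */
    public static boolean timerReusesThread() throws InterruptedException {
        return runOnNewTimer() == runOnNewTimer();
    }

    /** True if the pool ran both scheduled tasks on the same worker thread. */
    public static boolean executorReusesThread() throws InterruptedException {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
        try {
            return runOnPool(pool) == runOnPool(pool);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Timer reuses thread: " + timerReusesThread());       // false
        System.out.println("Executor reuses thread: " + executorReusesThread()); // true
    }
}
```

Whether the newly created thread ends up with a different security context depends on where it is created, which is the point debated in the other comments on this issue; the sketch makes no claim about that.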


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-18 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592737#comment-14592737
 ] 

Zhijie Shen commented on YARN-3779:
---

Thanks for helping with the issue, Vinod! That sounds like the right cause. I 
checked refreshJobRetentionSettings, which should have the same problem because 
it accesses HDFS too.

I think it is clearer to fix the problem inside HSAdminServer. We still 
need to cache the correct loginUGI. Then, inside HSAdminServer, once we have 
verified the user's permission for a given command, we use loginUGI instead of 
the remote user to complete the rest of the processing. Thoughts?
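
The shape of this proposal can be sketched in plain Java. This is not the HSAdminServer code and it does not use the Hadoop {{UserGroupInformation}} API: the thread-local identity, {{runAs}}, {{AdminServer}} and the user names are all hypothetical stand-ins, chosen only to show the order of operations (check the remote caller's permission first, then do the actual work as the cached login identity).

```java
import java.util.Set;
import java.util.concurrent.Callable;

public class LoginUserSketch {

    // Hypothetical stand-in for "the user the current thread is acting as".
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    public static String currentUser() {
        return CURRENT.get();
    }

    /** Runs the action as the given user and restores the previous identity afterwards. */
    public static <T> T runAs(String user, Callable<T> action) throws Exception {
        String previous = CURRENT.get();
        CURRENT.set(user);
        try {
            return action.call();
        } finally {
            CURRENT.set(previous);
        }
    }

    /** Hypothetical admin endpoint that caches the login identity at construction time. */
    public static final class AdminServer {
        private final String loginUser;   // cached once, like the proposed loginUGI
        private final Set<String> admins; // callers allowed to issue admin commands

        public AdminServer(String loginUser, Set<String> admins) {
            this.loginUser = loginUser;
            this.admins = admins;
        }

        public String refreshLogRetentionSettings() throws Exception {
            // Step 1: the permission check is made against the remote caller.
            String caller = currentUser();
            if (caller == null || !admins.contains(caller)) {
                throw new SecurityException("caller is not an admin: " + caller);
            }
            // Step 2: the work itself runs as the cached login identity.
            return runAs(loginUser, () -> "refreshed as " + currentUser());
        }
    }
}
```

With an admin caller, for example {{LoginUserSketch.runAs("admin", () -> server.refreshLogRetentionSettings())}} on a server constructed with login user "jhs", the refresh work runs as "jhs" while the caller's identity is restored once the call returns.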


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-15 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14587165#comment-14587165
 ] 

Zhijie Shen commented on YARN-3779:
---

[~varun_saxena], do you know why Kerberos authentication fails even though the 
UGI is still the same?

> Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings 
> in secure cluster
> --
>
> Key: YARN-3779
> URL: https://issues.apache.org/jira/browse/YARN-3779
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.0
> Environment: mrV2, secure mode
> Reporter: Zhang Wei
> Assignee: Varun Saxena
> Priority: Critical
> Attachments: YARN-3779.01.patch, YARN-3779.02.patch, 
> log_aggr_deletion_on_refresh_error.log, log_aggr_deletion_on_refresh_fix.log
>
>
> {{GSSException}} is thrown every time aggregated log deletion is attempted
> after executing {{bin/mapred hsadmin -refreshLogRetentionSettings}} in a secure
> cluster.
> The problem can be reproduced with the following steps:
> 1. Start the historyserver in a secure cluster.
> 2. Log deletion happens as expected.
> 3. Execute the {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh
> the configuration value.
> 4. All subsequent log deletion attempts fail with {{GSSException}}.
> The following exception can be found in the historyserver's log if log deletion is
> enabled.
> {noformat}
> 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this 
> deletion attempt is being aborted | AggregatedLogDeletionService.java:127
> java.io.IOException: Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; 
> destination host is: "vm-33":25000; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1414)
> at org.apache.hadoop.ipc.Client.call(Client.java:1363)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.getListing(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
> at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy10.getListing(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749)
> at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS 
> initiate failed [Caused by GSSException: No valid credentials provided 
> (Mechanism level: Failed to find any Kerberos tgt)]
> at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
> at 
> org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
> at org.apache.hadoo

[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581927#comment-14581927
 ] 

Varun Saxena commented on YARN-3779:


By updated I mean attached.


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581926#comment-14581926
 ] 

Varun Saxena commented on YARN-3779:


[~xgong], I have also updated the complete logs, one demonstrating the problem 
and the other demonstrating the fix (after the patch above has been applied). 
Moreover, this issue can also be fixed by using a {{ScheduledThreadPoolExecutor}} 
with one thread (which is recommended over {{Timer}} anyway), but as that fix 
wasn't directly related to the issue, I didn't submit it as a solution.
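
For readers unfamiliar with the alternative mentioned above, here is a minimal sketch of a single-threaded {{ScheduledThreadPoolExecutor}} that reschedules a periodic task when a setting is refreshed. It is not the attached patch and not the code of {{AggregatedLogDeletionService}}; the class {{PeriodicTaskRunner}} and its methods {{reschedule}}, {{stop}} and {{demo}} are hypothetical names, and it uses only the JDK (no Hadoop or Kerberos). The comment above does not say why this approach avoids the {{GSSException}}, so the sketch only shows the scheduling pattern: one long-lived executor, with the old task cancelled and a new one scheduled on refresh.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a periodic deletion service; not a Hadoop class. */
final class PeriodicTaskRunner {
    // One worker thread, created once and kept for the lifetime of the runner.
    private final ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
    private ScheduledFuture<?> current;

    /** Cancels the currently scheduled task (if any) and schedules the given one. */
    synchronized void reschedule(Runnable task, long periodMillis) {
        if (current != null) {
            current.cancel(false); // let an in-flight run finish, stop future runs
        }
        current = scheduler.scheduleAtFixedRate(task, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the executor and its worker thread. */
    void stop() {
        scheduler.shutdownNow();
    }

    /** Returns true if the task ran at least twice both before and after a "refresh". */
    static boolean demo() {
        PeriodicTaskRunner runner = new PeriodicTaskRunner();
        try {
            CountDownLatch beforeRefresh = new CountDownLatch(2);
            runner.reschedule(beforeRefresh::countDown, 10);
            boolean ranBefore = beforeRefresh.await(5, TimeUnit.SECONDS);

            // Simulated settings refresh: same executor, new task and interval.
            CountDownLatch afterRefresh = new CountDownLatch(2);
            runner.reschedule(afterRefresh::countDown, 10);
            boolean ranAfter = afterRefresh.await(5, TimeUnit.SECONDS);
            return ranBefore && ranAfter;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            runner.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println("ran before and after refresh: " + demo());
    }
}
```

The design difference from {{java.util.Timer}} is that a refresh here does not need to discard the scheduler and create a new one; only the scheduled task is replaced. Whether that alone explains the behaviour seen in a secure cluster is an assumption that would have to be checked against the attached logs.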


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581712#comment-14581712
 ] 

Varun Saxena commented on YARN-3779:


[~xgong], after applying the patch, the debug log on refreshing the log 
retention settings looks like the following. I will also update both the success 
and error logs a little later.

{noformat}
2015-06-11 14:49:56,973 DEBUG org.apache.hadoop.ipc.Server: Socket Reader #1 
for port 10033: responding to null from 10.19.92.82:30295 Call#-33 Retry#-1 
Wrote 22 bytes.
2015-06-11 14:49:56,981 DEBUG org.apache.hadoop.ipc.Server:  got #-3
2015-06-11 14:49:57,014 DEBUG org.apache.hadoop.ipc.Server: Successfully 
authorized userInfo {
  effectiveUser: "hdfs/hua...@hadoop.com"
}
protocol: "org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol"

2015-06-11 14:49:57,014 DEBUG org.apache.hadoop.ipc.Server:  got #0
2015-06-11 14:49:57,015 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 
0 on 10033: 
org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol.refreshLogRetentionSettings
 from 10.19.92.82:30295 Call#0 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER
2015-06-11 14:49:57,016 DEBUG org.apache.hadoop.security.UserGroupInformation: 
PrivilegedAction as:hdfs/hua...@hadoop.com (auth:KERBEROS) 
from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082)
2015-06-11 14:49:57,027 INFO 
org.apache.hadoop.mapreduce.v2.hs.server.HSAdminServer: HS Admin: 
refreshLogRetentionSettings invoked by user hdfs
2015-06-11 14:49:57,027 DEBUG org.apache.hadoop.ipc.Client: stopping client 
from cache: org.apache.hadoop.ipc.Client@2dfaea86
2015-06-11 14:49:57,079 DEBUG org.apache.hadoop.security.UserGroupInformation: 
PrivilegedAction as:hdfs/hua...@hadoop.com (auth:KERBEROS) 
from:org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:136)
2015-06-11 14:49:57,079 DEBUG org.apache.hadoop.yarn.ipc.YarnRPC: Creating 
YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
2015-06-11 14:49:57,079 DEBUG org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC: 
Creating a HadoopYarnProtoRpc proxy for protocol interface 
org.apache.hadoop.yarn.api.ApplicationClientProtocol
2015-06-11 14:49:57,080 DEBUG org.apache.hadoop.ipc.Client: getting client out 
of cache: org.apache.hadoop.ipc.Client@2dfaea86
2015-06-11 14:49:57,081 INFO 
org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated 
log deletion started.
2015-06-11 14:49:57,081 INFO org.apache.hadoop.mapreduce.v2.hs.HSAuditLogger: 
USER=hdfs IP=10.19.92.82  OPERATION=refreshLogRetentionSettings   
TARGET=HSAdminServer   RESULT=SUCCESS
2015-06-11 14:49:57,081 DEBUG org.apache.hadoop.security.UserGroupInformation: 
PrivilegedAction as:hdfs/hua...@hadoop.com (auth:KERBEROS) 
from:org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:83)
2015-06-11 14:49:57,081 DEBUG org.apache.hadoop.ipc.Server: Served: 
refreshLogRetentionSettings queueTime= 11 procesingTime= 55
2015-06-11 14:49:57,082 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 
0 on 10033: responding to 
org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol.refreshLogRetentionSettings
 from 10.19.92.82:30295 Call#0 Retry#0
2015-06-11 14:49:57,083 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 
0 on 10033: responding to 
org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol.refreshLogRetentionSettings
 from 10.19.92.82:30295 Call#0 Retry#0 Wrote 32 bytes.
2015-06-11 14:49:57,083 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(889891977) connection to /10.19.92.82:65110 from hdfs/hua...@hadoop.com 
sending #5
2015-06-11 14:49:57,084 DEBUG org.apache.hadoop.ipc.Client: IPC Client 
(889891977) connection to /10.19.92.82:65110 from hdfs/hua...@hadoop.com got 
value #5
2015-06-11 14:49:57,084 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: 
getListing took 1ms
2015-06-11 14:49:57,085 INFO 
org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated 
log deletion finished.
{noformat}


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-11 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581701#comment-14581701
 ] 

Varun Saxena commented on YARN-3779:


Sure. Will share DEBUG logs for that too.


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-10 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581430#comment-14581430
 ] 

Xuan Gong commented on YARN-3779:
-

[~varun_saxena] Thanks for the logs. Could you apply the patch and print the 
UGI?


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-09 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579194#comment-14579194
 ] 

Varun Saxena commented on YARN-3779:


Sorry, the correct sequence of error logs is as follows. After the first 
{{GSSException}}, the client (i.e. the historyserver) keeps retrying before giving up.

{noformat}
2015-06-05 22:49:24,541 INFO Timer-3  
org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated 
log deletion started.
2015-06-05 22:49:24,541 INFO IPC Server handler 0 on 10033  
org.apache.hadoop.mapreduce.v2.hs.HSAuditLogger: USER=hdfs  IP=10.19.92.82  
OPERATION=refreshLogRetentionSettings   TARGET=HSAdminServer   RESULT=SUCCESS
2015-06-05 22:49:24,550 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.use.legacy.blockreader.local = false
2015-06-05 22:49:24,550 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.read.shortcircuit = false
2015-06-05 22:49:24,550 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.domain.socket.data.traffic = false
2015-06-05 22:49:24,550 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.domain.socket.path = 
2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.DFSClient: Sets 
dfs.client.block.write.replace-datanode-on-failure.replication to 0
2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.HAUtil: No HA 
service delegation token found for logical URI hdfs://hacluster
2015-06-05 22:49:24,552 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.use.legacy.blockreader.local = false
2015-06-05 22:49:24,552 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.read.shortcircuit = false
2015-06-05 22:49:24,552 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.domain.socket.data.traffic = false
2015-06-05 22:49:24,552 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.domain.socket.path = 
2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.io.retry.RetryUtils: 
multipleLinearRandomRetry = null
2015-06-05 22:49:24,553 DEBUG Timer-3  org.apache.hadoop.ipc.Client: getting 
client out of cache: org.apache.hadoop.ipc.Client@28194a50
2015-06-05 22:49:24,554 DEBUG Timer-3  
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil: 
DataTransferProtocol using SaslPropertiesResolver, configured QOP 
dfs.data.transfer.protection = authentication, configured class 
dfs.data.transfer.saslproperties.resolver.class = class 
org.apache.hadoop.security.SaslPropertiesResolver
2015-06-05 22:49:24,554 DEBUG Timer-3  org.apache.hadoop.ipc.Client: The ping 
interval is 6 ms.
2015-06-05 22:49:24,554 DEBUG Timer-3  org.apache.hadoop.ipc.Client: Connecting 
to /10.19.92.88:65110
2015-06-05 22:49:24,555 DEBUG Timer-3  
org.apache.hadoop.security.UserGroupInformation: PrivilegedAction 
as:hdfs/hua...@hadoop.com (auth:KERBEROS) 
from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:749)
2015-06-05 22:49:24,557 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: Get kerberos info proto:interface 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB 
info:@org.apache.hadoop.security.KerberosInfo(clientPrincipal=, 
serverPrincipal=dfs.namenode.kerberos.principal)
2015-06-05 22:49:24,557 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: getting serverKey: 
dfs.namenode.kerberos.principal conf value: hdfs/hua...@hadoop.com principal: 
hdfs/hua...@hadoop.com
2015-06-05 22:49:24,557 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: RPC Server's Kerberos principal name 
for protocol=org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB is 
hdfs/hua...@hadoop.com
2015-06-05 22:49:24,557 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: Creating SASL GSSAPI(KERBEROS)  
client to authenticate to service at huawei
2015-06-05 22:49:24,558 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: Use KERBEROS authentication for 
protocol ClientNamenodeProtocolPB
2015-06-05 22:49:24,559 DEBUG Timer-3  
org.apache.hadoop.security.UserGroupInformation: PrivilegedActionException 
as:hdfs/hua...@hadoop.com (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
2015-06-05 22:49:24,560 DEBUG Timer-3  
org.apache.hadoop.security.UserGroupInformation: PrivilegedAction 
as:hdfs/hua...@hadoop.com (auth:KERBEROS) 
from:org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:668)
2015-06-05 22:49:24,561 WARN Timer-3  org.apache.hadoop.ipc.Client: Exception 
encountered while connecting to the server : javax.security.sasl.Sasl

[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-09 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579154#comment-14579154
 ] 

Varun Saxena commented on YARN-3779:


[~zjshen], the GSSException was thrown while calling {{evaluateChallenge}} in 
SaslRpcClient.java.
I printed the DEBUG logs on the history server side when I tested this. The 
correct UGI appears to be used, but the error still occurs.

Below are the logs from when the error occurs after refreshing the log retention settings.
{noformat}
2015-06-05 22:49:24,541 INFO IPC Server handler 0 on 10033  
org.apache.hadoop.mapreduce.v2.hs.HSAuditLogger: USER=hdfs  IP=10.19.92.82  
OPERATION=refreshLogRetentionSettings   TARGET=HSAdminServer  RESULT=SUCCESS
...
2015-06-05 22:50:04,541 INFO Timer-3  
org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated 
log deletion started.
2015-06-05 22:49:24,550 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.use.legacy.blockreader.local = false
2015-06-05 22:49:24,550 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.read.shortcircuit = false
2015-06-05 22:49:24,550 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.domain.socket.data.traffic = false
2015-06-05 22:49:24,550 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.domain.socket.path = 
2015-06-05 22:49:24,550 DEBUG Timer-3  org.apache.hadoop.hdfs.DFSClient: Sets 
dfs.client.block.write.replace-datanode-on-failure.replication to 0
2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.hdfs.HAUtil: No HA 
service delegation token found for logical URI hdfs://hacluster
2015-06-05 22:49:24,552 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.use.legacy.blockreader.local = false
2015-06-05 22:49:24,552 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.read.shortcircuit = false
2015-06-05 22:49:24,552 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.client.domain.socket.data.traffic = false
2015-06-05 22:49:24,552 DEBUG Timer-3  
org.apache.hadoop.hdfs.client.impl.DfsClientConf$ShortCircuitConf: 
dfs.domain.socket.path = 
2015-06-05 22:49:24,552 DEBUG Timer-3  org.apache.hadoop.io.retry.RetryUtils: 
multipleLinearRandomRetry = null
2015-06-05 22:49:24,553 DEBUG Timer-3  org.apache.hadoop.ipc.Client: getting 
client out of cache: org.apache.hadoop.ipc.Client@28194a50
2015-06-05 22:49:24,554 DEBUG Timer-3  
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil: 
DataTransferProtocol using SaslPropertiesResolver, configured QOP 
dfs.data.transfer.protection = authentication, configured class 
dfs.data.transfer.saslproperties.resolver.class = class 
org.apache.hadoop.security.SaslPropertiesResolver
2015-06-05 22:49:24,554 DEBUG Timer-3  org.apache.hadoop.ipc.Client: The ping 
interval is 6 ms.
2015-06-05 22:50:04,542 DEBUG Timer-3  org.apache.hadoop.ipc.Client: Connecting 
to host-10-19-92-88/10.19.92.88:65110
2015-06-05 22:50:04,543 DEBUG Timer-3  
org.apache.hadoop.security.UserGroupInformation: PrivilegedAction 
as:hdfs/hua...@hadoop.com (auth:KERBEROS) 
from:org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:749)
2015-06-05 22:50:04,544 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: Get kerberos info proto:interface 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB 
info:@org.apache.hadoop.security.KerberosInfo(clientPrincipal=, 
serverPrincipal=dfs.namenode.kerberos.principal)
2015-06-05 22:50:04,545 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: getting serverKey: 
dfs.namenode.kerberos.principal conf value: hdfs/hua...@hadoop.com principal: 
hdfs/hua...@hadoop.com
2015-06-05 22:50:04,545 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: RPC Server's Kerberos principal name 
for protocol=org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolPB is 
hdfs/hua...@hadoop.com
2015-06-05 22:50:04,545 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: Creating SASL GSSAPI(KERBEROS)  
client to authenticate to service at huawei
2015-06-05 22:50:04,546 DEBUG Timer-3  
org.apache.hadoop.security.SaslRpcClient: Use KERBEROS authentication for 
protocol ClientNamenodeProtocolPB
2015-06-05 22:50:04,547 DEBUG Timer-3  
org.apache.hadoop.security.UserGroupInformation: PrivilegedActionException 
as:hdfs/hua...@hadoop.com (auth:KERBEROS) 
cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
at 
com.sun.security.sasl.gsskerb.Gs

[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-08 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577709#comment-14577709
 ] 

Zhijie Shen commented on YARN-3779:
---

No, I didn't simulate the problem; I just had a quick glance at the code. The log 
retention refresh reschedules the deletion task, but this is done inside the 
RPC call made by the requesting user. So I'm now wondering whether this changes the 
UGI of the subsequent deletion task. Can you try printing the UGI? Then we can see 
what has changed.
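
The concern above, that a task rescheduled from inside an RPC call may end up running with the caller's context, can be illustrated with a JDK-only sketch. This is an analogy, not Hadoop code: {{CONTEXT}} is a hypothetical {{InheritableThreadLocal}} standing in for the security context a thread carries, and the class and method names are made up for illustration. Like that stand-in, a new {{java.util.Timer}} thread is created on the calling thread and copies inheritable state at creation time.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class InheritedCallerContext {

    // Hypothetical stand-in for the per-thread security context (NOT Hadoop's UGI).
    static final InheritableThreadLocal<String> CONTEXT = new InheritableThreadLocal<>();

    // Creates a new Timer on the calling thread and returns the context its task observes.
    static String contextSeenByNewTimerTask() {
        AtomicReference<String> seen = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        Timer timer = new Timer(true); // the timer thread is created here, by the caller
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                seen.set(CONTEXT.get());
                done.countDown();
            }
        }, 0L);
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        timer.cancel();
        return seen.get();
    }

    // Simulates an RPC handler thread that performs the reschedule as the request user.
    static String rescheduleFromHandler(String requestUser) {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread handler = new Thread(() -> {
            CONTEXT.set(requestUser);
            seen.set(contextSeenByNewTimerTask());
        });
        handler.start();
        try {
            handler.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        CONTEXT.set("login-user"); // context at service start-up
        System.out.println("scheduled at start-up: " + contextSeenByNewTimerTask());
        System.out.println("rescheduled by handler: " + rescheduleFromHandler("request-user"));
    }
}
```

In this sketch the task scheduled at start-up sees the start-up context, while the task scheduled from the handler thread sees the handler's context. Whether the same thing happens to the UGI in the history server is exactly what printing the UGI would confirm.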

> Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings 
> in secure cluster
> --
>
> Key: YARN-3779
> URL: https://issues.apache.org/jira/browse/YARN-3779
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
> Environment: mrV2, secure mode
>Reporter: Zhang Wei
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: YARN-3779.01.patch, YARN-3779.02.patch
>
>
> {{GSSException}} is thrown every time aggregated log deletion is attempted 
> after executing {{bin/mapred hsadmin -refreshLogRetentionSettings}} in a secure 
> cluster.
> The problem can be reproduced with the following steps:
> 1. Start the history server in a secure cluster.
> 2. Log deletion happens as expected. 
> 3. Execute the {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh 
> the configuration value.
> 4. All subsequent log deletion attempts fail with {{GSSException}}.
> The following exception can be found in the history server's log if log deletion is 
> enabled. 
> {noformat}
> 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this 
> deletion attempt is being aborted | AggregatedLogDeletionService.java:127
> java.io.IOException: Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; 
> destination host is: "vm-33":25000; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1414)
> at org.apache.hadoop.ipc.Client.call(Client.java:1363)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.getListing(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
> at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy10.getListing(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749)
> at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS 
> initiate failed [Caused by GSSException: No valid credentials provided 
> (Mechanism level: Failed to find any Kerberos tgt)]
> at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
> at 
> org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640)
> at 
> org.apache.hadoop.ipc.Client$Connecti

[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-08 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577570#comment-14577570
 ] 

Varun Saxena commented on YARN-3779:


[~zjshen], thanks for looking at this.
It's the same user that is used both for starting the history server and for 
executing the refresh command.
On refresh, Timer creates a new thread, and from then on the problem occurs.

There is no problem if I use a ScheduledThreadPoolExecutor (with 1 thread) 
instead, as that doesn't spawn a new thread.
So it seems the new thread doesn't pick up the correct UGI.

Are you able to reproduce the issue?
I hope there is no issue with the way Kerberos has been set up in my cluster.
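
The threading difference described above can be checked with a small JDK-only sketch. It assumes the refresh path builds a fresh {{java.util.Timer}} for each reschedule, whereas a {{ScheduledThreadPoolExecutor}} with one core thread is created once and reused. The class and method names are hypothetical and nothing here is taken from the Hadoop source.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class RescheduleThreads {

    // Waits for the task to finish; interruption is surfaced as an unchecked exception.
    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Runs one task on a brand-new Timer and returns the thread that executed it.
    static Thread runOnNewTimer() {
        AtomicReference<Thread> ranOn = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        Timer timer = new Timer(true); // every Timer instance starts its own thread
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                ranOn.set(Thread.currentThread());
                done.countDown();
            }
        }, 0L);
        await(done);
        timer.cancel();
        return ranOn.get();
    }

    // Runs one task on an existing pool and returns the thread that executed it.
    static Thread runOnPool(ScheduledThreadPoolExecutor pool) {
        AtomicReference<Thread> ranOn = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        pool.schedule(() -> {
            ranOn.set(Thread.currentThread());
            done.countDown();
        }, 0L, TimeUnit.MILLISECONDS);
        await(done);
        return ranOn.get();
    }

    public static void main(String[] args) {
        Thread t1 = runOnNewTimer();
        Thread t2 = runOnNewTimer(); // the "reschedule" creates a second Timer
        System.out.println("Timer reused its thread: " + (t1 == t2));

        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
        Thread e1 = runOnPool(pool);
        Thread e2 = runOnPool(pool); // the "reschedule" goes to the same pool
        System.out.println("Executor reused its thread: " + (e1 == e2));
        pool.shutdown();
    }
}
```

With a single core thread the executor keeps its worker alive between tasks, so the rescheduled task runs on the thread that was created at the first schedule, while each new Timer brings a new thread created by whoever called the constructor.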


[jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster

2015-06-08 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577527#comment-14577527
 ] 

Zhijie Shen commented on YARN-3779:
---

So the problem is that, after refreshing, the deletion task is scheduled and executed 
with the UGI of whoever ran the refresh command, right?

> Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings 
> in secure cluster
> --
>
> Key: YARN-3779
> URL: https://issues.apache.org/jira/browse/YARN-3779
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
> Environment: mrV2, secure mode
>Reporter: Zhang Wei
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: YARN-3779.01.patch, YARN-3779.02.patch
>
>
> {{GSSException}} is thrown every time aggregated log deletion is attempted 
> after executing {{bin/mapred hsadmin -refreshLogRetentionSettings}} in a secure 
> cluster.
> The problem can be reproduced with the following steps:
> 1. Start the history server in a secure cluster.
> 2. Log deletion happens as expected. 
> 3. Execute the {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh 
> the configuration value.
> 4. All subsequent log deletion attempts fail with {{GSSException}}.
> The following exception can be found in the history server's log if log deletion is 
> enabled. 
> {noformat}
> 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this 
> deletion attempt is being aborted | AggregatedLogDeletionService.java:127
> java.io.IOException: Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; 
> destination host is: "vm-33":25000; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1414)
> at org.apache.hadoop.ipc.Client.call(Client.java:1363)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy9.getListing(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
> at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy10.getListing(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767)
> at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749)
> at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68)
> at java.util.TimerThread.mainLoop(Timer.java:555)
> at java.util.TimerThread.run(Timer.java:505)
> Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS 
> initiate failed [Caused by GSSException: No valid credentials provided 
> (Mechanism level: Failed to find any Kerberos tgt)]
> at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
> at 
> org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1