Balu Vellanki created FALCON-1595:
-------------------------------------
Summary: Falcon server loses ability to communicate with HDFS over time
Key: FALCON-1595
URL: https://issues.apache.org/jira/browse/FALCON-1595
Project: Falcon
Issue Type: Bug
Affects Versions: 0.8
Reporter: Balu Vellanki
Assignee: Balu Vellanki
In a Kerberos-secured cluster where the Kerberos ticket validity is one day,
the Falcon server eventually lost the ability to read from and write to HDFS.
The logs showed typical Kerberos errors such as "GSSException: No valid
credentials provided (Mechanism level: Failed to find any Kerberos tgt)".
{code}
2015-10-28 00:04:59,517 INFO - [LaterunHandler:] ~ Creating FS impersonating user testUser (HadoopClientFactory:197)
2015-10-28 00:04:59,519 WARN - [LaterunHandler:] ~ Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] (Client:680)
2015-10-28 00:04:59,520 WARN - [LaterunHandler:] ~ Late Re-run failed for instance sample-process:2015-10-28T03:58Z after 420000 (AbstractRerunConsumer:84)
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "sample.host.com/127.0.0.1"; destination host is: "sample.host.com":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
at org.apache.hadoop.ipc.Client.call(Client.java:1431)
at org.apache.hadoop.ipc.Client.call(Client.java:1358)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy22.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy23.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.falcon.rerun.handler.LateRerunConsumer.detectLate(LateRerunConsumer.java:108)
at org.apache.falcon.rerun.handler.LateRerunConsumer.handleRerun(LateRerunConsumer.java:67)
at org.apache.falcon.rerun.handler.LateRerunConsumer.handleRerun(LateRerunConsumer.java:47)
at org.apache.falcon.rerun.handler.AbstractRerunConsumer.run(AbstractRerunConsumer.java:73)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:685)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:648)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:735)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:373)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1493)
at org.apache.hadoop.ipc.Client.call(Client.java:1397)
{code}
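
This report does not identify the root cause or a fix. For illustration only, a long-running service that holds Kerberos credentials commonly guards against this kind of failure by re-checking its ticket on a schedule shorter than the ticket lifetime. The sketch below shows that pattern using only the JDK. The class TicketRefresher and its Relogin hook are hypothetical names, not Falcon code. In a Hadoop client the hook would plausibly wrap UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab(); whether Falcon should do this, and where, is an assumption.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Falcon): periodically runs a credential-refresh hook. */
final class TicketRefresher {

    /** Hook that renews credentials; it may throw if the KDC or keytab is unavailable. */
    public interface Relogin {
        void run() throws Exception;
    }

    private final Relogin relogin;

    // Single daemon thread so the refresher never keeps the JVM alive on shutdown.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "ticket-refresher");
                t.setDaemon(true);
                return t;
            });

    TicketRefresher(Relogin relogin) {
        this.relogin = relogin;
    }

    /**
     * Runs the hook once. Returns false instead of throwing, because an
     * exception escaping a scheduled task would cancel all later runs.
     */
    public boolean refreshOnce() {
        try {
            relogin.run();
            return true;
        } catch (Exception e) {
            System.err.println("Credential refresh failed: " + e);
            return false;
        }
    }

    /** Starts refreshing at a fixed period; choose one well below the ticket lifetime. */
    public void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(this::refreshOnce, period, period, unit);
    }

    /** Stops the schedule. */
    public void stop() {
        scheduler.shutdownNow();
    }
}
```

With a one-day ticket validity as described above, a caller might use something like new TicketRefresher(hook).start(1, TimeUnit.HOURS), where hook is the assumed Hadoop re-login call. The one-hour period is an arbitrary example value.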
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)