-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40103/#review108562
-----------------------------------------------------------
Ship it!

Ship It!

- Venkat Ranganathan


On Nov. 9, 2015, 1 p.m., Balu Vellanki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40103/
> -----------------------------------------------------------
> 
> (Updated Nov. 9, 2015, 1 p.m.)
> 
> 
> Review request for Falcon, Ajay Yadava, Sowmya Ramesh, and Venkat Ranganathan.
> 
> 
> Bugs: FALCON-1595
>     https://issues.apache.org/jira/browse/FALCON-1595
> 
> 
> Repository: falcon-git
> 
> 
> Description
> -------
> 
> In a Kerberos-secured cluster where the Kerberos ticket validity is one day, the Falcon server eventually lost the ability to read from and write to HDFS. The logs showed typical Kerberos-related errors such as "GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)".
> 
> {code}
> 2015-10-28 00:04:59,517 INFO  - [LaterunHandler:] ~ Creating FS impersonating user testUser (HadoopClientFactory:197)
> 2015-10-28 00:04:59,519 WARN  - [LaterunHandler:] ~ Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] (Client:680)
> 2015-10-28 00:04:59,520 WARN  - [LaterunHandler:] ~ Late Re-run failed for instance sample-process:2015-10-28T03:58Z after 420000 (AbstractRerunConsumer:84)
> java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "sample.host.com/127.0.0.1"; destination host is: "sample.host.com":8020;
> 	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1431)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1358)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> 	at com.sun.proxy.$Proxy22.getFileInfo(Unknown Source)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
> 	at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:497)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> 	at com.sun.proxy.$Proxy23.getFileInfo(Unknown Source)
> 	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
> 	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
> 	at org.apache.falcon.rerun.handler.LateRerunConsumer.detectLate(LateRerunConsumer.java:108)
> 	at org.apache.falcon.rerun.handler.LateRerunConsumer.handleRerun(LateRerunConsumer.java:67)
> 	at org.apache.falcon.rerun.handler.LateRerunConsumer.handleRerun(LateRerunConsumer.java:47)
> 	at org.apache.falcon.rerun.handler.AbstractRerunConsumer.run(AbstractRerunConsumer.java:73)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:685)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:648)
> 	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:735)
> 	at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:373)
> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1493)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1397)
> {code}
> 
> 
> Diffs
> -----
> 
>   common/src/main/java/org/apache/falcon/hadoop/HadoopClientFactory.java 9534ff2 
> 
> Diff: https://reviews.apache.org/r/40103/diff/
> 
> 
> Testing
> -------
> 
> End-to-end testing was done on a two-node secure cluster. Updated krb5.conf with ticket_lifetime set to 1 day and renew_lifetime set to 1 day. Ran Falcon for more than two days, and Falcon had no issues accessing HDFS.
> 
> 
> Thanks,
> 
> Balu Vellanki
> 
>
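For context on the failure mode described above: in long-running Hadoop daemons, the usual remedy is to refresh the login user's TGT from its keytab before creating a FileSystem on behalf of a proxy user, so the service survives past the initial ticket lifetime. Below is a minimal sketch of that general technique using the Hadoop UserGroupInformation API; the class and method names other than the Hadoop API are hypothetical, and this is not the actual FALCON-1595 diff (see the review link above for that).

{code}
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical helper illustrating keytab relogin before impersonation,
// not the actual Falcon patch.
public final class SecureFsFactory {

    private SecureFsFactory() {
    }

    public static FileSystem createProxiedFileSystem(final String user,
                                                     final Configuration conf)
            throws IOException, InterruptedException {
        // A no-op for non-keytab logins; for keytab logins this
        // re-acquires the TGT when it is close to expiry, avoiding
        // "GSSException: ... Failed to find any Kerberos tgt".
        UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();

        // Impersonate the end user on top of the (freshly renewed)
        // service login.
        UserGroupInformation proxyUgi = UserGroupInformation.createProxyUser(
                user, UserGroupInformation.getLoginUser());

        return proxyUgi.doAs(new PrivilegedExceptionAction<FileSystem>() {
            @Override
            public FileSystem run() throws IOException {
                return FileSystem.get(conf);
            }
        });
    }
}
{code}

The key point is that the relogin check runs on every filesystem creation rather than once at startup, which is why a one-day ticket lifetime no longer breaks a server that runs for days.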
