[ https://issues.apache.org/jira/browse/NIFI-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623045#comment-16623045 ]

ASF GitHub Bot commented on NIFI-5557:
--------------------------------------

Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2971#discussion_r219380028
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/PutHDFS.java ---
    @@ -266,6 +271,16 @@ public Object run() {
                                 throw new IOException(configuredRootDirPath.toString() + " could not be created");
                             }
                             changeOwner(context, hdfs, configuredRootDirPath, flowFile);
    +                    } catch (IOException e) {
    +                      boolean tgtExpired = hasCause(e, GSSException.class, gsse -> GSSException.NO_CRED == gsse.getMajor());
    +                      if (tgtExpired) {
    +                        getLogger().error(String.format("An error occurred while connecting to HDFS. Rolling back session, and penalizing flow file %s",
    --- End diff ---
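    For context, the hasCause(...) call above suggests a generic cause-chain matcher. A minimal sketch of such a helper, assuming only the signature visible in the diff (the PR's actual implementation may differ):
    
    {code:java}
    import java.util.function.Predicate;
    
    public final class CauseMatcher {
    
        private CauseMatcher() {
        }
    
        // Walks the cause chain of t and returns true if any cause is an
        // instance of causeType that also satisfies the given predicate.
        public static <T extends Throwable> boolean hasCause(final Throwable t, final Class<T> causeType, final Predicate<T> predicate) {
            Throwable current = t;
            while (current != null) {
                if (causeType.isInstance(current) && predicate.test(causeType.cast(current))) {
                    return true;
                }
                // getCause() returns null at the end of the chain; guard against
                // self-referential causes to avoid looping forever.
                final Throwable next = current.getCause();
                current = (next == current) ? null : next;
            }
            return false;
        }
    }
    {code}
    
    With that, hasCause(e, GSSException.class, gsse -> GSSException.NO_CRED == gsse.getMajor()) returns true when a GSSException whose major code is NO_CRED (no valid credentials, i.e. an expired TGT) appears anywhere in the cause chain.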
    
    The exception should be logged here, in addition to the flowfile UUID. It might be useful to have the stack trace and exception class available in the log, and we shouldn't suppress/omit the actual GSSException from the logging.
    
    It might also be a good idea to log this at the "warn" level, so that the user can choose not to have these show as bulletins on the processor in the UI. Since the flowfile is being rolled back, and hadoop-client will implicitly acquire a new ticket, I don't think this should show as an error.  @mcgilman, @bbende, do either of you have a preference here?
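
    A minimal sketch of that suggestion, assuming the NiFi ComponentLog API, a hypothetical helper name, and that the %s placeholder carries the flowfile UUID attribute:
    
    {code:java}
    import java.io.IOException;
    
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.flowfile.attributes.CoreAttributes;
    import org.apache.nifi.logging.ComponentLog;
    
    class TicketExpiryLogging {
    
        // Logs at warn level instead of error, and passes the caught exception
        // as the last argument so its class and stack trace reach the log
        // without suppressing the underlying GSSException.
        static void logTicketExpiry(final ComponentLog logger, final FlowFile flowFile, final IOException e) {
            logger.warn(String.format("An error occurred while connecting to HDFS. Rolling back session, and penalizing flow file %s",
                    flowFile.getAttribute(CoreAttributes.UUID.key())), e);
        }
    }
    {code}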


> PutHDFS "GSSException: No valid credentials provided" when krb ticket expires
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-5557
>                 URL: https://issues.apache.org/jira/browse/NIFI-5557
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.5.0
>            Reporter: Endre Kovacs
>            Assignee: Endre Kovacs
>            Priority: Major
>
> When using the *PutHDFS* processor in a kerberized environment, with flow
> traffic that arrives about as often as, or less frequently than, the ticket
> lifetime of the principal, we see this in the log:
> {code:java}
> INFO [Timer-Driven Process Thread-4] o.a.h.io.retry.RetryInvocationHandler Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over host2/ip2:8020 after 13 fail over attempts. Trying to fail over immediately.
> java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for princi...@example.com to host2.example.com/ip2:8020; Host Details : local host is: "host1.example.com/ip1"; destination host is: "host2.example.com":8020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
> at org.apache.hadoop.ipc.Client.call(Client.java:1479)
> at org.apache.hadoop.ipc.Client.call(Client.java:1412)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> at com.sun.proxy.$Proxy134.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
> at sun.reflect.GeneratedMethodAccessor344.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy135.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
> at org.apache.nifi.processors.hadoop.PutHDFS$1.run(PutHDFS.java:254)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:360)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1678)
> at org.apache.nifi.processors.hadoop.PutHDFS.onTrigger(PutHDFS.java:222)
> {code}
> and the flowfile is routed to the failure relationship.
> *To reproduce:*
> Create a principal in your KDC with a two-minute ticket lifetime,
> and set up a similar flow:
> {code:java}
> GetFile => PutHDFS ----- success -----> LogAttributes
>                    \
>                     ----- failure ----> LogAttributes
> {code}
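> To create the short-lived principal with MIT Kerberos, something along these lines could be used (hypothetical principal name; the realm's max_life in kdc.conf must also allow a two-minute ticket):
> {code:java}
> kadmin.local -q 'addprinc -maxlife "2 minutes" nifi@EXAMPLE.COM'
> {code}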
> Copy a file to the input directory of the GetFile processor. If the influx
> of flowfiles is much more frequent than the expiration time of the ticket:
> {code:java}
> watch -n 5 "cp book.txt /path/to/input"
> {code}
> then the flow will run without issue.
> If we adjust this to:
> {code:java}
> watch -n 121 "cp book.txt /path/to/input"
> {code}
> then we will observe this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
