[ https://issues.apache.org/jira/browse/HDFS-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028080#comment-14028080 ]
Daryn Sharp commented on HDFS-6222:
-----------------------------------

Good questions. This design is the result of problems encountered while converting a mission-critical production system to use webhdfs. We've been running this change internally in production for months on 0.23, and on a sandbox 2.x grid.

A few of the issues: The renewer is hardcoded to assume 24h, which isn't a guarantee by any means. The filesystem can go dead for up to a day. Decreasing the token renewal interval on our QA clusters to 30s to stress token handling obviously didn't work either... We've also encountered class loader leaks. Filesystems would become unusable if the token expired, was erroneously cancelled, or hit transient renewal failures such as during a NN restart.

# A secure client is supposed to be able to talk to an insecure server, which is why the earlier logic had this same behavior. Regarding malformed responses, NPEs used to be generated, not null returns. My earlier work trapped and converted the NPEs, which in this case will trigger the retry loop. Unlike the current implementation, the fs will attempt to re-acquire a token even after one operation fails, which prevents the fs from becoming unusable.
# Yes, very unfortunate, but I only did it for backwards compatibility with NNs, and also for cross-compatibility with DNs that don't munge the token exception. I checked earlier versions and it appears to have always been this way.
# True. We've become very performance conscious, but token renewal is infrequent, if ever required, by non-daemons, so I consider the tiny latency worth the robustness.
# TokenAspect is still used by hftp or I would have happily removed it...
# I think this was covered by other tests. I'll double check and add if necessary. I'm not sure how to test swebhdfs since it requires extra configuration and ssl certs to function...
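
For illustration only, the lazy re-acquire behavior described above could be sketched roughly as follows. This is not the code in the attached patches: `LazyTokenSketch`, `InvalidTokenException`, `TokenOp`, and the string tokens are all hypothetical stand-ins, and the exception is unchecked only to keep the sketch short. The idea is that the token is fetched on first use, dropped when the server rejects it, and fetched again before a single retry.

```java
import java.util.function.Supplier;

/** Hypothetical sketch: fetch a token lazily and re-fetch it after a rejection. */
class LazyTokenSketch {

    /** Hypothetical stand-in for the server rejecting an expired or cancelled token. */
    static class InvalidTokenException extends RuntimeException {
        InvalidTokenException(String msg) { super(msg); }
    }

    /** Hypothetical operation that needs a token to run. */
    interface TokenOp<T> {
        T run(String token);
    }

    private final Supplier<String> tokenFetcher; // assumed way to obtain a fresh token
    private String token;                        // null until first needed
    int fetchCount = 0;                          // exposed only to make the sketch observable

    LazyTokenSketch(Supplier<String> tokenFetcher) {
        this.tokenFetcher = tokenFetcher;
    }

    /** Returns the cached token, fetching one only if none is held. */
    private synchronized String getToken() {
        if (token == null) {
            token = tokenFetcher.get();
            fetchCount++;
        }
        return token;
    }

    /** Drops the cached token, but only if it is still the one that was rejected. */
    private synchronized void invalidate(String rejected) {
        if (rejected != null && rejected.equals(token)) {
            token = null;
        }
    }

    /** Runs the operation; on a token rejection, re-acquires a token and retries once. */
    <T> T execute(TokenOp<T> op) {
        String current = getToken();
        try {
            return op.run(current);
        } catch (InvalidTokenException e) {
            invalidate(current);
            // A second rejection propagates to the caller, but the cached token
            // is not poisoned: the next call to execute() starts from getToken().
            return op.run(getToken());
        }
    }
}
```

There is no background thread and no assumed renewal interval in this shape, so a transient failure costs one extra round trip on the next operation rather than leaving the client unusable.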

> Remove background token renewer from webhdfs
> --------------------------------------------
>
>                 Key: HDFS-6222
>                 URL: https://issues.apache.org/jira/browse/HDFS-6222
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-6222.branch-2.patch, HDFS-6222.branch-2.patch,
>                      HDFS-6222.trunk.patch, HDFS-6222.trunk.patch
>
>
> The background token renewer is a source of problems for long-running
> daemons. Webhdfs should lazy fetch a new token when it receives an
> InvalidToken exception.

--
This message was sent by Atlassian JIRA
(v6.2#6252)