[ 
https://issues.apache.org/jira/browse/HDFS-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028080#comment-14028080
 ] 

Daryn Sharp commented on HDFS-6222:
-----------------------------------

Good questions.  This design is the result of problems encountered while 
converting a mission-critical production system to use webhdfs.  We've been 
running this change internally in production for months on 0.23, and on a 
sandbox 2.x grid.  A few of the issues: The renewer is hardcoded to assume 24h, 
which isn't a guarantee by any means.  The filesystem can go dead for up to a 
day.  Decreasing the token renewal interval on our QA clusters to 30s to stress 
token handling obviously didn't work either...  We've also encountered class 
loader leaks.  Filesystems would become unusable if the token expired, was 
erroneously cancelled, or hit transient renewal failures such as during a NN 
restart.

# A secure client is supposed to be able to talk to an insecure server which is 
why earlier logic had this same behavior.  Regarding malformed responses, NPEs 
used to be generated, not null returns.  My earlier work trapped and converted 
the NPEs, which in this case will trigger the retry loop.  Unlike the current 
implementation, the fs will attempt to re-acquire a token even after one 
operation fails, which prevents the fs from becoming unusable.
# Yes, very unfortunate, but I only did it for backwards compatibility with 
NNs, and also cross-compatibility with DNs that don't munge the token 
exception.  I checked earlier versions and it appears to have always been this 
way.
# True.  We've become very performance conscious, but token renewal is 
infrequent if ever required by non-daemons so I consider the tiny latency worth 
the robustness.
# TokenAspect is still used by hftp or I would have happily removed it...
# I think this was covered by other tests.  I'll double check and add if 
necessary.  I'm not sure how to test swebhdfs since it requires extra 
configuration and ssl certs to function...
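
To illustrate the lazy re-acquisition described in the first point, here is a 
minimal sketch.  It is not the code from the attached patches: the class 
LazyTokenClient, the TokenOp interface, and the InvalidTokenException stand-in 
are all hypothetical names, and the exception is unchecked only to keep the 
example short.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of lazy token fetching; not the HDFS-6222 patch. */
class LazyTokenClient {
    /** Stand-in for the server rejecting a token (unchecked for brevity). */
    static class InvalidTokenException extends RuntimeException {
        InvalidTokenException(String message) { super(message); }
    }

    /** An operation that needs a token to run. */
    interface TokenOp<T> {
        T run(String token);
    }

    private final Supplier<String> tokenFetcher; // assumed: obtains a new token
    private String token;                        // null until first needed
    private int fetchCount;

    LazyTokenClient(Supplier<String> tokenFetcher) {
        this.tokenFetcher = tokenFetcher;
    }

    /** Fetch a token only when none is cached; there is no background thread. */
    synchronized String getToken() {
        if (token == null) {
            token = tokenFetcher.get(); // if this throws, token stays null
            fetchCount++;
        }
        return token;
    }

    /** Drop the cached token, but only if it is the one that was rejected. */
    synchronized void invalidate(String rejected) {
        if (rejected.equals(token)) {
            token = null;
        }
    }

    /** Run op; on an invalid token, re-acquire once and retry. */
    <T> T execute(TokenOp<T> op) {
        String current = getToken();
        try {
            return op.run(current);
        } catch (InvalidTokenException e) {
            invalidate(current);
            // A second failure propagates to the caller. The cached token is
            // kept or cleared as above, so the next call can try again.
            return op.run(getToken());
        }
    }

    synchronized int fetchCount() {
        return fetchCount;
    }
}
```

Because a failed fetch leaves the cached token empty rather than poisoning the 
client, a later operation simply tries to acquire a token again, which is the 
property that keeps the fs usable after a transient failure.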

> Remove background token renewer from webhdfs
> --------------------------------------------
>
>                 Key: HDFS-6222
>                 URL: https://issues.apache.org/jira/browse/HDFS-6222
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-6222.branch-2.patch, HDFS-6222.branch-2.patch, 
> HDFS-6222.trunk.patch, HDFS-6222.trunk.patch
>
>
> The background token renewer is a source of problems for long-running 
> daemons.  Webhdfs should lazy fetch a new token when it receives an 
> InvalidToken exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
