[ 
https://issues.apache.org/jira/browse/HADOOP-19239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870340#comment-17870340
 ] 

Xiang Li commented on HADOOP-19239:
-----------------------------------

[[email protected]] Thanks for the comment!
{quote}I should point out that given IAM credential refresh itself fails to 
refresh properly at the end of the hour (HADOOP-19181) then worrying about 
expiry of fs instances is very much a second order issue. 
{quote}
I will review and watch HADOOP-19181 as soon as possible. Thanks for pointing 
that out!

> Enhance FileSystem.Cache to honor security token and expiration
> ---------------------------------------------------------------
>
>                 Key: HADOOP-19239
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19239
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs, fs/s3
>    Affects Versions: 3.3.4
>            Reporter: Xiang Li
>            Assignee: Xiang Li
>            Priority: Major
>
> We have an online service which uses Hadoop FileSystem to load files from 
> cloud storage.
> The current cache in FileSystem is a 
> [HashMap|https://github.com/apache/hadoop/blob/4525c7e35ea22d7a6350b8af10eb8d2ff68376e7/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L3635C1-L3635C62],
>  and its key honors scheme, authority (like 
> [user@host:port|https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Syntax]),
>  ugi and a unique long for its [hash 
> code|https://github.com/apache/hadoop/blob/4525c7e35ea22d7a6350b8af10eb8d2ff68376e7/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L3891C1-L3894C8].
>  And among those 4 fields, only "scheme" and "authority" could be controlled 
> externally.
> That leads to a faulty case: a FileSystem entry in the cache was created 
> with schemeA + authorityA, with read + write access, and with an 
> expiration. Later, a call to get a FileSystem arrives, still using schemeA + 
> authorityA, but with less access (maybe read only), or after that entry has 
> already expired. The cached FileSystem entry is hit by mistake and no new 
> FileSystem is created. It does not lead to a security issue, but subsequent 
> calls (maybe to read the file) will be rejected with 403 by the remote 
> storage.
> Our proposal is:
>  * Short term
>  ** Add a new field to FileSystem.Cache.Key that affects hashCode() and 
> equals(). This field could be specified when constructing a Key.
>  ** Add a simple expiration mechanism to FileSystem.Cache
>  *** Each cache entry is created with an expiration
>  *** When getting a FileSystem, if the cache entry is hit but has already 
> expired, close it and remove it from the cache, then return a newly created 
> FileSystem.
>  * Long term
>  ** Replace the internal HashMap with a more modern, fully functional cache 
> framework, like [https://github.com/ben-manes/caffeine]
>  
>  
>  
>  
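
The short-term proposal in the description above could look roughly like the following. This is a minimal standalone sketch, not Hadoop's FileSystem.Cache: the class names, the "extra" key field, the per-call TTL and the explicit "nowMillis" argument are all assumptions made for illustration, and the ugi and unique-long parts of the real key are left out.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

/** Illustrative only: every name in this file is hypothetical. */
class ExpiringFsCacheSketch {

  /** Cache key: scheme and authority plus a caller-supplied discriminator. */
  static final class Key {
    final String scheme;
    final String authority;
    final String extra; // assumed new field, e.g. a token fingerprint

    Key(String scheme, String authority, String extra) {
      this.scheme = scheme;
      this.authority = authority;
      this.extra = extra;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Key)) {
        return false;
      }
      Key other = (Key) o;
      return Objects.equals(scheme, other.scheme)
          && Objects.equals(authority, other.authority)
          && Objects.equals(extra, other.extra);
    }

    @Override
    public int hashCode() {
      return Objects.hash(scheme, authority, extra);
    }
  }

  /** A cached value together with its absolute expiry time. */
  static final class Entry<V> {
    final V value;
    final long expiresAtMillis;

    Entry(V value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  /** HashMap-backed cache that closes and replaces expired entries on lookup. */
  static final class Cache<V extends Closeable> {
    private final Map<Key, Entry<V>> map = new HashMap<>();

    synchronized V get(Key key, long nowMillis, long ttlMillis, Supplier<V> factory) {
      Entry<V> hit = map.get(key);
      if (hit != null) {
        if (nowMillis < hit.expiresAtMillis) {
          return hit.value; // live entry: reuse it
        }
        // Expired entry: drop it from the cache and close it.
        map.remove(key);
        try {
          hit.value.close();
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      }
      // Miss or expired: create a fresh instance and cache it with a new expiry.
      V created = factory.get();
      map.put(key, new Entry<>(created, nowMillis + ttlMillis));
      return created;
    }

    synchronized int size() {
      return map.size();
    }
  }
}
```

In this sketch, two lookups with the same scheme and authority but a different "extra" value get separate entries, and a lookup at or after the expiry time closes the old instance before creating a new one. The clock is passed in only to keep the example deterministic; a real change would more likely read the time inside the cache.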



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

