[ 
https://issues.apache.org/jira/browse/HADOOP-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295404#comment-14295404
 ] 

Chris Nauroth commented on HADOOP-11509:
----------------------------------------

Sometimes you can work around these problems by configuring {{fs.<file system 
scheme>.impl.disable.cache}} to {{true}} or using {{FileSystem#newInstance}} to 
guarantee a certain spot in the code gets a distinct instance that won't be 
pulled from the cache later by another part of the code.  However, each 
workaround has its own cost.  Disabling the cache can hurt performance, because 
every lookup then constructs a new instance.  If you use 
{{FileSystem#newInstance}}, the owner of the instance must eventually call 
{{close}}, or unclosed instances accumulate and bloat memory over time.  In the 
case where the {{FileSystem}} instance creation happens indirectly, these 
workarounds might not be viable, because you might not want to change code 
several layers below just to accommodate a new patch.
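
To illustrate the kind of problem these workarounds address, here is a toy 
model in plain Java.  It is not Hadoop's implementation: {{ToyFsCache}}, 
{{ToyFs}} and the {{some.key}} setting are hypothetical names, and the sketch 
assumes only that cached lookups are keyed on the scheme rather than on the 
configuration values passed in.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; these classes are hypothetical and not part of Hadoop. */
class ToyFsCache {
    /** Stand-in for a FileSystem: remembers the settings it was created with. */
    static final class ToyFs {
        final Map<String, String> settings;
        ToyFs(Map<String, String> conf) { this.settings = new HashMap<>(conf); }
    }

    private final Map<String, ToyFs> cache = new HashMap<>();

    /** Cached lookup: keyed on scheme only, so conf is ignored on a cache hit. */
    ToyFs get(String scheme, Map<String, String> conf) {
        return cache.computeIfAbsent(scheme, s -> new ToyFs(conf));
    }

    /** Uncached creation: always builds a fresh instance from the given conf. */
    ToyFs newInstance(String scheme, Map<String, String> conf) {
        return new ToyFs(conf);
    }

    public static void main(String[] args) {
        ToyFsCache fsCache = new ToyFsCache();

        // An early caller populates the cache before the setting is applied.
        Map<String, String> early = new HashMap<>();
        ToyFs first = fsCache.get("hdfs", early);

        // A later caller passes updated settings but still gets the old instance.
        Map<String, String> late = new HashMap<>();
        late.put("some.key", "from -D");
        ToyFs second = fsCache.get("hdfs", late);
        ToyFs fresh = fsCache.newInstance("hdfs", late);

        System.out.println(first == second);                  // true
        System.out.println(second.settings.get("some.key"));  // null
        System.out.println(fresh.settings.get("some.key"));   // from -D
    }
}
```

In this model the second {{get}} silently returns the instance built from the 
earlier settings, while {{newInstance}} picks up the later ones.  The cost is 
that nothing else tracks the fresh instance, so its owner has to clean it up.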

Issues like this have made the {{FileSystem}} cache a frequent source of 
confusion.  This might warrant a redesign at some point.  I'd prefer that a lot 
of the implicit behavior around use of {{Configuration}} and 
{{UserGroupInformation}} were made explicit to the caller in the API.  I also 
think we'd benefit from full reference counting instead of the current "best 
effort" caching where an unrelated thread could call {{close}} and trigger a 
cache eviction.  I believe doing this would require changing the contract so 
that callers must always call {{close}}.  Unfortunately, that would be 
backwards-incompatible.  There is a ton of existing code all over the ecosystem 
that doesn't bother calling {{close}}.
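
As a rough sketch of what full reference counting could look like, the 
following plain Java example evicts a shared resource only when its last 
holder releases it.  All names here ({{RefCountedCache}}, {{Resource}}, 
{{acquire}}, {{release}}) are hypothetical and do not exist in Hadoop; this is 
one possible shape for the idea, not a proposed API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a reference-counted cache; not a Hadoop API. */
class RefCountedCache {
    /** Stand-in for an expensive shared resource. */
    static final class Resource {
        boolean closed = false;
    }

    /** One cached resource plus the number of callers currently holding it. */
    private static final class Entry {
        final Resource resource = new Resource();
        int refs = 0;
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Returns the shared resource for key and records one more holder. */
    synchronized Resource acquire(String key) {
        Entry e = entries.computeIfAbsent(key, k -> new Entry());
        e.refs++;
        return e.resource;
    }

    /** Drops one holder; the resource is closed and evicted only at zero. */
    synchronized void release(String key) {
        Entry e = entries.get(key);
        if (e == null) {
            throw new IllegalStateException("no entry for " + key);
        }
        if (--e.refs == 0) {
            e.resource.closed = true;
            entries.remove(key);
        }
    }

    synchronized boolean isCached(String key) {
        return entries.containsKey(key);
    }

    public static void main(String[] args) {
        RefCountedCache cache = new RefCountedCache();
        Resource a = cache.acquire("hdfs");
        Resource b = cache.acquire("hdfs");  // same instance, second holder
        cache.release("hdfs");               // first holder is done
        System.out.println(a == b);          // true
        System.out.println(a.closed);        // false: one holder remains
        cache.release("hdfs");               // last holder is done
        System.out.println(a.closed);        // true
    }
}
```

The sketch also shows why the contract would have to change: a caller that 
never calls {{release}} keeps the count above zero forever, so the entry is 
never closed or evicted.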

> change parsing sequence in GenericOptionsParser to parse -D parameters first
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-11509
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11509
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>             Fix For: 2.7.0
>
>         Attachments: HADOOP-11509.1.patch, HADOOP-11509.2.patch
>
>
> In GenericOptionsParser, we need to parse the -D parameters first. That way, 
> the user-supplied parameters (passed through -D) are set in the configuration 
> object earlier and can be used when processing the other parameters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
