[ 
https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308568#comment-14308568
 ] 

Colin Patrick McCabe commented on HDFS-7694:
--------------------------------------------

bq. One question, in what cases, user needs to unbuffer instead of closing the 
stream?

Good question.  The main reason is that re-opening a stream triggers a 
getBlockLocations RPC to the NameNode.  Some applications, such as HBase and 
Impala, cache many open streams precisely to avoid generating that NameNode 
traffic.  This change gives those applications an easy way to save memory 
without adding RPC load on the NameNode.
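
To make the trade-off concrete, here is a minimal, self-contained sketch.  It 
does not use the Hadoop API: FakeNameNode, FakeStream and StreamCache are 
hypothetical stand-ins invented for illustration, and the 64 KB buffer size is 
arbitrary.  The model assumes that opening a stream costs one 
getBlockLocations call and that an unbuffered stream can refill its buffer 
without another one.

```java
import java.util.HashMap;
import java.util.Map;

final class UnbufferSketch {
    /** Hypothetical stand-in for the NameNode; it only counts lookups. */
    static final class FakeNameNode {
        int getBlockLocationsCalls = 0;
        void getBlockLocations(String path) { getBlockLocationsCalls++; }
    }

    /** Hypothetical stand-in for an open HDFS input stream. */
    static final class FakeStream {
        private byte[] readahead; // models readahead buffers and sockets

        FakeStream(String path, FakeNameNode nn) {
            nn.getBlockLocations(path); // in this model, each open costs one RPC
        }

        int read() {
            if (readahead == null) {
                readahead = new byte[64 * 1024]; // refilled lazily, no RPC in this model
            }
            return readahead[0];
        }

        /** Drop the buffer but keep the stream open. */
        void unbuffer() { readahead = null; }

        boolean isBuffered() { return readahead != null; }
    }

    /** Cache of open streams, similar in spirit to what the comment describes. */
    static final class StreamCache {
        private final Map<String, FakeStream> open = new HashMap<>();
        private final FakeNameNode nn;

        StreamCache(FakeNameNode nn) { this.nn = nn; }

        int read(String path) {
            // Only a cache miss opens a new stream and therefore hits the NameNode.
            return open.computeIfAbsent(path, p -> new FakeStream(p, nn)).read();
        }

        /** Release memory held by every cached stream without closing any. */
        void unbufferAll() { open.values().forEach(FakeStream::unbuffer); }

        int bufferedCount() {
            return (int) open.values().stream().filter(FakeStream::isBuffered).count();
        }
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();
        StreamCache cache = new StreamCache(nn);
        cache.read("/a");
        cache.read("/b");
        cache.unbufferAll();                          // buffers freed, streams still open
        System.out.println(cache.bufferedCount());    // prints 0
        cache.read("/a");                             // reuses the cached stream
        System.out.println(nn.getBlockLocationsCalls); // prints 2
    }
}
```

Closing and re-opening "/a" instead of unbuffering it would have raised the 
count to 3 in this model, which is the extra NameNode load the cache is meant 
to avoid.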

> FSDataInputStream should support "unbuffer"
> -------------------------------------------
>
>                 Key: HDFS-7694
>                 URL: https://issues.apache.org/jira/browse/HDFS-7694
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.7.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-7694.001.patch
>
>
> For applications that have many open HDFS (or other Hadoop filesystem) files, 
> it would be useful to have an API to clear readahead buffers and sockets.  
> This could be added to the existing APIs as an optional interface, in much 
> the same way as we added setReadahead / setDropBehind / etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
