[ https://issues.apache.org/jira/browse/HDFS-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660348#comment-13660348 ]

stack commented on HDFS-4817:
-----------------------------

bq. The problem is that reads and writes by the client don't necessarily 
translate 1:1 to operations done with the DataNode.

Makes sense.

HBase generally reads data in 64k chunks (occasionally as small as 32k or 16k, 
but not usually), and there are a few distinct times when we know we are going 
into a long-read mode (long scans, compactions).

Currently we open the file on startup and keep it open so we don't have to keep 
going back to the NN when we want to do a bit of I/O.  The file is shared by 
all types of access; e.g. random reads could be going on -- usually 64k preads 
-- while another thread wants to do a medium or long scan (non-pread, unless we 
detect concurrent access, in which case we'll switch to pread so the file can 
be shared better).  Yet another reader might come in to do a full-file scan for 
a compaction, etc.
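The pread vs. non-pread distinction above comes down to positional (stateless) vs. sequential (stateful) reads. A minimal local-file sketch of why preads let one open file be shared -- using plain java.nio on a temp file as a stand-in, not the HDFS client API:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class PreadVsScan {
    public static void main(String[] args) throws IOException {
        // Stand-in for a block file (hypothetical data, local disk).
        Path f = Files.createTempFile("blk", ".dat");
        Files.write(f, new byte[256 * 1024]);

        try (FileChannel ch = FileChannel.open(f)) {
            // Non-pread scan: each read advances the channel's shared
            // position, so concurrent readers would trample each other.
            ByteBuffer scan = ByteBuffer.allocate(64 * 1024);  // 64k chunk
            while (scan.hasRemaining() && ch.read(scan) > 0) { }
            long posAfterScan = ch.position();                 // advanced to 64k

            // pread: read at an explicit offset; the shared position is
            // untouched, which is why preads share an open file safely.
            ByteBuffer pread = ByteBuffer.allocate(64 * 1024);
            ch.read(pread, 128 * 1024);
            long posAfterPread = ch.position();                // still 64k

            System.out.println(posAfterScan + " " + posAfterPread);
        }
        Files.delete(f);
    }
}
```

Switching from the stateful form to the positional form under concurrency is the same trade HBase makes when it detects shared access.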

In the past we've talked of doing a new background file open if we know it is 
going to be used for a long scan so we don't disturb the state/caching of the 
foreground file.

Thanks, Colin.


> make HDFS advisory caching configurable on a per-file basis
> -----------------------------------------------------------
>
>                 Key: HDFS-4817
>                 URL: https://issues.apache.org/jira/browse/HDFS-4817
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HDFS-4817.001.patch
>
>
> HADOOP-7753 and related JIRAs introduced some performance optimizations for 
> the DataNode.  One of them was readahead.  When readahead is enabled, the 
> DataNode starts reading the next bytes it thinks it will need in the block 
> file, before the client requests them.  This helps hide the latency of 
> rotational media and send larger reads down to the device.  Another 
> optimization was "drop-behind."  Using this optimization, we could remove 
> files from the Linux page cache after they were no longer needed.
>
> Using {{dfs.datanode.drop.cache.behind.writes}} and 
> {{dfs.datanode.drop.cache.behind.reads}} can improve performance 
> substantially on many MapReduce jobs.  In our internal benchmarks, we have 
> seen speedups of 40% on certain workloads.  The reason is that if we know 
> the block data will not be read again any time soon, keeping it out of memory 
> frees that memory for the other processes on the system.  See HADOOP-7714 
> for more benchmarks.
>
> We would like to enable these optimizations on a per-file or per-client 
> basis, rather than for the DataNode as a whole.  This would allow more users 
> to actually make use of them.  It would also be good to add unit tests for 
> the drop-cache code path, to ensure that it is functioning as we expect.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
