[ 
https://issues.apache.org/jira/browse/ACCUMULO-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704723#comment-13704723
 ] 

Josh Elser commented on ACCUMULO-884:
-------------------------------------

bq. In my experiments on these solid state drives, enabling short-circuit reads 
more than doubled my read throughput! (TP measured in ops/s in a YCSB-derived 
read-only workload test.) 

If I remember correctly, I read somewhere that you won't see any benefit from 
short-circuit reads until you actually get to Hadoop 2 (maybe 0.23 too?). I'll 
see if I can find that information again...

Interesting that you saw such a speedup. How influenced do you think your 
benchmark is by the YCSB workload itself? 

I did a bunch of testing early this year on spinning disks with these 
parameters on a 0.20 baseline and saw no performance gain from short-circuit 
reads. I think I was also experimenting with disabling checksums on local 
datanode reads, and I think I had an evenly split read/write workload. JD had 
some info on HDFS-2246, but I'll leave it up to you to come to your own 
conclusions.

Out of curiosity, in your read-only test, did you warm the Accumulo or OS 
caches before the test (or conversely, ensure they were cold)?
                
> Take advantage of short circuit read for local files
> ----------------------------------------------------
>
>                 Key: ACCUMULO-884
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-884
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: docs
>            Reporter: Billie Rinaldi
>            Assignee: Keith Turner
>
> This is a new feature in hadoop 1.0.x and some versions of 0.22 and 0.23.  It 
> allows a client to read directly from disk instead of through a DataNode when 
> the data is stored locally.  Enabling it involves setting two configuration 
> parameters, the first in hdfs-site.xml and the second in accumulo-site.xml.  
> We should make sure this works with Accumulo and recommend it in the 
> documentation.
> - dfs.block.local-path-access.user is the key in datanode configuration to 
> specify the user allowed to do short circuit read.
> - dfs.client.read.shortcircuit is the key to enable short circuit read at the 
> client side configuration.
> See HDFS-2246 and http://hbase.apache.org/book/perf.hdfs.configs.html for 
> more information.
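
For reference, a minimal sketch of the two settings named in the description 
above. The split between files follows the issue description; the user name 
"accumulo" is an assumption (substitute whichever account runs the tablet 
servers), and key names and file placement should be checked against your 
Hadoop version.

```xml
<!-- hdfs-site.xml (DataNode side) -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <!-- assumption: tablet servers run as the "accumulo" user -->
  <value>accumulo</value>
</property>

<!-- accumulo-site.xml (client side, per the issue description) -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
```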

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
