[ 
https://issues.apache.org/jira/browse/HBASE-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6572:
----------------------------------

    Description: 
Consider how we might enable tiered HFile storage. If HDFS has the capability, 
we could create certain files on solid state devices where they might be 
frequently accessed, especially for random reads; and others (and by default) 
on spinning media as before. We could support moving frequently read 
HFiles from spinning media to solid state. We already have column family (CF) 
statistics for this, so we would only need to add the requisite admin 
interface; we could even consider an autotiering option. 
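
To make the autotiering idea concrete, here is a minimal sketch of how a 
selection pass might pick HFiles to promote to solid state. Everything in it is 
an assumption for illustration: the class TieringSketch, the FileStats fields, 
the minReads threshold, and the ssdBudgetBytes capacity are hypothetical and do 
not correspond to any existing HBase or HDFS API. It only shows one plausible 
policy (hottest files by random reads per byte, within a capacity budget).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; none of these names exist in HBase or HDFS. */
public class TieringSketch {

    /** Per-HFile read statistics, assumed to be derivable from CF metrics. */
    static final class FileStats {
        final String path;
        final long sizeBytes;
        final long randomReads; // random reads seen in some sampling window

        FileStats(String path, long sizeBytes, long randomReads) {
            this.path = path;
            this.sizeBytes = sizeBytes;
            this.randomReads = randomReads;
        }
    }

    /**
     * Returns the paths of HFiles to promote to solid state.
     * Files with fewer than minReads random reads are ignored; the rest are
     * taken hottest-first (random reads per byte) while they fit the budget.
     */
    static List<String> selectForSsd(List<FileStats> files,
                                     long minReads,
                                     long ssdBudgetBytes) {
        List<FileStats> candidates = new ArrayList<>();
        for (FileStats f : files) {
            if (f.randomReads >= minReads && f.sizeBytes > 0) {
                candidates.add(f);
            }
        }
        // Hottest first: most random reads per byte stored.
        candidates.sort(Comparator.comparingDouble(
                (FileStats f) -> (double) f.randomReads / f.sizeBytes).reversed());

        List<String> promoted = new ArrayList<>();
        long remaining = ssdBudgetBytes;
        for (FileStats f : candidates) {
            if (f.sizeBytes <= remaining) { // skip files that no longer fit
                promoted.add(f.path);
                remaining -= f.sizeBytes;
            }
        }
        return promoted;
    }
}
```

An admin interface could expose the same decision manually (promote or demote 
a named CF or file), with a pass like this running only when autotiering is 
enabled. How the files would actually be placed on a given medium depends on 
what HDFS ends up supporting and is not addressed here.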

Dhruba Borthakur did some early work in this area and wrote up his findings: 
http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html . It 
is important to note the findings, but I suggest most of the recommendations are 
out of scope for this JIRA. This JIRA seeks to find an initial use case that 
produces a reasonable benefit and serves as a testbed for further 
improvements. If I may paraphrase Dhruba's findings (any misstatements and 
errors are mine): First, the DFSClient code paths introduce significant 
latency, so the HDFS client (and presumably the DataNode, as the next 
bottleneck) will need significant work to bring that down. We need to 
investigate optimized (perhaps read-only) DFS clients and server-side read and 
caching strategies. Second, RegionServers are heavily threaded, which imposes 
a lot of monitor contention and context-switching cost. We need to investigate 
reducing the number of threads in a RegionServer, as well as nonblocking IO 
and RPC.

  was:Consider how we might enable tiered HFile storage. If HDFS has the 
capability, we could create certain files on solid state devices where they 
might be frequently accessed, especially for random reads; and others (and by 
default) on spinning media as before. We could support the move of frequently 
read HFiles from spinning media to solid state. We already have CF statistics 
for this, would only need to add requisite admin interface; could even consider 
an autotiering option. 

    
> Tiered HFile storage
> --------------------
>
>                 Key: HBASE-6572
>                 URL: https://issues.apache.org/jira/browse/HBASE-6572
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
