[jira] [Commented] (HBASE-6572) Tiered HFile storage

Yu Li (JIRA) Sun, 22 Apr 2018 23:10:47 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447610#comment-16447610
 ]


Yu Li commented on HBASE-6572:
------------------------------

Hello all, since all sub-tasks have been completed and both 1.5 (after 
HBASE-19858) and 2.0 will support this, shall we add some documentation about 
HSM to our ref-guide? I'm happy to write the doc if you all think it's time for 
us to officially announce HSM support. Thanks. [~apurtell] [~stack]

I'm asking simply because I recently saw [a question|https://s.apache.org/yw5G] 
about this on our user list, and I feel we should have some explicit 
documentation for our users :-)

> Tiered HFile storage
> --------------------
>
>                 Key: HBASE-6572
>                 URL: https://issues.apache.org/jira/browse/HBASE-6572
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Andrew Purtell
>            Priority: Major
>
> Consider how we might enable tiered HFile storage. If HDFS has the 
> capability, we could create certain files on solid state devices where they 
> might be frequently accessed, especially for random reads; and others (and by 
> default) on spinning media as before. We could support the move of frequently 
> read HFiles from spinning media to solid state. We already have CF statistics 
> for this, would only need to add requisite admin interface; could even 
> consider an autotiering option. 
> Dhruba Borthakur did some early work in this area and wrote up his findings: 
> http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html . 
> It is important to note the findings but I suggest most of the 
> recommendations are out of scope of this JIRA. This JIRA seeks to find an 
> initial use case that produces a reasonable benefit, and serves as a testbed 
> for further improvements. If I may paraphrase Dhruba's findings (any 
> misstatements and errors are mine): First, the DFSClient code paths introduce 
> significant latency, so the HDFS client (and presumably the DataNode, as the 
> next bottleneck) will need significant work to knock that down. Need to 
> investigate optimized (perhaps read-only) DFS clients, server side read and 
> caching strategies. Second, RegionServers are heavily threaded and this 
> imposes a lot of monitor contention and context switching cost. Need to 
> investigate reducing the number of threads in a RegionServer, nonblocking IO 
> and RPC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
