[ 
https://issues.apache.org/jira/browse/HBASE-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872774#comment-15872774
 ] 

Umesh Agashe edited comment on HBASE-14090 at 2/18/17 12:06 AM:
----------------------------------------------------------------

Some time back, we here at Cloudera had a discussion about our work on this
issue. We talked about the status of our efforts, our findings, and our
experiments, and concluded that a new approach is needed to solve this issue.
This doc summarizes the discussion. Please see the link to the doc: "Discussion
on new radically different approach to HBase FS directory layout REDO work".


was (Author: uagashe):
Some time back, we here at Cloudera had a discussion about our work on this
issue. We talked about the status of our efforts, our findings, and our
experiments, and concluded that a new approach is needed to solve this issue.
This doc summarizes the discussion.

> Redo FS layout; let go of tables/regions/stores directory hierarchy in DFS
> --------------------------------------------------------------------------
>
>                 Key: HBASE-14090
>                 URL: https://issues.apache.org/jira/browse/HBASE-14090
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: stack
>            Assignee: Sean Busbey
>
> Our layout as is won't work if we have 1M regions; e.g. HDFS will fall over if
> directories hold hundreds of thousands of files. HBASE-13991 (Humongous Tables)
> would address this specific directory problem only, by adding subdirs under
> the table dir, but there are other issues with our current layout:
>  * Our table/regions/column family 'facade' has to be maintained in two 
> locations -- in master memory and in the HDFS directory layout -- and the
> farce needs to be kept synced; or worse, the model management is split between
> master memory and DFS layout. 'Syncing' in HDFS has us dropping constructs 
> such as 'Reference' and 'HalfHFiles' on split, 'HFileLinks' when archiving, 
> and so on. This 'tie' makes it hard to make changes.
>  * While HDFS has atomic rename, useful for fencing and for having files
> added atomically, if the model were solely owned by HBase, there are HBase
> primitives we could make use of -- atomic changes within a row, and
> coprocessors -- to simplify table transactions and provide more consistent
> views of our model to clients; file 'moves' could be a memory operation only
> rather than an HDFS call; sharing files between tables/snapshots, and knowing
> when it is safe to remove them, would be simpler with only one owner; and so on.
> This is an umbrella blue-sky issue to discuss what a new layout would look 
> like and how we might get there. I'll follow up with some sketches of what a
> new layout could look like, which came out of some chats a few of us have been
> having. We are also under the 'delusion' that the move to a new layout could be
> done as part of a rolling upgrade and that the amount of work involved is not 
> gargantuan.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
