[ 
https://issues.apache.org/jira/browse/HBASE-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602652#comment-13602652
 ] 

Sergey Shelukhin commented on HBASE-8109:
-----------------------------------------

We can indeed scale it down from managing blocks to being block-aware and 
reusing blocks. Manipulating blocks inside files, knowing the block boundaries, 
and block reuse/ref-counting would still be necessary, but would presumably be 
handled in HDFS.
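
To make the reuse/ref-counting idea concrete, here is a minimal in-memory sketch. It is purely illustrative and assumes nothing about the real HDFS block API (which is not exposed yet); `BlockStore`, the file names and the block ids are all hypothetical. It shows a split at a block boundary where the daughter files reference the parent's blocks instead of copying them, and a block is only freed once no file refers to it.

```python
from dataclasses import dataclass, field


@dataclass
class BlockStore:
    """Toy stand-in for a block-aware file system (hypothetical, not HDFS)."""
    refcounts: dict = field(default_factory=dict)  # block id -> number of references
    files: dict = field(default_factory=dict)      # file name -> ordered block ids

    def create_file(self, name, block_ids):
        # A file is just an ordered list of block references.
        if name in self.files:
            raise ValueError(f"file already exists: {name}")
        self.files[name] = list(block_ids)
        for b in block_ids:
            self.refcounts[b] = self.refcounts.get(b, 0) + 1

    def delete_file(self, name):
        # Drop this file's references and return blocks nobody uses anymore.
        freed = []
        for b in self.files.pop(name):
            self.refcounts[b] -= 1
            if self.refcounts[b] == 0:
                del self.refcounts[b]
                freed.append(b)
        return freed


store = BlockStore()
store.create_file("region/hfile", ["b1", "b2", "b3", "b4"])

# Split at a block boundary: daughters reuse the parent's blocks, no data copy.
store.create_file("daughterA/hfile", ["b1", "b2"])
store.create_file("daughterB/hfile", ["b3", "b4"])

freed_after_parent = store.delete_file("region/hfile")   # all blocks still referenced
freed_after_a = store.delete_file("daughterA/hfile")     # b1 and b2 become free
```

In this sketch deleting the parent frees nothing, because every block is still held by a daughter; only when the last referencing file goes away can the block be reclaimed. Where that bookkeeping lives (HBase or HDFS) is exactly the open question above.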
                
> HBase can manage blocks instead of files in HDFS
> ------------------------------------------------
>
>                 Key: HBASE-8109
>                 URL: https://issues.apache.org/jira/browse/HBASE-8109
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Sergey Shelukhin
>
> Prompted by previous non-Hadoop experience and some dev list discussions, and 
> after talking to some HDFS people about blocks.
> HBase could improve a lot by managing HDFS blocks instead of files and, 
> among other things, reusing blocks. Areas that could improve include 
> splits, compactions, management of large blobs, and locality enforcement.
> I was told that block APIs in Hadoop 2 are well-isolated, but not exposed 
> yet. They can easily be exposed, and as one of the first potential users we 
> could get to help shape them. Two areas that, from my limited understanding, 
> are currently fuzzy are namespaces for blocks and ref-counting.
> We should come up with a list of initial scenarios to figure out what we need 
> from the block API (locality, detecting/enforcing block boundaries, 
> variable-size blocks, reusing one block, ...).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
