[ https://issues.apache.org/jira/browse/HBASE-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602605#comment-13602605 ]
Matteo Bertozzi commented on HBASE-8109:
----------------------------------------
When I started HBASE-7806, that was the idea (see the "future" section of
the PDF).
The current problem in the code is that we rely on file names and paths in
too many places (HBASE-7806 should solve that).
Once we have isolated the fs calls, we can switch to a system table that
tracks blocks (basically a sort of NameNode).
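As a rough illustration of the system-table idea, the sketch below models a
table that maps each logical store file to an ordered list of block ids, which
is roughly the bookkeeping a NameNode does for HDFS files. Everything in it
(the BlockCatalog class, its methods, the file names and block ids) is
hypothetical and is not an existing HBase or HDFS API.

```python
class BlockCatalog:
    """Hypothetical in-memory stand-in for a system table tracking blocks."""

    def __init__(self):
        # logical file name -> ordered list of block ids (assumed layout)
        self._files = {}

    def add_file(self, name, block_ids):
        """Register a logical file as an ordered sequence of blocks."""
        self._files[name] = list(block_ids)

    def blocks_of(self, name):
        """Return the blocks that make up a logical file, in order."""
        return list(self._files[name])

    def files_using(self, block_id):
        """Return every logical file that references the given block."""
        return [n for n, blocks in self._files.items() if block_id in blocks]


catalog = BlockCatalog()
catalog.add_file("region1/cf/hfile-a", ["b1", "b2", "b3"])
catalog.add_file("region1/cf/hfile-b", ["b3", "b4"])

print(catalog.blocks_of("region1/cf/hfile-a"))
print(catalog.files_using("b3"))
```

With a table like this, code that today walks file names and paths would ask
the catalog for blocks instead, which is why the fs calls need to be isolated
first.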
The main advantage of using blocks is better compaction: instead of rewriting
the whole file, we can rewrite just a few blocks. Snapshots will also get
better sharing at the block level (right now they share whole files).
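To make the compaction and snapshot points concrete, here is a minimal sketch
of ref-counted block bookkeeping. It only models which blocks a file or
snapshot references; it does not model the merge itself or any real storage.
The BlockStore class, its method names, and the block ids are all assumptions
made for illustration, not an HBase API, and block ids within one file are
assumed to be unique.

```python
from collections import Counter


class BlockStore:
    """Hypothetical ref-counted block bookkeeping (not an HBase API)."""

    def __init__(self):
        self.refs = Counter()   # block id -> number of manifests using it
        self.manifests = {}     # file or snapshot name -> list of block ids

    def put(self, name, block_ids):
        """Register a file (or snapshot) and take a reference on each block."""
        self.manifests[name] = list(block_ids)
        self.refs.update(block_ids)

    def snapshot(self, name, snap_name):
        """Snapshot a file by copying its block list; no block data is copied."""
        self.put(snap_name, self.manifests[name])

    def compact(self, name, replacements):
        """Swap only the rewritten blocks and return the blocks left unreferenced."""
        old = self.manifests[name]
        new = [replacements.get(b, b) for b in old]
        self.manifests[name] = new
        self.refs.update(new)
        self.refs.subtract(old)
        garbage = [b for b in old if self.refs[b] == 0]
        for b in garbage:
            del self.refs[b]
        return garbage


store = BlockStore()
store.put("hfile", ["b1", "b2", "b3"])
store.snapshot("hfile", "snap1")

# Rewrite a single block; the snapshot still holds the old one.
freed = store.compact("hfile", {"b2": "b2v2"})
print(store.manifests["hfile"], store.manifests["snap1"], freed)
```

In this model the compaction touches one block out of three, and the snapshot
keeps sharing b1 and b3 with the live file while pinning the old b2.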
> HBase can manage blocks instead of files in HDFS
> ------------------------------------------------
>
> Key: HBASE-8109
> URL: https://issues.apache.org/jira/browse/HBASE-8109
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Sergey Shelukhin
>
> Prompted by previous non-Hadoop experience and some dev list discussions, and
> after talking to some HDFS people about blocks.
> HBase could improve a lot by managing HDFS blocks instead of files, and
> reusing the blocks among other things. Some areas that could improve are
> splits, compactions, management of large blobs, locality enforcement.
> I was told that block APIs in Hadoop 2 are well-isolated, but not exposed
> yet. They can easily be exposed, and as one of the first potential users we
> could get to help shape them. Two areas that, from my limited understanding,
> are currently fuzzy are namespaces for blocks and ref-counting.
> We should come up with a list of initial scenarios to figure out what we need
> from the block API (locality, detecting/enforcing block boundaries,
> variable-size blocks, reusing one block, ...).