[
https://issues.apache.org/jira/browse/HBASE-13159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351071#comment-14351071
]
Enis Soztutar commented on HBASE-13159:
---------------------------------------
Big +1. I was thinking along the same lines (excluding transformations).
HFileLink implements soft links, and Reference files implement soft links with
a top/bottom limit. We can actually benefit from another related concept, where
an HFile contains data for multiple regions and can be referred to from
multiple regions (by region name + boundary). This is pretty important for
distributed log splitting, because we would not need to create so many small
files per WAL file (we end up creating regions x WAL files many files). I
believe that in the original paper, BigTable achieves something like this with
soft links implemented via META (a region's files are what is listed in META,
not what is in the file system). A range on the reference file would allow a
region to be split again after an earlier split but before compaction. It also
allows a region to be split into more than two pieces.
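
To make the range idea concrete, here is a minimal sketch in Java. It assumes a
hypothetical RangeRef type (not an existing HBase class) that points at the
rows [startRow, stopRow) of one store file, with an empty array meaning
"unbounded". Splitting only creates narrower references to the same file, so a
daughter can be split again, or into more than two pieces, without any data
being rewritten.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names are HBase classes.
final class RangeRefSketch {

    /** A reference to the rows [startRow, stopRow) of one store file. */
    static final class RangeRef {
        final String fileName;
        final byte[] startRow; // inclusive; empty means "from the start of the file"
        final byte[] stopRow;  // exclusive; empty means "to the end of the file"

        RangeRef(String fileName, byte[] startRow, byte[] stopRow) {
            this.fileName = fileName;
            this.startRow = startRow;
            this.stopRow = stopRow;
        }

        /** True if the row falls inside this reference's range. */
        boolean contains(byte[] row) {
            boolean afterStart = startRow.length == 0
                    || Arrays.compareUnsigned(row, startRow) >= 0;
            boolean beforeStop = stopRow.length == 0
                    || Arrays.compareUnsigned(row, stopRow) < 0;
            return afterStart && beforeStop;
        }

        /** Cuts this reference at the given rows (ascending, strictly inside the range). */
        List<RangeRef> split(byte[]... splitRows) {
            List<RangeRef> pieces = new ArrayList<>();
            byte[] lower = startRow;
            for (byte[] splitRow : splitRows) {
                if (!contains(splitRow) || Arrays.compareUnsigned(splitRow, lower) <= 0) {
                    throw new IllegalArgumentException("split row outside range or out of order");
                }
                pieces.add(new RangeRef(fileName, lower, splitRow));
                lower = splitRow;
            }
            pieces.add(new RangeRef(fileName, lower, stopRow));
            return pieces;
        }

        @Override
        public String toString() {
            return fileName + "[" + new String(startRow, StandardCharsets.UTF_8)
                    + ", " + new String(stopRow, StandardCharsets.UTF_8) + ")";
        }
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        RangeRef whole = new RangeRef("hfile-1", new byte[0], new byte[0]);
        // First split: two daughters referencing the same file.
        List<RangeRef> halves = whole.split(bytes("m"));
        // Split the upper daughter again, into three pieces, before any compaction.
        List<RangeRef> pieces = halves.get(1).split(bytes("r"), bytes("w"));
        pieces.forEach(System.out::println);
    }
}
```

In this sketch a top/bottom Reference is simply a RangeRef with one unbounded
end, and an HFileLink is one with both ends unbounded.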
For the transformation, I think even local indexes can make use of it. If we
can persist the index data without the region start key prefix, we can apply
an on-demand transformation that adds the prefix to the cells, so that after a
local index region split the data does not have to be rewritten.
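
The on-demand prefix transformation could look roughly like the sketch below.
It assumes index rows are stored without the region start key and that a
read-time function prepends whatever start key the reading region currently
has. The names (PrefixTransformSketch, addPrefix) and the sample keys are
hypothetical. The sketch covers only the key rewrite, not how a daughter
region would decide which stored entries belong to it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.UnaryOperator;

// Hypothetical sketch; not an HBase or Phoenix API.
final class PrefixTransformSketch {

    /** Returns a read-time transformation that prepends the region start key to a stored row. */
    static UnaryOperator<byte[]> addPrefix(byte[] regionStartKey) {
        return storedRow -> {
            // Copy the prefix into a new array sized for prefix + stored row.
            byte[] out = Arrays.copyOf(regionStartKey, regionStartKey.length + storedRow.length);
            System.arraycopy(storedRow, 0, out, regionStartKey.length, storedRow.length);
            return out;
        };
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The same stored index row, persisted once without any region prefix.
        byte[] stored = bytes("idx-value-42");

        // Before the split the parent region reads it under its own start key.
        byte[] asParent = addPrefix(bytes("region-a,")).apply(stored);
        // After the split a daughter reads the same bytes under its new start key.
        byte[] asDaughter = addPrefix(bytes("region-m,")).apply(stored);

        System.out.println(new String(asParent, StandardCharsets.UTF_8));
        System.out.println(new String(asDaughter, StandardCharsets.UTF_8));
    }
}
```

Because the prefix is applied when the row is read, changing the region start
key changes only the function passed to the reader and leaves the stored bytes
untouched.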
> Consider RangeReferenceFiles with transformations
> -------------------------------------------------
>
> Key: HBASE-13159
> URL: https://issues.apache.org/jira/browse/HBASE-13159
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Lars Hofhansl
>
> Currently we have References (used by HalfStoreReaders) and HFileLinks.
> For various use cases we have here, we need RangeReferences with a
> simple transformation of the keys.
> That would allow us to map HFiles between regions or even tables without
> copying any data.
> We can probably combine HalfStores, HFileLinks, and RangeReferences into a
> single concept:
> * RangeReference = arbitrary start and stop row, arbitrary key transformation
> * HFileLink = start and stop keys set to the linked file's start/stop key,
> transformation = identity
> * (HalfStore) References = start/stop key set according to top or bottom
> reference, transformation = identity
> Note this is a *brainstorming* issue. :)
> (Could start with just references with arbitrary start/stop keys, and do
> transformations later)