[
https://issues.apache.org/jira/browse/HBASE-12583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225949#comment-14225949
]
stack commented on HBASE-12583:
-------------------------------
bq. The split row is definitely in the region's key range but may not be in
the storefile key range.
Ok. That is better.
So, if a split key is bigger or smaller than storefile, we don't want to split
the storefile; the file goes to the left or right of the split point; a split
point that is not in a storefile is fine.
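
To make the left/right rule concrete, here is a minimal row-level sketch in Java. It is not the HBase code: the class and method names are made up for illustration, and it compares plain row bytes with java.util.Arrays#compareUnsigned (Java 9+) rather than going through KeyValue and the store file comparator as the quoted snippet in the issue description does.

```java
import java.util.Arrays;

// Row-level sketch of the range check; names are hypothetical, not HBase API.
final class SplitRangeSketch {

  /**
   * Returns true when a daughter region needs a reference to the store file.
   * top == true means the daughter that holds rows >= splitRow.
   * The boundary handling mirrors the check quoted in the issue description.
   */
  static boolean needsReference(byte[] firstRow, byte[] lastRow,
                                byte[] splitRow, boolean top) {
    if (firstRow == null || lastRow == null) {
      return false; // empty store file, nothing to reference
    }
    if (top) {
      // If the split row sorts after the last row, the whole file
      // belongs to the bottom (left) daughter.
      return Arrays.compareUnsigned(splitRow, lastRow) <= 0;
    }
    // If the split row sorts before the first row, the whole file
    // belongs to the top (right) daughter.
    return Arrays.compareUnsigned(splitRow, firstRow) >= 0;
  }

  public static void main(String[] args) {
    byte[] a = {'a'}, b = {'b'}, c = {'c'}, d = {'d'}, e = {'e'};
    // File a..b, split at c: only the bottom daughter references it.
    System.out.println(needsReference(a, b, c, true) + " " + needsReference(a, b, c, false));
    // File d..e, split at c: only the top daughter references it.
    System.out.println(needsReference(d, e, c, true) + " " + needsReference(d, e, c, false));
    // File b..d, split at c: both daughters reference it.
    System.out.println(needsReference(b, d, c, true) + " " + needsReference(b, d, c, false));
  }
}
```

In the first two cases the file is not split at all; it goes wholesale to one side of the split point.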
bq. We want to split the index region at c as well, so the index child regions
will also be a-c and c-e
They are companion regions? Can't you split them by passing in a pertinent
split key, one related to that of the primary region but adapted for the
companion region? Are you passing in the 'wrong' key, the split key for the
primary region?
Sorry, I don't get it. I'm a bit thick. I need to go back and read the
original secondary index implementation paper posted a good while back. It is
messing/presuming too much about hbase internals.
This stuff used to work for you, but now that the checks are more stringent,
it breaks?
> Allow creating reference files even the split row not lies in the storefile
> range if required
> ---------------------------------------------------------------------------------------------
>
> Key: HBASE-12583
> URL: https://issues.apache.org/jira/browse/HBASE-12583
> Project: HBase
> Issue Type: Improvement
> Reporter: rajeshbabu
> Assignee: rajeshbabu
> Labels: Phoenix
> Fix For: 2.0.0, 0.98.9, 0.99.2
>
>
> Currently, HRegionFileSystem#splitStoreFile does not create a reference file
> if the split row lies outside the storefile range, which means one of the
> child regions gets no data from that storefile.
> {code}
> // Check whether the split row lies in the range of the store file
> // If it is outside the range, return directly.
> if (top) {
>   //check if larger than last key.
>   KeyValue splitKey = KeyValueUtil.createFirstOnRow(splitRow);
>   byte[] lastKey = f.createReader().getLastKey();
>   // If lastKey is null means storefile is empty.
>   if (lastKey == null) return null;
>   if (f.getReader().getComparator().compareFlatKey(splitKey.getBuffer(),
>       splitKey.getKeyOffset(), splitKey.getKeyLength(), lastKey, 0,
>       lastKey.length) > 0) {
>     return null;
>   }
> } else {
>   //check if smaller than first key
>   KeyValue splitKey = KeyValueUtil.createLastOnRow(splitRow);
>   byte[] firstKey = f.createReader().getFirstKey();
>   // If firstKey is null means storefile is empty.
>   if (firstKey == null) return null;
>   if (f.getReader().getComparator().compareFlatKey(splitKey.getBuffer(),
>       splitKey.getKeyOffset(), splitKey.getKeyLength(), firstKey, 0,
>       firstKey.length) < 0) {
>     return null;
>   }
> }
> {code}
> In some cases the split row has to be compared with only part of the row key
> (in a composite row key), mainly for secondary index tables. There we need to
> create reference files even when the split row lies outside the storefile
> range, so that the data can be rewritten into the child regions by a custom
> half store file reader that compares the relevant part of the row key with
> the split row.
> The check that compares the split row with the storefile range and returns
> early could be skipped when a special boolean attribute in the table
> descriptor is set to true. Alternatively, we could add coprocessor hooks in
> which we create the references ourselves and bypass the default check.
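
A rough Java sketch of what the description seems to be asking for, under loudly stated assumptions: the composite row key layout (a fixed-length prefix followed by the part that is compared with the split row), the skipRangeCheck flag, and every name below are hypothetical and come neither from the issue nor from HBase. It only illustrates why a whole-key range check can reject a split row that is still meaningful for part of the key.

```java
import java.util.Arrays;

// Hypothetical sketch; neither the layout nor the names come from HBase.
final class CompositeKeySplitSketch {

  /**
   * Assumed layout: [prefixLength bytes of prefix][part compared with the split row].
   * Returns true when the compared part sorts at or after the split row,
   * i.e. the row would belong to the top daughter.
   * Assumes 0 <= prefixLength <= compositeRow.length.
   */
  static boolean belongsToTop(byte[] compositeRow, int prefixLength, byte[] splitRow) {
    return Arrays.compareUnsigned(
        compositeRow, prefixLength, compositeRow.length,
        splitRow, 0, splitRow.length) >= 0;
  }

  /**
   * Stand-in for the proposed switch: when skipRangeCheck is set, a reference
   * is created even if the split row lies outside the store file's key range.
   */
  static boolean shouldCreateReference(boolean skipRangeCheck, boolean splitRowInFileRange) {
    return skipRangeCheck || splitRowInFileRange;
  }

  public static void main(String[] args) {
    byte[] splitRow = {'c'};
    byte[] rowAb = {'a', 'b'}; // prefix 'a', compared part 'b'
    byte[] rowAd = {'a', 'd'}; // prefix 'a', compared part 'd'
    // Both full keys sort before "c", so a whole-key range check would put the
    // split row outside the file, yet the compared parts fall on both sides.
    System.out.println(belongsToTop(rowAb, 1, splitRow));
    System.out.println(belongsToTop(rowAd, 1, splitRow));
    System.out.println(shouldCreateReference(true, false));
  }
}
```

With the flag off, such a file would get no reference in one of the daughters and the custom reader would never see those rows.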