[
https://issues.apache.org/jira/browse/HBASE-12716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258740#comment-14258740
]
Weichen Ye commented on HBASE-12716:
------------------------------------
Hi, [[email protected]]
The bug is in the function iterateOnSplits() in
org.apache.hadoop.hbase.util.Bytes, in hbase-common. The original code is:
if (diffBI.compareTo(splitsBI) < 0) {
  return null;
}
In this code, "diffBI" is the difference between the start key and the end key, and
"splitsBI" is the number of pieces in the result. For example, if we want to
split the region ["aaa","aab"] into 2 pieces, "diffBI" for "aaa" and "aab"
is 1 and "splitsBI" is 2. Because diffBI < splitsBI, the function returns
null. This is the reason for the NullPointerException.
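The arithmetic above can be checked with a small standalone sketch. It does not use the HBase classes: the class SplitGuardDemo and the helpers keyDiff() and guardReturnsNull() are hypothetical names for illustration only, and the sketch assumes equal-length keys read as unsigned big-endian integers, which may differ in detail from what iterateOnSplits() does internally.

```java
import java.math.BigInteger;

class SplitGuardDemo {
    // Assumption: keys have equal length and are read as unsigned big-endian integers.
    static BigInteger keyDiff(byte[] start, byte[] end) {
        return new BigInteger(1, end).subtract(new BigInteger(1, start));
    }

    // Same comparison as the guard quoted above: true when the gap between
    // the keys is smaller than the requested number of pieces.
    static boolean guardReturnsNull(byte[] start, byte[] end, int pieces) {
        BigInteger diffBI = keyDiff(start, end);
        BigInteger splitsBI = BigInteger.valueOf(pieces);
        return diffBI.compareTo(splitsBI) < 0;
    }

    public static void main(String[] args) {
        byte[] a1 = { 'a', 'a', 'a' };
        byte[] a2 = { 'a', 'a', 'b' };
        System.out.println("diffBI = " + keyDiff(a1, a2));
        System.out.println("guard returns null = " + guardReturnsNull(a1, a2, 2));
    }
}
```

For "aaa" and "aab" the difference is 1, so with 2 pieces the comparison is true and the guard takes the "return null" branch.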
I have just uploaded a new patch to fix this bug. It uses an additional byte to find the
split point between the start key and the end key.
Would you please take a look at this patch?
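To illustrate the idea of using an additional byte, here is a minimal sketch. It is not the attached patch: the class ExtraByteSplitDemo and its methods are hypothetical names, it assumes equal-length keys, and it simply appends one zero byte to both keys before taking the midpoint, so the exact extra byte it produces may differ from what the patch computes.

```java
import java.math.BigInteger;
import java.util.Arrays;

class ExtraByteSplitDemo {
    // Appends one zero byte to each key, then returns the midpoint of the
    // widened keys, interpreted as unsigned big-endian integers.
    static byte[] midpointWithExtraByte(byte[] start, byte[] end) {
        if (start.length != end.length) {
            throw new IllegalArgumentException("this sketch assumes equal-length keys");
        }
        int width = start.length + 1;
        BigInteger startBI = new BigInteger(1, Arrays.copyOf(start, width));
        BigInteger endBI = new BigInteger(1, Arrays.copyOf(end, width));
        if (startBI.compareTo(endBI) >= 0) {
            throw new IllegalArgumentException("start must sort before end");
        }
        BigInteger mid = startBI.add(endBI.subtract(startBI).shiftRight(1));
        return toFixedWidth(mid, width);
    }

    // Converts a non-negative value to exactly 'width' bytes, dropping the
    // sign byte BigInteger may prepend and left-padding with zeros if needed.
    static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    public static void main(String[] args) {
        byte[] mid = midpointWithExtraByte(new byte[] { 'a', 'a', 'a' },
                                           new byte[] { 'a', 'a', 'b' });
        StringBuilder hex = new StringBuilder();
        for (byte b : mid) {
            hex.append(String.format("%02x ", b & 0xff));
        }
        System.out.println(hex.toString().trim());
    }
}
```

With the widened keys the difference between "aaa" and "aab" becomes 256 instead of 1, so a midpoint exists. In this sketch the result for that pair is the four bytes 61 61 61 80, which sorts strictly between the two original keys.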
> A bug in RegionSplitter.UniformSplit algorithm
> ----------------------------------------------
>
> Key: HBASE-12716
> URL: https://issues.apache.org/jira/browse/HBASE-12716
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.98.6
> Reporter: Weichen Ye
> Assignee: Weichen Ye
> Attachments: HBASE-12716.patch
>
>
> I'm working on another issue, HBASE-12590, and trying to use the UniformSplit
> algorithm in RegionSplitter. When the last bytes of the start key and end key are
> adjacent in alphabetical or ASCII order, the UniformSplit algorithm
> hits an NPE. Key pairs that trigger it include:
> startkey: aaa, endkey: aab
> startkey: 1111, endkey: 1112
> The following simple test code reproduces the problem:
> {code}
> import org.apache.hadoop.hbase.util.RegionSplitter.UniformSplit;
> ......
> byte[] a1 = { 'a', 'a', 'a' };
> byte[] a2 = { 'a', 'a', 'b' };
> UniformSplit us = new UniformSplit();
> byte[] mid = us.split(a1, a2);
> ......
> {code}
> This produces the following error:
> {code}
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.hadoop.hbase.util.RegionSplitter$UniformSplit.split(RegionSplitter.java:986)
> {code}
> The algorithm should be able to calculate the split point by using an
> additional byte. For example:
> "aaa" and "aab", split point = "aaaP"
> "1111" and "1112", split point = "1111P"
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)