Hi all,
I'm using HBase 0.94.2 (and Hadoop 1.0.4).
I've been using bulk load on a daily basis for over a year with no problems.
I recently moved to an OSGi client, and that required some changes.
One of the changes I made is a fix for what seems like a bug, which I
described in https://issues.apache.org/jira/browse/HBASE-9682
While running some tests I executed a bulk load (with pre-splitting) a few
times, and on one occasion it seems the bulk load didn't identify the
pre-split regions and loaded the HFiles into 2 new regions (instead of the
19 pre-split ones). What's even worse is that it messed up the
lexicographical order of the start/end keys of those regions.
For example, if the pre-split regions' start/end keys were:
Start   End
        1
1       2
2       3
3
they turned into:
Start End
new1
1 2
new1
2 3
3
As a result, even scanning over those regions is impossible.
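To make the broken invariant concrete, here is a minimal sketch in plain Java (no HBase dependency) of the check I'd expect a table's region list to pass: the first start key is empty, each end key equals the next region's start key, and the last end key is empty. The class and method names are hypothetical, keys are modeled as Strings and compared with String.compareTo as a stand-in for HBase's byte-wise key comparison, and the second list is only one possible reading of the layout above.

```java
import java.util.Arrays;
import java.util.List;

public class RegionChainCheck {

    /**
     * Each region is a {startKey, endKey} pair; "" stands for an open boundary.
     * Returns the index of the first region that breaks the chain, or -1 if
     * the regions form one contiguous, ordered chain over the whole key space.
     */
    public static int firstBreak(List<String[]> regions) {
        if (regions.isEmpty()) {
            return -1;
        }
        // The first region must start at the open (empty) key.
        if (!regions.get(0)[0].isEmpty()) {
            return 0;
        }
        for (int i = 0; i < regions.size(); i++) {
            String start = regions.get(i)[0];
            String end = regions.get(i)[1];
            boolean last = (i == regions.size() - 1);
            if (last) {
                // The last region must end at the open (empty) key.
                if (!end.isEmpty()) {
                    return i;
                }
            } else {
                // A non-last region needs a real end key greater than its start...
                if (end.isEmpty() || start.compareTo(end) >= 0) {
                    return i;
                }
                // ...and that end key must be the next region's start key.
                if (!end.equals(regions.get(i + 1)[0])) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String[]> preSplit = Arrays.asList(
                new String[] {"", "1"}, new String[] {"1", "2"},
                new String[] {"2", "3"}, new String[] {"3", ""});
        // Assumed reading of the listing after the faulty bulk load.
        List<String[]> afterLoad = Arrays.asList(
                new String[] {"", "new1"}, new String[] {"1", "2"},
                new String[] {"new1", ""}, new String[] {"2", "3"},
                new String[] {"3", ""});
        System.out.println(firstBreak(preSplit));  // -1
        System.out.println(firstBreak(afterLoad)); // 0
    }
}
```

Under that reading the chain already breaks at the first region, which would explain why a scan cannot walk from one region to the next.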
I'm having a hard time reproducing this behavior, so I'm not sure whether
the fix I made (also described in the Jira comments) is the cause.
Any ideas?
Thanks,
Amit