[
https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838101#comment-13838101
]
Roman Nikitchenko commented on HBASE-10017:
-------------------------------------------
Guys,
I have reproduced data loss during bulk load. It happens under the same
conditions as the initial bug: 16 regions per table, though I don't think that
is the only case. Again, the partitioner wrongly maps the data for the last
region, and the resulting region HFile contains keys that should not appear
there.
Ted, do you understand the impact of this bug?
1. Functional defect in an out-of-the-box component [major?]
2. Wrong load distribution for a lot of MR cases starting from 30 reducers
[another major] => In my opinion [critical] in total.
3. DATA LOSS probability during bulk load in a stable release (0.94 at least).
I don't want to panic but... I am starting to think this is a [blocker].
As far as I can see, 'mvn site' fails in your environment because of the error
below, and I assume that is accurate:
---
[ERROR] Exit code: 1 - javadoc: error - java.lang.OutOfMemoryError: Please
increase memory.
---
You can check HBASE-8758. There was no functional check at all and, again,
'site' fails.
> HRegionPartitioner, rows directed to last partition are wrongly mapped.
> -----------------------------------------------------------------------
>
> Key: HBASE-10017
> URL: https://issues.apache.org/jira/browse/HBASE-10017
> Project: HBase
> Issue Type: Bug
> Components: mapreduce
> Affects Versions: 0.94.6
> Reporter: Roman Nikitchenko
> Priority: Critical
> Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch,
> patchSiteOutput.txt
>
>
> Inside the HRegionPartitioner class there is a getPartition() method which
> should map the first numPartitions regions to the appropriate partitions 1:1.
> But because of its condition, the last region is hashed, which can lead to
> the last reducer not having any data. This is considered a serious issue.
> I reproduced this only starting from 16 regions per table. The original
> defect was found in 0.94.6, but at least today's trunk and the 0.91 branch
> head have the same HRegionPartitioner code in this part, which means the same
> issue.
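
The mapping described above can be illustrated with a small standalone sketch.
It is a simplified model, not the HBase source: regions are represented only by
their index, and the class name PartitionSketch, the method names, and the exact
boundary condition in suspected() are assumptions made for illustration. It
shows how a boundary check that is off by one would send the last region
through the hash fallback instead of mapping it 1:1.

```java
// Simplified model of the mapping described in this issue; NOT the HBase source.
// All names and the exact conditions are illustrative assumptions.
final class PartitionSketch {

    // Fallback for regions beyond the reducer count: hash the index into range.
    static int hashed(int regionIndex, int numPartitions) {
        return (Integer.toString(regionIndex).hashCode() & Integer.MAX_VALUE)
                % numPartitions;
    }

    // Assumed faulty boundary: the last partition index also goes to the fallback.
    static int suspected(int regionIndex, int numPartitions) {
        if (regionIndex >= numPartitions - 1) {
            return hashed(regionIndex, numPartitions);
        }
        return regionIndex;
    }

    // Intended behaviour per the description: first numPartitions regions map 1:1.
    static int intended(int regionIndex, int numPartitions) {
        if (regionIndex >= numPartitions) {
            return hashed(regionIndex, numPartitions);
        }
        return regionIndex;
    }

    public static void main(String[] args) {
        int n = 16; // 16 regions and 16 reducers, as in the report
        System.out.println("suspected: region 15 -> partition " + suspected(n - 1, n));
        System.out.println("intended:  region 15 -> partition " + intended(n - 1, n));
    }
}
```

Under these assumptions, with 16 regions and 16 reducers the string "15" hashes
to 1572, and 1572 % 16 = 4, so the suspected variant sends the last region's
rows to partition 4 and leaves partition 15 empty, while the intended variant
returns 15.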