[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096415#comment-14096415 ]

Alex Newman commented on HBASE-10017:
-------------------------------------

amazing!

> HRegionPartitioner, rows directed to last partition are wrongly mapped.
> -----------------------------------------------------------------------
>
>                 Key: HBASE-10017
>                 URL: https://issues.apache.org/jira/browse/HBASE-10017
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.94.6
>            Reporter: Roman Nikitchenko
>            Priority: Critical
>         Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch,
> TEST-org.apache.hadoop.hbase.mapreduce.IntegrationTestBulkLoad.xml.gz,
> TestHRegionServerBulkLoad-more-splits.txt,
> TestHRegionServerBulkLoad-more-splits.txt, patchSiteOutput.txt
>
>
> Inside HRegionPartitioner class there is getPartition() method which should
> map first numPartitions regions to appropriate partitions 1:1. But based on
> condition last region is hashed which could lead to last reducer not having
> any data. This is considered serious issue.
> I reproduced this only starting from 16 regions per table. Original defect
> was found in 0.94.6 but at least today's trunk and 0.91 branch head have the
> same HRegionPartitioner code in this part which means the same issue.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
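The mapping problem in the issue description can be modelled without any HBase classes. The sketch below is only an illustration: the class and method names are hypothetical, and both the boundary condition (`regionIndex >= numPartitions - 1`) and the hash fallback are assumptions about the shape of the affected code, not a copy of `HRegionPartitioner.getPartition()`. It shows how, under those assumptions, 16 regions and 16 reducers can leave the last reducer without data.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone model of the region-index-to-partition step; no HBase classes are used. */
final class PartitionIndexSketch {

    /** Fallback used when there are more regions than partitions (assumed form). */
    static int hashed(int regionIndex, int numPartitions) {
        return (Integer.toString(regionIndex).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Assumed shape of the reported behaviour: the last region already takes the hash branch. */
    static int reportedMapping(int regionIndex, int numPartitions) {
        if (regionIndex >= numPartitions - 1) { // assumption: boundary is off by one
            return hashed(regionIndex, numPartitions);
        }
        return regionIndex;
    }

    /** Mapping the description asks for: the first numPartitions regions map 1:1. */
    static int expectedMapping(int regionIndex, int numPartitions) {
        if (regionIndex >= numPartitions) {
            return hashed(regionIndex, numPartitions);
        }
        return regionIndex;
    }

    /** Lists the partitions that receive no region under the chosen mapping. */
    static List<Integer> emptyPartitions(int numRegions, int numPartitions, boolean reported) {
        boolean[] used = new boolean[numPartitions];
        for (int i = 0; i < numRegions; i++) {
            int p = reported ? reportedMapping(i, numPartitions) : expectedMapping(i, numPartitions);
            used[p] = true;
        }
        List<Integer> empty = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            if (!used[p]) {
                empty.add(p);
            }
        }
        return empty;
    }

    public static void main(String[] args) {
        System.out.println("reported: " + emptyPartitions(16, 16, true));
        System.out.println("expected: " + emptyPartitions(16, 16, false));
    }
}
```

Counting empty partitions rather than inspecting single keys keeps the check close to the symptom named in the report (a reducer with no data); it says nothing about whether the real code has exactly this condition.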
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096414#comment-14096414 ]

Hadoop QA commented on HBASE-10017:
-----------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12617251/TestHRegionServerBulkLoad-more-splits.txt
  against trunk revision .
  ATTACHMENT ID: 12617251

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100

    {color:green}+1 site{color}. The mvn site goal succeeds with this patch.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10416//console

This message is automatically generated.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096275#comment-14096275 ]

Alex Newman commented on HBASE-10017:
-------------------------------------

I assume this has been fixed?
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874103#comment-13874103 ]

Nick Dimiduk commented on HBASE-10017:
--------------------------------------

The wind has fallen from the sails on this issue. Can the reported data loss be confirmed and corrected (my attempts were unsuccessful)? If so, let's bump to blocker and get it fixed for 0.98.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840725#comment-13840725 ]

Hadoop QA commented on HBASE-10017:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12617251/TestHRegionServerBulkLoad-more-splits.txt
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 6 new or modified tests.

    {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.

    {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

    {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100

    {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8065//console

This message is automatically generated.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839786#comment-13839786 ]

Hadoop QA commented on HBASE-10017:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12617102/TestHRegionServerBulkLoad-more-splits.txt
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.

    {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.

    {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.

    {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100

    {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.

    {color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8059//console

This message is automatically generated.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839702#comment-13839702 ]

Enis Soztutar commented on HBASE-10017:
---------------------------------------

bq. although not sure whether it handles multiple splits to the same range or merges

Nick pointed out that we are actually splitting those files by re-writing those files. I thought that we were creating actual reference files.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839703#comment-13839703 ]

Nick Dimiduk commented on HBASE-10017:
--------------------------------------

Multiple splits are handled through retrying. Splits are made and the halves rewritten as independent HFiles with each pass, so this should be okay.

[~rn] I'm very concerned about the bulkload data loss issue, but I cannot reproduce it using our existing unit tests (TestHRegionServerBulkLoad). Are you able to demonstrate the loss in a test? As [~enis] said, TotalOrderPartitioner (TOP) should be used for generating HFiles. Bulkload itself isn't performed inside a mapreduce job, so I'm confused about how the HRegionPartitioner comes into play in this scenario.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839681#comment-13839681 ]

Enis Soztutar commented on HBASE-10017:
---------------------------------------

bq. I have reproduced data loss during bulk load. This happens under the same conditions as initial bug. 16 regions per table, I think it's not the only case. Again, partitioner wrongly maps last region data and resulting region HFile contains keys that shall not appear there.

This partitioner is not intended to be used by bulk load; that is already stated in the javadoc. TotalOrderPartitioner should be used instead. If there are changes to regions, LoadIncrementalFiles checks the boundaries (although I am not sure whether it handles multiple splits to the same range, or merges).

Other than that, the changes seem ok. However, I think we should get the region boundaries at the start and treat the range as immutable for the lifetime of the partitioner. Although the table regions might undergo changes, we can at least guarantee a consistent mapping for key ranges. We can do a table.getStartKeys() and then a binary search for the key range, taking the special region boundaries (empty start and stop rows) into account.
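The lookup suggested in the comment above (snapshot the start keys once, then binary-search them) could look roughly like the following. This is a minimal sketch, not the patch attached to this issue: the class name and helpers are hypothetical, no HBase classes are used, and the start keys are assumed to be sorted in unsigned byte order with the first entry being the empty start row, which is what a call such as table.getStartKeys() is expected to supply.

```java
import java.nio.charset.StandardCharsets;

/** Immutable snapshot of region start keys with a binary-search lookup (illustrative only). */
final class StartKeyLookupSketch {

    // Sorted start keys; index 0 is assumed to be the empty start row of the first region.
    private final byte[][] startKeys;

    StartKeyLookupSketch(byte[][] sortedStartKeys) {
        // Defensive copy so later changes by the caller cannot alter the mapping.
        this.startKeys = new byte[sortedStartKeys.length][];
        for (int i = 0; i < sortedStartKeys.length; i++) {
            this.startKeys[i] = sortedStartKeys[i].clone();
        }
    }

    /** Lexicographic comparison of unsigned bytes; a shorter prefix sorts first. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /**
     * Returns the index of the last start key that is <= row. The empty first start key
     * matches every row, and rows past the last start key fall into the last region,
     * whose stop row is empty (open-ended).
     */
    int regionIndex(byte[] row) {
        int lo = 0;
        int hi = startKeys.length - 1;
        int answer = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(startKeys[mid], row) <= 0) {
                answer = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return answer;
    }

    public static void main(String[] args) {
        byte[][] keys = {
            new byte[0],
            "g".getBytes(StandardCharsets.UTF_8),
            "p".getBytes(StandardCharsets.UTF_8)
        };
        StartKeyLookupSketch lookup = new StartKeyLookupSketch(keys);
        System.out.println(lookup.regionIndex("m".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Because the keys are copied once in the constructor, the same row always resolves to the same index for the lifetime of the object, even if the table splits in the meantime; whether that trade-off is acceptable for a given job is a separate question.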
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838108#comment-13838108 ]

Ted Yu commented on HBASE-10017:
--------------------------------

@Roman:
I understand the impact of this bug - I was waiting for Nick to take a look at your patch.
I don't think mvn site build issue should be tackled in this JIRA.

I would be boarding a flight in one hour.

+1 from me.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838101#comment-13838101 ]

Roman Nikitchenko commented on HBASE-10017:
-------------------------------------------

Guys, I have reproduced data loss during bulk load. This happens under the same conditions as the initial bug: 16 regions per table, though I think it's not the only case. Again, the partitioner wrongly maps the last region's data, and the resulting region HFile contains keys that should not appear there.

Ted, do you understand the impact of this bug?
1. Functional defect in an out-of-the-box component [major?]
2. Wrong distribution of load for a lot of MR cases starting from 30 reducers [another major]
=> In my opinion, [critical] in total.
3. DATA LOSS probability during bulk load in a stable release (0.94 at least)

I don't want to panic but... I'm starting to think this is a [blocker].

As far as I can see, 'mvn site' fails in your environment because of this, and I assume it's true:
---
[ERROR] Exit code: 1 - javadoc: error - java.lang.OutOfMemoryError: Please increase memory.
---
You can check HBASE-8758. No functional changes there, and again 'site' fails.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837964#comment-13837964 ]

Roman Nikitchenko commented on HBASE-10017:
-------------------------------------------

It looks like this partitioner defect can also cause some data loss in incremental HFile loading based on an MR job configured with HFileOutputFormat. This is because such jobs use this partitioner to partition records between HFiles, so records might end up in the wrong region. Too much impact from an issue that is so easy to fix.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837939#comment-13837939 ]

Hadoop QA commented on HBASE-10017:
-----------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12616816/patchSiteOutput.txt
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 10 new or modified tests.

    {color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8051//console

This message is automatically generated.
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837954#comment-13837954 ]

Ted Yu commented on HBASE-10017:
--------------------------------

Using your command, I got the following:
{code}
Constructing Javadoc information...
1 error
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] HBase ............................................. FAILURE [2:20.834s]
[INFO] HBase - Common .................................... SKIPPED
[INFO] HBase - Protocol .................................. SKIPPED
[INFO] HBase - Client .................................... SKIPPED
[INFO] HBase - Hadoop Compatibility ...................... SKIPPED
[INFO] HBase - Hadoop One Compatibility .................. SKIPPED
[INFO] HBase - Prefix Tree ............................... SKIPPED
[INFO] HBase - Server .................................... SKIPPED
[INFO] HBase - Testing Util .............................. SKIPPED
[INFO] HBase - Thrift .................................... SKIPPED
[INFO] HBase - Shell ..................................... SKIPPED
[INFO] HBase - Integration Tests ......................... SKIPPED
[INFO] HBase - Examples .................................. SKIPPED
[INFO] HBase - Assembly .................................. SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 2:21.450s
[INFO] Finished at: Tue Dec 03 09:55:00 PST 2013
[INFO] Final Memory: 50M/791M
[INFO]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-site-plugin:3.3:site (default-site) on project hbase: Error during page generation: Error rendering Maven report:
[ERROR] Exit code: 1 - javadoc: error - java.lang.OutOfMemoryError: Please increase memory.
[ERROR] For example, on the Sun Classic or HotSpot VMs, add the option -J-Xmx
[ERROR] such as -J-Xmx32m.
[ERROR]
[ERROR] Command line was: /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/bin/javadoc @options @packages
[ERROR]
[ERROR] Refer to the generated Javadoc files in '/Users/tyu/trunk/target/site/devapidocs' dir.
[ERROR] -> [Help 1]
{code}
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832819#comment-13832819 ] Ted Yu commented on HBASE-10017:
Here is the command Hadoop QA used for site generation:
{code}
$MVN compile site -DskipTests -D${PROJECT_NAME}PatchProcess > $PATCH_DIR/patchSiteOutput.txt 2>&1
{code}
Did it work for you? What OS do you use? On Mac, I got:
{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-site-plugin:3.3:site (default-site) on project hbase: Error during page generation: Error rendering Maven report:
[ERROR] Exit code: 1 - javadoc: error - java.lang.OutOfMemoryError: Please increase memory.
[ERROR] For example, on the Sun Classic or HotSpot VMs, add the option -J-Xmx
[ERROR] such as -J-Xmx32m.
{code}
> HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832794#comment-13832794 ] Roman Nikitchenko commented on HBASE-10017: --- Definitely: HBASE-8758 contains NO code change, only comments, and it still failed core tests last time (yesterday). As for this log, it is the same patch, yet javadoc has failed, and the site behavior is somewhat strange, since it passes 100% on my local host. @Ted, doesn't it look to you like the environment is pretty unstable? BTW, I can allocate some time to assist (not much), but I would probably need initial guidance (HBase is going to be one of the most important tools for our company in the near future). > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832769#comment-13832769 ] Ted Yu commented on HBASE-10017: The findbugs and javadoc warnings may have come from HBASE-7544 > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832765#comment-13832765 ] Hadoop QA commented on HBASE-10017: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615847/HBASE-10017-r1544633.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7997//console This message is automatically generated. > HRegionPartitioner, rows directed to last partition are wrongly mapped. 
> --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832665#comment-13832665 ] Roman Nikitchenko commented on HBASE-10017: --- If it is enough to attach, it's done. > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832642#comment-13832642 ] Ted Yu commented on HBASE-10017: bq. -1 core tests. The patch failed these unit tests: I was more interested in finding out which test(s) possibly hung. You can attach your patch again to trigger another QA run. > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832628#comment-13832628 ] Roman Nikitchenko commented on HBASE-10017: --- BTW, the HBASE-8758 fix does not modify anything but a comment, yet core tests fail on it too according to the report. Something is pretty wrong with this facility. > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832626#comment-13832626 ] Roman Nikitchenko commented on HBASE-10017: --- @Ted, @Nick I had both local 'mvn test' and 'mvn site' working with this fix. Here I found only warnings, and as far as I remember there were no new ones in hbase-server. This patch modifies only hbase-server, as you can see. If I had known that test results are stored for such a short time, I would have saved them, but AFAIR there were no new warnings. Again, is it possible to submit a patch to some kind of automatic testing farm? > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832023#comment-13832023 ] Ted Yu commented on HBASE-10017: @Roman: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//testReport/ is no longer accessible. Did you have a chance to see what caused the test failure ? > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830425#comment-13830425 ] Nick Dimiduk commented on HBASE-10017: -- Thanks for digging further into this. I'll have a deeper look when I get back from holiday. > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830283#comment-13830283 ] Hadoop QA commented on HBASE-10017: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615368/HBASE-10017-r1544633.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7971//console This message is automatically generated. > HRegionPartitioner, rows directed to last partition are wrongly mapped. 
> --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko >Priority: Critical > Attachments: HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830224#comment-13830224 ] Roman Nikitchenko commented on HBASE-10017: ---
One more note about the proposed fix. An even worse problem noted here is unequal distribution among the partitions. Let us start with something like 30 reducers or more. In this case the distribution will be unequal. Here is illustrative code:
public class Main {
  public static void main(String[] args) throws Exception {
    int numPartitions = 32;
    int numRegions = 100;
    // parts[p] counts how many region indexes are hashed to partition p.
    int[] parts = new int[numPartitions];
    for (int i = 0; i < numRegions; ++i) {
      // Hash the decimal string of the region index, clear the sign bit, reduce modulo the partition count.
      int part = (Integer.toString(i).hashCode() & Integer.MAX_VALUE) % numPartitions;
      parts[part]++;
    }
    // Print the histogram, one partition per line.
    for (int i = 0; i < numPartitions; ++i) {
      System.out.println(parts[i] + " ");
    }
  }
}
When run, it produces a histogram with up to 5 times difference in load per reducer, which is COMPLETELY unacceptable.
> HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko > Attachments: HBASE-10017-r1544633.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
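The histogram from the illustrative code in the comment above is easier to compare if it is reduced to a few numbers. The following is a minimal sketch, not part of any attached patch: the class name PartitionSkew and the method histogram are hypothetical, and only the hash expression is taken from the comment. It reports the largest and smallest per-partition counts and how many partitions receive no region at all.

```java
import java.util.Arrays;

// Hypothetical helper, not HBase code: summarizes the skew of the hash-based mapping.
class PartitionSkew {
  /** Counts how many region indexes land in each partition. */
  static int[] histogram(int numRegions, int numPartitions) {
    int[] parts = new int[numPartitions];
    for (int i = 0; i < numRegions; ++i) {
      // Same expression as in the comment: hash of the decimal region index.
      int part = (Integer.toString(i).hashCode() & Integer.MAX_VALUE) % numPartitions;
      parts[part]++;
    }
    return parts;
  }

  public static void main(String[] args) {
    int[] parts = histogram(100, 32);
    int max = Arrays.stream(parts).max().getAsInt();
    int min = Arrays.stream(parts).min().getAsInt();
    long empty = Arrays.stream(parts).filter(c -> c == 0).count();
    System.out.println("max=" + max + " min=" + min + " empty=" + empty);
  }
}
```

Changing the two arguments of histogram makes it possible to check other region and reducer counts without reading the full histogram.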
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829014#comment-13829014 ] Hadoop QA commented on HBASE-10017: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615133/HBASE-10017-v0.94.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7962//console This message is automatically generated. > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko > Fix For: 0.94.6 > > Attachments: HBASE-10017-v0.94.6.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829202#comment-13829202 ] Nick Dimiduk commented on HBASE-10017: -- Thanks for the patch Roman. Can you also include a test for the issue, something that fails on trunk but passes with your patch? Thanks. > HRegionPartitioner, rows directed to last partition are wrongly mapped. > --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko > Attachments: HBASE-10017-r1544212.patch, HBASE-10017-v0.94.6.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
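A test of the kind requested above could start from a small simulation of the mapping. The following is only a sketch under stated assumptions: the class LastRegionMapping, the method partitionFor, and the bound numPartitions - 1 are assumptions made to match the issue description ("last region is hashed"), not code copied from HRegionPartitioner or from the attached patch. The hash expression is the one used elsewhere in this thread.

```java
// Hypothetical stand-in for the mapping described in the issue; not HBase code.
class LastRegionMapping {
  /**
   * Maps a region index to a partition.
   * useAssumedFix = false models the behavior described in the issue (last region hashed);
   * useAssumedFix = true models a 1:1 mapping for the first numPartitions regions.
   */
  static int partitionFor(int regionIndex, int numPartitions, boolean useAssumedFix) {
    // Assumption: the defect is a bound of numPartitions - 1 instead of numPartitions.
    int bound = useAssumedFix ? numPartitions : numPartitions - 1;
    if (regionIndex >= bound) {
      // Fallback for tables with more regions than reducers: hash the index.
      return (Integer.toString(regionIndex).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
    return regionIndex;
  }

  public static void main(String[] args) {
    int n = 16; // 16 regions and 16 reducers, the case reported in the issue
    System.out.println("assumed current: region 15 -> " + partitionFor(n - 1, n, false));
    System.out.println("assumed fixed:   region 15 -> " + partitionFor(n - 1, n, true));
  }
}
```

Under these assumptions the last of 16 regions is sent to partition 4 rather than 15, so the last reducer gets no data. With 8 regions the hash of "7" happens to give 7, so the same condition goes unnoticed there.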
[jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
[ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829096#comment-13829096 ] Hadoop QA commented on HBASE-10017: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615135/HBASE-10017-r1544212.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7963//console This message is automatically generated. > HRegionPartitioner, rows directed to last partition are wrongly mapped. 
> --- > > Key: HBASE-10017 > URL: https://issues.apache.org/jira/browse/HBASE-10017 > Project: HBase > Issue Type: Bug > Components: mapreduce >Affects Versions: 0.94.6 >Reporter: Roman Nikitchenko > Fix For: 0.94.6 > > Attachments: HBASE-10017-r1544212.patch, HBASE-10017-v0.94.6.patch > > > Inside HRegionPartitioner class there is getPartition() method which should > map first numPartitions regions to appropriate partitions 1:1. But based on > condition last region is hashed which could lead to last reducer not having > any data. This is considered serious issue. > I reproduced this only starting from 16 regions per table. Original defect > was found in 0.94.6 but at least today's trunk and 0.91 branch head have the > same HRegionPartitioner code in this part which means the same issue. -- This message was sent by Atlassian JIRA (v6.1#6144)