[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-31 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125824#comment-15125824
 ] 

Anoop Sam John commented on HBASE-15097:


IMHO we should not add this kind of extra check; we should fix the actual 
issue instead. (As you hinted, it might be in the read path for split regions.)


> When the scan operation covered two regions,sometimes the final results have 
> duplicated rows.
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-15097
>                 URL: https://issues.apache.org/jira/browse/HBASE-15097
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 1.1.2
>         Environment: centos 6.5
>                      hbase 1.1.2
>            Reporter: chenrongwei
>            Assignee: chenrongwei
>         Attachments: HBASE-15097-v001.patch, HBASE-15097-v002.patch, 
>                      HBASE-15097-v003.patch, HBASE-15097-v004.patch, 
>                      output.log, rowkey.txt, 
>                      snapshot2016-01-13 pm 8.42.37.png
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When a scan's start key and end key span two regions, the first region 
> returns rows that lie beyond its end key. This leads to duplicated rows in 
> the results.
> To avoid this problem, we should add a check before setting the variable 
> "stopRow" in the HRegion class, as follows:
> if (Bytes.equals(scan.getStopRow(), HConstants.EMPTY_END_ROW) && 
>     !scan.isGetScan()) {
>   this.stopRow = null;
> } else {
>   if (Bytes.compareTo(scan.getStopRow(), 
>       this.getRegionInfo().getEndKey()) >= 0) {
>     this.stopRow = this.getRegionInfo().getEndKey();
>   } else {
>     this.stopRow = scan.getStopRow();
>   }
> }
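
The clamping idea in the quoted description can be illustrated with a small standalone sketch. This is not the HBase implementation or the attached patch: the class and method names are hypothetical, it uses only the JDK (Java 9+ for Arrays.compareUnsigned, which compares bytes as unsigned values in lexicographic order), and it assumes that an empty byte array means "no bound", as HConstants.EMPTY_END_ROW does. Unlike the quoted snippet, it treats an empty key as unbounded on both sides, so a scan without a stop row is still limited by the region end key, and it ignores the isGetScan() case.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not part of HBase.
class StopRowClamp {

    // Returns the tighter of the scan's stop row and the region's end key.
    // Assumption: an empty array means "unbounded" for either argument.
    static byte[] effectiveStopRow(byte[] scanStopRow, byte[] regionEndKey) {
        if (regionEndKey.length == 0) {
            return scanStopRow;   // last region: only the scan can bound it
        }
        if (scanStopRow.length == 0) {
            return regionEndKey;  // open-ended scan: stop at the region end
        }
        // Unsigned lexicographic comparison of the two row keys.
        return Arrays.compareUnsigned(scanStopRow, regionEndKey) >= 0
                ? regionEndKey
                : scanStopRow;
    }

    public static void main(String[] args) {
        byte[] endKey = "m".getBytes(StandardCharsets.UTF_8);
        byte[] beyond = "z".getBytes(StandardCharsets.UTF_8);
        byte[] inside = "c".getBytes(StandardCharsets.UTF_8);
        // A stop row past the region end is clamped to the region end key.
        System.out.println(new String(effectiveStopRow(beyond, endKey), StandardCharsets.UTF_8));
        // A stop row inside the region is used unchanged.
        System.out.println(new String(effectiveStopRow(inside, endKey), StandardCharsets.UTF_8));
    }
}
```

A clamp like this only limits what one region scanner returns; it does not explain why rows outside the region boundary are present in the first place.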



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-29 Thread chenrongwei (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123183#comment-15123183
 ] 

chenrongwei commented on HBASE-15097:
-

Hi, any update on this issue? @Anoop Sam John @Ted Yu


[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-23 Thread chenrongwei (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113755#comment-15113755
 ] 

chenrongwei commented on HBASE-15097:
-

I don't think that can happen. When no stop row has been set, there are two cases:
1: the table has only one region, (null,null)
2: the table has more than one region, such as 
(null,region_1_endKey)...[region_n-1_startKey, region_n_startKey), 
[region_n_startKey,null).
If the table has only one region, the problem obviously cannot occur, because 
all data is in the same region, so we only need to look at the second case.
In the second case, without the patch, a region may still hold old data that 
belonged to it before it was split, so the scan may return duplicate rows. But 
I think this mistake, where the region scan returns old data, can only happen 
in regions other than the last one. No row key can lie beyond the last 
region's end key (null), so the last region always has the newest data. For 
that reason we only need to make sure the other regions do not make this 
mistake, and the scan will then avoid returning old data. That is exactly what 
this patch does.
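
The argument above rests on regions being half-open key ranges, with an empty (null) end key on the last region. A minimal sketch of that containment rule follows. It assumes unsigned lexicographic key order and uses a hypothetical RegionRange class that is not an HBase type; the keys "m" and "q" are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical model of a region's key range [startKey, endKey); not an HBase class.
class RegionRange {
    final byte[] startKey; // inclusive; empty means "start of table"
    final byte[] endKey;   // exclusive; empty means "end of table"

    RegionRange(String startKey, String endKey) {
        this.startKey = startKey.getBytes(StandardCharsets.UTF_8);
        this.endKey = endKey.getBytes(StandardCharsets.UTF_8);
    }

    // True if the row key falls inside this region's half-open range.
    boolean contains(byte[] row) {
        boolean atOrAfterStart =
                startKey.length == 0 || Arrays.compareUnsigned(row, startKey) >= 0;
        boolean beforeEnd =
                endKey.length == 0 || Arrays.compareUnsigned(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    public static void main(String[] args) {
        RegionRange first = new RegionRange("", "m"); // (null, "m")
        RegionRange last = new RegionRange("m", "");  // ["m", null)
        byte[] row = "q".getBytes(StandardCharsets.UTF_8);
        // "q" is beyond the first region's end key, so it does not belong there.
        System.out.println(first.contains(row)); // false
        // The last region has no end key, so no row can lie beyond it.
        System.out.println(last.contains(row));  // true
    }
}
```

If the first region nevertheless returns row "q" from leftover pre-split data, the client sees it twice: once from the first region and once from the last.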


[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-22 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113578#comment-15113578
 ] 

Anoop Sam John commented on HBASE-15097:


bq. I said no; if we set the correct stop row, then it will not happen, because 
the error will not happen at the last region.
I mean the case where the user-specified Scan does not have any stop row. Then, 
even with the patch, we won't be setting the stop row. In that case, would it 
be possible to get a row outside its boundary from this region?


[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-20 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108220#comment-15108220
 ] 

Hadoop QA commented on HBASE-15097:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 3m 42s | master passed |
| +1 | compile | 1m 17s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 46s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 1m 21s | master passed |
| +1 | mvneclipse | 0m 22s | master passed |
| -1 | findbugs | 2m 50s | hbase-server in master has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 59s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 47s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 1m 0s | the patch passed |
| +1 | compile | 1m 21s | the patch passed with JDK v1.8.0 |
| +1 | javac | 1m 21s | the patch passed |
| +1 | compile | 0m 51s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 51s | the patch passed |
| +1 | checkstyle | 1m 22s | the patch passed |
| +1 | mvneclipse | 0m 24s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 30m 40s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 3m 0s | the patch passed |
| +1 | javadoc | 1m 8s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 55s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 165m 18s | hbase-server in the patch failed with JDK v1.8.0. |
| -1 | unit | 107m 19s | hbase-server in the patch failed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 27s | Patch does not generate ASF License warnings. |
|   |   | 326m 26s |   |

|| Reason || Tests ||
| JDK v1.8.0 Failed junit tests | hadoop.hbase.regionserver.TestHRegion |
|   | hadoop.hbase.util.TestHBaseFsckOneRS |
|   | hadoop.hbase.master.balancer.TestStochasticLoadBalancer |
| JDK v1.8.0 Timed out junit tests | org.apache.hadoop.hbase.snapshot.TestMobSecureExportSnapshot |
| JDK v1.7.0_79 Failed junit tests | hadoop.hbase.regionserver.TestHRegion |
|   | hadoop.hbase.master.balancer.TestStochasticLoadBalancer |
| JDK v1.7.0_79 Timed out junit tests | org.apache.hadoop.hbase.security.access.TestAccessController3 |
|   | org.apache.hadoop.hbase.TestZooKeeper |
|   | org.apache.hadoop.hb

[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-18 Thread chenrongwei (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106315#comment-15106315
 ] 

chenrongwei commented on HBASE-15097:
-

I think HBASE-15125 can cause data to fall outside its region boundary, but it 
is obviously not the only way this can happen.


[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-18 Thread chenrongwei (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106309#comment-15106309
 ] 

chenrongwei commented on HBASE-15097:
-

Yes, you're right. We should ultimately find and fix the root cause, and that 
is actually more important to us. But I think it is still worthwhile to avoid 
the bug by setting the correct stop row, and that is easier to do. For example, 
if we don't do this, bugs like HBASE-15125 may be harder to find. 
Additionally, regarding "If the scan is not specifying any stopRow and because 
of the said bug, an out of boundary row can come out right?": I would say no. 
If we set the correct stop row, it will not happen, because the error will not 
happen at the last region.





[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-18 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106284#comment-15106284
 ] 

Anoop Sam John commented on HBASE-15097:


Actually, whatever stopRow value is set, the lower layers of the scan should 
not return a row outside the region's boundary. IMHO, we should investigate 
how that got broken and fix that issue. If the scan does not specify any 
stopRow, an out-of-boundary row can still come out because of the said bug, 
right? We should fix the root cause.


[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105496#comment-15105496
 ] 

Ted Yu commented on HBASE-15097:


The last test run was done on https://builds.apache.org/computer/H2 where the 
local Maven repo may contain outdated artifacts.
This led to "hadoopcheck Patch causes 11 errors with Hadoop v2.4.0.", which has 
been seen in other QA runs before.



[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-18 Thread chenrongwei (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105316#comment-15105316
 ] 

chenrongwei commented on HBASE-15097:
-

Thanks for the reminder. I think there are two other bugs that were hidden by 
this bug. I have also filed them as HBASE-15125 and HBASE-15126. I think we 
have to solve those two bugs first, before this one. 
Additionally, I don't understand what "hadoopcheck Patch causes 11 errors 
with Hadoop v2.4.0." means.



[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105258#comment-15105258
 ] 

Ted Yu commented on HBASE-15097:


Can you check whether the hbck test failure was related to the patch?

Please modify the subject of the JIRA to closely reflect what the patch does.

Thanks


[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105234#comment-15105234
 ] 

Hadoop QA commented on HBASE-15097:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 2m 43s | master passed |
| +1 | compile | 0m 42s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 40s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 5m 3s | master passed |
| +1 | mvneclipse | 0m 18s | master passed |
| -1 | findbugs | 2m 9s | hbase-server in master has 82 extant Findbugs warnings. |
| +1 | javadoc | 0m 34s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 33s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 47s | the patch passed |
| +1 | compile | 0m 48s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 48s | the patch passed |
| +1 | compile | 0m 36s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 36s | the patch passed |
| +1 | checkstyle | 4m 43s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | hadoopcheck | 1m 34s | Patch causes 11 errors with Hadoop v2.4.0. |
| -1 | hadoopcheck | 3m 3s | Patch causes 11 errors with Hadoop v2.4.1. |
| -1 | hadoopcheck | 4m 38s | Patch causes 11 errors with Hadoop v2.5.0. |
| -1 | hadoopcheck | 6m 12s | Patch causes 11 errors with Hadoop v2.5.1. |
| -1 | hadoopcheck | 7m 42s | Patch causes 11 errors with Hadoop v2.5.2. |
| -1 | hadoopcheck | 9m 15s | Patch causes 11 errors with Hadoop v2.6.1. |
| -1 | hadoopcheck | 10m 50s | Patch causes 11 errors with Hadoop v2.6.2. |
| -1 | hadoopcheck | 12m 36s | Patch causes 11 errors with Hadoop v2.6.3. |
| -1 | hadoopcheck | 14m 20s | Patch causes 11 errors with Hadoop v2.7.1. |
| +1 | findbugs | 2m 20s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 34s | the patc

[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-15 Thread chenrongwei (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102039#comment-15102039
 ] 

chenrongwei commented on HBASE-15097:
-

Sorry, I didn't consider the reverse scan case. I will fix that problem and 
make sure it works.


[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-15 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101907#comment-15101907
 ] 

Ted Yu commented on HBASE-15097:


Is the fix correct?

There were test failures in TestFromClientSideXX.


[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101715#comment-15101715
 ] 

Hadoop QA commented on HBASE-15097:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 3m 24s | master passed |
| +1 | compile | 1m 4s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 44s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 5m 16s | master passed |
| +1 | mvneclipse | 0m 22s | master passed |
| -1 | findbugs | 2m 55s | hbase-server in master has 83 extant Findbugs warnings. |
| +1 | javadoc | 0m 55s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 48s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 1m 7s | the patch passed |
| +1 | compile | 0m 57s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 57s | the patch passed |
| +1 | compile | 0m 40s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 40s | the patch passed |
| -1 | checkstyle | 5m 19s | Patch generated 2 new checkstyle issues in hbase-server (total was 294, now 296). |
| +1 | mvneclipse | 0m 19s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 26m 51s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 2m 5s | the patch passed |
| +1 | javadoc | 0m 27s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 33s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 97m 26s | hbase-server in the patch failed with JDK v1.8.0. |
| -1 | unit | 92m 7s | hbase-server in the patch failed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 13s | Patch does not generate ASF License warnings. |
|   |   | 243m 59s |   |

|| Reason || Tests ||
| JDK v1.8.0 Failed junit tests | hadoop.hbase.regionserver.TestHRegion |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.util.TestHBaseFsckOneRS |
|   | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
| JDK v1.7.0_79 Failed junit tests | hadoop.hbase.regionserver.TestHRegion |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.util.TestHBaseFsckOneRS |
|   | hadoop.hbase.clie

[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101706#comment-15101706
 ] 

Hadoop QA commented on HBASE-15097:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 2m 19s | master passed |
| +1 | compile | 0m 41s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 32s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 4m 25s | master passed |
| +1 | mvneclipse | 0m 14s | master passed |
| -1 | findbugs | 1m 52s | hbase-server in master has 83 extant Findbugs warnings. |
| +1 | javadoc | 0m 29s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 31s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 42s | the patch passed |
| +1 | compile | 0m 40s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 40s | the patch passed |
| +1 | compile | 0m 31s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 31s | the patch passed |
| -1 | checkstyle | 4m 7s | Patch generated 2 new checkstyle issues in hbase-server (total was 294, now 296). |
| +1 | mvneclipse | 0m 15s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 20m 46s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 2m 1s | the patch passed |
| +1 | javadoc | 0m 27s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 33s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 98m 7s | hbase-server in the patch failed with JDK v1.8.0. |
| -1 | unit | 103m 10s | hbase-server in the patch failed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 16s | Patch does not generate ASF License warnings. |
|   |   | 243m 3s |   |

|| Reason || Tests ||
| JDK v1.8.0 Failed junit tests | hadoop.hbase.util.TestHBaseFsckOneRS |
|   | hadoop.hbase.regionserver.TestHRegion |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
| JDK v1.7.0_79 Failed junit tests | hadoop.hbase.util.TestHBaseFsckOneRS |
|   | hadoop.hbase.regionserver.TestHRegion |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.cl

[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-13 Thread chenrongwei (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096178#comment-15096178
 ] 

chenrongwei commented on HBASE-15097:
-

I think there may be a bug in the region splitting process that leaves a 
region holding data outside its boundary. We can avoid this problem easily by 
adding a simple check. More detailed information about the problem is in my 
other comment.
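
For illustration only, the kind of check discussed in this comment could be sketched as follows. This is not the attached patch and not HBase code: the class `StopRowClamp` and its methods are hypothetical names, it uses plain byte arrays instead of the `HRegion` and `Scan` classes, and it models forward scans only (reverse scans and Get-style scans are not covered). It assumes the usual HBase conventions that an empty stop row means "scan to the end of the table" and that the last region has an empty end key.

```java
/** Hypothetical helper, not part of HBase: picks the stop row one region should honor. */
final class StopRowClamp {

    private StopRowClamp() {
    }

    /** Unsigned lexicographic comparison of two row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /**
     * Returns the exclusive stop row for a forward scan of a single region.
     * An empty array means "no upper bound".
     */
    static byte[] effectiveStopRow(byte[] scanStopRow, byte[] regionEndKey) {
        if (regionEndKey.length == 0) {
            // Last region of the table: only the scan's own stop row applies.
            return scanStopRow;
        }
        if (scanStopRow.length == 0) {
            // Unbounded scan: the region's end key is the only bound.
            return regionEndKey;
        }
        // Both bounds are set: the smaller one wins.
        return compare(scanStopRow, regionEndKey) >= 0 ? regionEndKey : scanStopRow;
    }
}
```

The empty end key case is handled first because comparing any stop row against an empty array always yields a result of zero or greater, which would otherwise replace a real stop row with an empty one.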


[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-13 Thread chenrongwei (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096136#comment-15096136
 ] 

chenrongwei commented on HBASE-15097:
-

I think there may be a bug in the region splitting process that leaves the 
region still holding data beyond its end key.
Here is my test code:
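import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public static void main(String[] args) {
    Configuration configuration = HBaseConfiguration.create();
    Connection connection = null;
    try {
        // Every row key returned by the scan is written to rowkey.txt.
        FileOutputStream output = new FileOutputStream("rowkey.txt");
        connection = ConnectionFactory.createConnection(configuration);
        TableName tableName = TableName.valueOf("xsearch_solr");
        Table theTestTable = connection.getTable(tableName);
        // The start and stop rows span two regions of the table.
        Scan scan = new Scan(Bytes.toBytes("bbf8f2d400232958622"),
                Bytes.toBytes("bff8f2d400232958623"));
        scan.setCaching(4000);
        scan.setMaxVersions(1);
        scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("userid"));
        long beginTime = System.nanoTime();
        int hits = 0;
        ResultScanner resultScanner = theTestTable.getScanner(scan);
        Result[] results = resultScanner.next(4000);
        while (results != null && results.length > 0) {
            for (Result aResult : results) {
                output.write(aResult.getRow());
                output.write("\n".getBytes());
                // Count how often this one row key comes back; more than once is a duplicate.
                if ("bff8f2d400232958622".equals(Bytes.toString(aResult.getRow()))) {
                    System.out.println("rowid=" + Bytes.toString(aResult.getRow())
                            + ",timestamp=" + aResult
                            .getColumnLatestCell(Bytes.toBytes("cf"),
                                    Bytes.toBytes("userid")).getTimestamp());
                    hits++;
                }
            }
            results = resultScanner.next(4000);
        }

        long endTime = System.nanoTime();
        output.close();
        System.out.println("query cost=" + (endTime - beginTime) + "ns"
                + ",hits=" + hits);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (connection != null) {
            try {
                connection.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}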

The code's output is as follows:

rowid=bff8f2d400232958622,timestamp=1452223831551
rowid=bff8f2d400232958622,timestamp=1452685378997
query cost=24923466628ns,hits=2

Please check the snapshot file for the table's current region info. You can 
find the duplicated rows, such as 'bff8e36c00244275031', in rowkey.txt.

I checked the trace log file 'output.log' and found the scan operation's 
details, but I don't know why the region's HFile still keeps old data that is 
beyond its end key.
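
For readers who want to check a file like rowkey.txt for repeated keys, a minimal sketch follows. The class `DuplicateRowKeys` is a hypothetical helper, not part of the attachments or of HBase, and it assumes one printable row key per line.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports row keys that appear more than once in a scan dump. */
final class DuplicateRowKeys {

    private DuplicateRowKeys() {
    }

    /** Maps each repeated key to its number of occurrences, in first-seen order. */
    static Map<String, Integer> findDuplicates(List<String> rowKeys) {
        // Count every key, keeping the order in which keys were first seen.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : rowKeys) {
            counts.merge(key, 1, Integer::sum);
        }
        // Keep only the keys that occurred at least twice.
        Map<String, Integer> duplicates = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.put(entry.getKey(), entry.getValue());
            }
        }
        return duplicates;
    }
}
```

It could be applied to the dump with `DuplicateRowKeys.findDuplicates(java.nio.file.Files.readAllLines(java.nio.file.Paths.get("rowkey.txt")))`. A correct scan returns each row once, so any entry in the result points at a duplicated row.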





[jira] [Commented] (HBASE-15097) When the scan operation covered two regions,sometimes the final results have duplicated rows.

2016-01-13 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096091#comment-15096091
 ] 

Anoop Sam John commented on HBASE-15097:


How did that region get data outside its boundary? Can you try to reproduce 
this with a test case and attach it here?

