[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-05-05 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997940#comment-15997940
 ] 

Duo Zhang commented on HBASE-17712:
---

Oh, we have HBASE-2231, which aims to fix the file-not-found problem by 
replaying the compaction marker... So why does it still happen?

 [~stack] [~enis]

Thanks.

> Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound
> -
>
> Key: HBASE-17712
> URL: https://issues.apache.org/jira/browse/HBASE-17712
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17712-addendum.patch, HBASE-17712-branch-1.patch, 
> HBASE-17712.patch, HBASE-17712-ut.patch, HBASE-17712-v1.patch, 
> HBASE-17712-v2.patch, HBASE-17712-v3.patch
>
>
> It was introduced in HBASE-13651, and the logic became much more complicated 
> after HBASE-16304 due to a deadlock issue. It is really tough because sequence 
> ids are involved, and the method we call was originally used to serve secondary 
> replicas, which do not handle writes.
> In fact, in the 1.x releases, the problem described in HBASE-13651 is gone. Now 
> we write a compaction marker to the WAL before deleting the compacted files. We 
> can only consider an RS dead after its WAL files are all closed, so if the 
> region has already been reassigned, the compaction will fail because we cannot 
> write out the compaction marker.
> So theoretically, if we still hit a FileNotFound exception, it should be a 
> critical bug, which means we may lose data. I do not think it is a good idea 
> to just eat the exception and refresh store files. Or even if we want to do 
> this, we can just refresh store files without dropping memstore contents. 
> This will also simplify the logic a lot.
> Suggestions are welcome.
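
The ordering argument in the description (marker first, deletion second, and no marker once the WAL is closed) can be sketched roughly as follows. This is only an illustration and not HBase code: `Wal`, `completeCompaction`, and the string-based marker are hypothetical stand-ins, and the real WAL and store file handling are far more involved.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CompactionMarkerSketch {
    /** Hypothetical stand-in for a WAL that rejects appends once it is closed. */
    static class Wal {
        private boolean closed;
        private final List<String> entries = new ArrayList<>();

        void close() { closed = true; }

        void append(String entry) throws IOException {
            if (closed) {
                throw new IOException("WAL is closed");
            }
            entries.add(entry);
        }

        List<String> entries() { return entries; }
    }

    /**
     * Writes the compaction marker first and only then swaps the store files.
     * Returns false, leaving the files untouched, if the marker cannot be written.
     */
    static boolean completeCompaction(Wal wal, List<String> storeFiles,
                                      List<String> compacted, String output) {
        try {
            wal.append("COMPACTION " + compacted + " -> " + output);
        } catch (IOException e) {
            return false; // no marker, so the compacted files must not be deleted
        }
        storeFiles.removeAll(compacted);
        storeFiles.add(output);
        return true;
    }

    public static void main(String[] args) {
        Wal wal = new Wal();
        List<String> files = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(completeCompaction(wal, files, Arrays.asList("a", "b"), "c"));
        System.out.println(files);

        // Once the WAL is closed (the RS is considered dead), compaction fails.
        wal.close();
        System.out.println(completeCompaction(wal, files, Arrays.asList("c"), "d"));
        System.out.println(files);
    }
}
```

The point of the sketch is only the order of operations: a failed marker write leaves the file list unchanged, which is why a reassigned region should not see its files deleted by the old server.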



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-08 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902518#comment-15902518
 ] 

Duo Zhang commented on HBASE-17712:
---

{quote}
Does the FNFE have the file name in it?
{quote}
I believe so.

{quote}
The AsyncFSWAL.java changes are related?
{quote}

Yes, it is related. RS.abort waits for all regions to be closed, but 
AsyncFSWAL retries forever, so there is a deadlock. Although I think it 
is a bit strange that we still need to confirm region closing when aborting an 
RS, the check in AsyncFSWAL does no harm, so I included it in the patch. We can 
discuss later whether we need to wait in RS.abort.

Thanks.
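
For readers following along, the kind of check mentioned above can be pictured as a retry loop that also consults an abort flag. The sketch below is an assumption about the general shape only; it is not the AsyncFSWAL change from the patch, and `retryUntilAborted` and both suppliers are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class RetryUntilAbortedSketch {
    /**
     * Retries the attempt until it succeeds or the abort condition becomes true.
     * Without the abort check, a permanently failing attempt would loop forever
     * and anything waiting on this loop would never finish.
     */
    static boolean retryUntilAborted(BooleanSupplier attempt, BooleanSupplier aborted) {
        while (!aborted.getAsBoolean()) {
            if (attempt.getAsBoolean()) {
                return true; // the operation went through
            }
        }
        return false; // gave up because the server is aborting
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // The attempt always fails; the abort flag flips after three attempts.
        boolean ok = retryUntilAborted(
            () -> { calls.incrementAndGet(); return false; },
            () -> calls.get() >= 3);
        System.out.println("succeeded=" + ok + ", attempts=" + calls.get());
    }
}
```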



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-08 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15902508#comment-15902508
 ] 

stack commented on HBASE-17712:
---

+1 on patch.

Nit: Change this '  LOG.warn("A store file got lost", fnfe);' to add ' ... 
so closing and reopening region...' so this log and the subsequent close/open 
are tied. Does the FNFE have the file name in it?

The AsyncFSWAL.java changes are related?

Good test.

[~Apache9]



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901186#comment-15901186
 ] 

Hudson commented on HBASE-17712:


SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #2634 (See [https://builds.apache.org/job/HBase-Trunk_matrix/2634/])
HBASE-17712 Remove/Simplify the logic of (zhangduo: rev 58c76192bdbf1f4863c1c87d165c2e3b9674d4ad)
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCorruptedRegionStoreFile.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactionInDeadRegionServer.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* (add) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionUnassigner.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AsyncFSWAL.java




[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901066#comment-15901066
 ] 

Hudson commented on HBASE-17712:


FAILURE: Integrated in Jenkins build HBase-1.4 #659 (See [https://builds.apache.org/job/HBase-1.4/659/])
HBASE-17712 Remove/Simplify the logic of (zhangduo: rev dcaa9bd7155ef6f2003bdb780239499fc450fc1e)
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactionInDeadRegionServer.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockRegionServer.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCorruptedRegionStoreFile.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/MockRegionServerServices.java
* (add) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionUnassigner.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java




[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-07 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900785#comment-15900785
 ] 

Duo Zhang commented on HBASE-17712:
---

The failed UTs are unrelated and can pass locally.

Will commit shortly.



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15900771#comment-15900771
 ] 

Hadoop QA commented on HBASE-17712:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 23s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 1m 53s | branch-1 passed |
| +1 | compile | 0m 35s | branch-1 passed with JDK v1.8.0_121 |
| +1 | compile | 0m 36s | branch-1 passed with JDK v1.7.0_80 |
| +1 | checkstyle | 0m 58s | branch-1 passed |
| +1 | mvneclipse | 0m 17s | branch-1 passed |
| -1 | findbugs | 1m 57s | hbase-server in branch-1 has 2 extant Findbugs warnings. |
| +1 | javadoc | 0m 26s | branch-1 passed with JDK v1.8.0_121 |
| +1 | javadoc | 0m 33s | branch-1 passed with JDK v1.7.0_80 |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 33s | the patch passed with JDK v1.8.0_121 |
| +1 | javac | 0m 33s | the patch passed |
| +1 | compile | 0m 37s | the patch passed with JDK v1.7.0_80 |
| +1 | javac | 0m 37s | the patch passed |
| +1 | checkstyle | 0m 56s | the patch passed |
| +1 | mvneclipse | 0m 17s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 15m 24s | The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | hbaseprotoc | 0m 15s | the patch passed |
| +1 | findbugs | 2m 15s | the patch passed |
| +1 | javadoc | 0m 27s | the patch passed with JDK v1.8.0_121 |
| +1 | javadoc | 0m 33s | the patch passed with JDK v1.7.0_80 |
| -1 | unit | 86m 20s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 117m 0s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.mapred.TestTableSnapshotInputFormat |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:e01ee2f |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12856737/HBASE-17712-branch-1.patch |
| JIRA Issue | HBASE-17712 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 806196208b4e 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/hbase.sh |
| git revision | branch-1 / 

[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-07 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15899376#comment-15899376
 ] 

Duo Zhang commented on HBASE-17712:
---

Will commit tomorrow if no objections. Then I could start working on other 
related issues.



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-06 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898535#comment-15898535
 ] 

Duo Zhang commented on HBASE-17712:
---

What do you think of the new approach sir? [~stack]

Thanks.



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897471#comment-15897471
 ] 

Hadoop QA commented on HBASE-17712:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 3m 1s | master passed |
| +1 | compile | 0m 37s | master passed |
| +1 | checkstyle | 0m 46s | master passed |
| +1 | mvneclipse | 0m 14s | master passed |
| +1 | findbugs | 1m 43s | master passed |
| +1 | javadoc | 0m 27s | master passed |
| +1 | mvninstall | 0m 42s | the patch passed |
| +1 | compile | 0m 38s | the patch passed |
| +1 | javac | 0m 38s | the patch passed |
| +1 | checkstyle | 0m 48s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 27m 14s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 1m 50s | the patch passed |
| +1 | javadoc | 0m 27s | the patch passed |
| -1 | unit | 95m 58s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 135m 37s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12856267/HBASE-17712-v3.patch |
| JIRA Issue | HBASE-17712 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 7c9709d2d23b 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / d2349c6 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/5963/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/5963/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/5963/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897174#comment-15897174
 ] 

Hadoop QA commented on HBASE-17712:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 26s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 4m 43s | master passed |
| +1 | compile | 0m 58s | master passed |
| +1 | checkstyle | 1m 3s | master passed |
| +1 | mvneclipse | 0m 23s | master passed |
| +1 | findbugs | 2m 26s | master passed |
| +1 | javadoc | 0m 37s | master passed |
| +1 | mvninstall | 0m 59s | the patch passed |
| +1 | compile | 0m 51s | the patch passed |
| +1 | javac | 0m 51s | the patch passed |
| +1 | checkstyle | 1m 0s | the patch passed |
| +1 | mvneclipse | 0m 21s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 36m 38s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 2m 2s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
| -1 | unit | 23m 50s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 11s | The patch does not generate ASF License warnings. |
| | | 77m 19s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.io.TestHeapSize |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12856251/HBASE-17712-v2.patch |
| JIRA Issue | HBASE-17712 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux dfaaa7fc680f 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / d2349c6 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/5962/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/5962/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/5962/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/5962/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-05 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15896259#comment-15896259
 ] 

Duo Zhang commented on HBASE-17712:
---

Yeah, there could still be holes where the RS pauses after writing the compaction 
marker, although this is much rarer than before. Simply adding new checks cannot 
solve the problem: a long GC pause can always occur after your check and before the 
actual deletion.
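
To make the marker-then-delete ordering and the remaining hole concrete, here is a minimal in-memory sketch. It is not HBase code: `Wal`, `Store` and `compact` are hypothetical names made up for illustration, and closing the WAL stands in for the fencing that happens before a region is reassigned.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory model; none of these names are HBase classes. */
class CompactionFencingSketch {

    /** Stand-in for a WAL that is closed before the region is reassigned. */
    static final class Wal {
        private final List<String> entries = new ArrayList<>();
        private boolean closed;

        void close() {
            closed = true;
        }

        void append(String entry) {
            if (closed) {
                throw new IllegalStateException("WAL closed: this server is fenced");
            }
            entries.add(entry);
        }

        int size() {
            return entries.size();
        }
    }

    /** Stand-in for the store directory on the shared file system. */
    static final class Store {
        final Set<String> files = new HashSet<>();
    }

    /**
     * Writes the compaction marker first and deletes the inputs only if the
     * append succeeded. Returns false when the WAL is closed (the fenced case).
     */
    static boolean compact(Wal wal, Store store, Set<String> inputs, String output) {
        store.files.add(output);
        try {
            wal.append("COMPACTION " + inputs + " -> " + output);
        } catch (IllegalStateException fenced) {
            store.files.remove(output); // roll back; the inputs stay untouched
            return false;
        }
        // Remaining hole: a long pause at this point (for example a GC pause)
        // lets the region move elsewhere before the deletion below runs.
        store.files.removeAll(inputs);
        return true;
    }
}
```

The marker protects every compaction that starts after fencing, but nothing in this model protects the window between `append` and `removeAll`, which is the hole described above.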

As we cannot remove the storefiles immediately after compaction, because they may 
still be read by someone, it is not possible to solve this with atomic operations 
on HDFS. A possible way is to store the storefile list in the meta table and 
do a checkAndPut when updating it, to confirm that the region is still held 
by us. This could be done in the future, but it is not easy work, as we need 
to deal with region split/merge, flush, etc. So I do not think it is the right 
time to do this, as the problem we want to address happens very rarely. Maybe 
we could bring this up when we want to put storefiles on a FileSystem that does 
not support listing?
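
The conditional-update idea can be sketched without any HBase dependency. The class below is a hypothetical stand-in for a meta row holding the region owner and its storefile list, with an `AtomicReference` compare-and-set playing the role of checkAndPut. All names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical model of a meta row; not the HBase meta table API. */
class StoreFileListCasSketch {

    /** Immutable snapshot: the current owner plus the storefile list. */
    static final class MetaRow {
        final String owner;
        final List<String> storeFiles;

        MetaRow(String owner, List<String> storeFiles) {
            this.owner = owner;
            this.storeFiles = Collections.unmodifiableList(new ArrayList<>(storeFiles));
        }
    }

    private final AtomicReference<MetaRow> row;

    StoreFileListCasSketch(String owner, List<String> files) {
        this.row = new AtomicReference<>(new MetaRow(owner, files));
    }

    /** Models reassignment: the region is handed to a new server. */
    void reassign(String newOwner) {
        row.updateAndGet(r -> new MetaRow(newOwner, r.storeFiles));
    }

    /** Conditional update: succeeds only while `server` still owns the region. */
    boolean updateFiles(String server, List<String> newFiles) {
        while (true) {
            MetaRow current = row.get();
            if (!current.owner.equals(server)) {
                return false; // region moved; the stale server must not touch the list
            }
            if (row.compareAndSet(current, new MetaRow(server, newFiles))) {
                return true;
            }
            // Lost a race with a concurrent update; re-read and re-check ownership.
        }
    }

    List<String> files() {
        return row.get().storeFiles;
    }
}
```

Because the ownership check and the list update happen in one atomic step, a server that paused and lost the region cannot publish a stale file list afterwards. Handling split/merge and flush, as noted above, is the hard part and is not modelled here.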

Let me try to solve it another way, as described in HBASE-13651: reassigning the 
region. It is a little costly and slow, but given how unlikely the problem is, I 
think it is acceptable.

Thanks.

> Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound
> -
>
> Key: HBASE-17712
> URL: https://issues.apache.org/jira/browse/HBASE-17712
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17712.patch, HBASE-17712-ut.patch, 
> HBASE-17712-v1.patch
>
>
> It was introduced in HBASE-13651, and the logic became much more complicated 
> after HBASE-16304 due to a deadlock issue. It is really tough because sequence 
> ids are involved and the method we call was originally written to serve 
> secondary replicas, which do not handle writes.
> In fact, in the 1.x releases, the problem described in HBASE-13651 is gone. Now 
> we write a compaction marker to the WAL before deleting the compacted files. We 
> can only consider an RS dead after its WAL files are all closed, so if the 
> region has already been reassigned, the compaction will fail because we cannot 
> write out the compaction marker.
> So theoretically, if we still hit a FileNotFound exception, it should be a 
> critical bug, which means we may lose data. I do not think it is a good idea 
> to just eat the exception and refresh store files. Or, even if we want to do 
> this, we can just refresh store files without dropping memstore contents. 
> This will also simplify the logic a lot.
> Suggestions are welcome.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15895761#comment-15895761
 ] 

stack commented on HBASE-17712:
---

This patch looks excellent. Tried to reason about whether there are any 'holes' 
such that we write the compaction marker, crash, and then the region experiences 
an FNFE after opening in its new location, but I don't see any.

Is the AsyncFSWAL.java inclusion intentional?

This is nice cleanup and leveraging of a cornerstone laid a good while back.  
Thanks for the deep-thinking [~Apache9] to get us back some simplification. If 
an issue, it'll be easier to reason about after this patch goes in. +1 if all 
tests pass.



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15895752#comment-15895752
 ] 

Hadoop QA commented on HBASE-17712:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 57s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 41s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 26m 48s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 90m 36s {color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 129m 20s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12856055/HBASE-17712-v1.patch |
| JIRA Issue | HBASE-17712 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 43f5071a2741 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 404a288 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/5952/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/5952/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15895691#comment-15895691
 ] 

Hadoop QA commented on HBASE-17712:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 19s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 44s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 19s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 51s {color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 26s {color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 9s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  Naked notify in org.apache.hadoop.hbase.regionserver.HRegion.refreshStoreFiles()  At HRegion.java:At HRegion.java:[line 5162] |
| Failed junit tests | hadoop.hbase.io.TestHeapSize |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12856051/HBASE-17712.patch |
| JIRA Issue | HBASE-17712 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux de48b7a64ac5 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 404a288 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/5951/artifact/patchprocess/new-findbugs-hbase-server.html |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/5951/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/5951/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/5951/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894707#comment-15894707
 ] 

Hadoop QA commented on HBASE-17712:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 4s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 41s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 26m 47s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 14s {color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 131m 8s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12855869/HBASE-17712-ut.patch |
| JIRA Issue | HBASE-17712 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux d63b7e68d070 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 678ad0e |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/5930/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/5930/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/5930/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-03 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894490#comment-15894490
 ] 

Duo Zhang commented on HBASE-17712:
---

[~stack] The UT is ready, sir. If you agree, I can start working on removing the 
'handleFileNotFound' logic. Thanks.



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-03 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894234#comment-15894234
 ] 

Duo Zhang commented on HBASE-17712:
---

Yeah, I'm currently working on implementing the tests to prove that the problem 
described in HBASE-16304 is gone.

As for the solution, my opinion is to remove the refreshStoreFiles call. If we 
have lost some store files then it must be a critical bug which could cause data 
loss. We should not eat it automatically.

Thanks.



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-02 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893086#comment-15893086
 ] 

stack commented on HBASE-17712:
---

bq. This not Ted Yu's fault.

Not much interested in 'blame'; asking for help/insight... if any to be had. 
dropMemstoreContents is added by HBASE-16304 made of parts taken from elsewhere 
in HRegion (which called through to the old dropMemstoreContentsForSeqId).

How do you suggest we exploit your findings? A test to prove no FNFE anymore? And 
if this is so, undo all the protections and guards against FNFE with their 
reopening of files? How can I help, sir [~Apache9]?



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890495#comment-15890495
 ] 

Ted Yu commented on HBASE-17712:
---

Added a note at the end of HBASE-16304.

As Duo pointed out, dropMemstoreContents() call was not added by HBASE-16304.
Need to dig through related JIRAs to find out why.



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-01 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890163#comment-15890163
 ] 

Duo Zhang commented on HBASE-17712:
---

This is not [~tedyu]'s fault. Having skimmed the comments in HBASE-16304, I do not 
think he knew why we call dropMemstoreContents() for increment and append 
either. He just followed the old behavior. We can call refreshStoreFiles in 
handleFileNotFound, and before HBASE-16304 refreshStoreFiles called 
dropMemstoreContents. In append and increment we acquire the write lock, so this 
could lead to a deadlock; he therefore moved dropMemstoreContents out of the 
write lock's protection. dropMemstoreContents is part of refreshStoreFiles; it 
was split into two pieces to avoid the deadlock.

'refreshStoreFiles' is designed to be used by secondary replicas only, and we 
reused it in HBASE-13651 to handle FileNotFoundException. This is the root cause.
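
As a rough illustration of this kind of deadlock and of the "move it out of the lock" fix, here is a small sketch. It assumes the update path holds the read side of a `ReentrantReadWriteLock` while the drop needs the write side; the exact lock modes used in HRegion are not shown in this thread, so treat that as an assumption. The class is hypothetical and only borrows the names `updatesLock` and `dropMemstoreContents` for readability.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical model; this is not the HRegion implementation. */
class LockUpgradeSketch {
    private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();
    int drops;

    /** Needs the write lock; returns false instead of blocking when it cannot get it. */
    boolean dropMemstoreContents() {
        if (!updatesLock.writeLock().tryLock()) {
            return false; // a blocking lock() here would never return
        }
        try {
            drops++;
            return true;
        } finally {
            updatesLock.writeLock().unlock();
        }
    }

    /** Calls the drop while still holding the read lock: the upgrade fails. */
    boolean updateWithNestedDrop() {
        updatesLock.readLock().lock();
        try {
            return dropMemstoreContents();
        } finally {
            updatesLock.readLock().unlock();
        }
    }

    /** Only notes that a drop is needed, then runs it after releasing the read lock. */
    boolean updateWithDeferredDrop() {
        boolean dropPending;
        updatesLock.readLock().lock();
        try {
            dropPending = true; // stands in for "hit a missing file during the update"
        } finally {
            updatesLock.readLock().unlock();
        }
        return !dropPending || dropMemstoreContents();
    }
}
```

`ReentrantReadWriteLock` does not support upgrading a read lock to a write lock, so the nested variant can never make progress, while the deferred variant succeeds because the read lock is released first.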



[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889990#comment-15889990
 ] 

stack commented on HBASE-17712:
---

bq. What happens if there is a flush ongoing at the same time?

I see. Looks like cruft built on top of cruft. It's been a while since I was in 
here. Replacement of the current set of hfiles was always a little awkward. We 
didn't want every access going across a synchronization just to check for the 
extremely rare case of a change in the store file Set. I'd have to do some 
archeology to see if retry of FNFE was a compromise so we could do w/o a sync 
check. Would be coolio if we could purge having to handle FNFE.

I don't follow the comment on why the call to dropMemstoreContents was added to 
doDelta by:

{code}
tree 11b5d28bb22d95bd5c6276346f3055412b2d6902
parent dda8f67b2cc9f6ef4ab434beea2a47d461a20a1f
author tedyu  Wed Aug 24 09:04:47 2016 -0700
committer tedyu  Wed Aug 24 09:04:47 2016 -0700

HBASE-16304 HRegion#RegionScannerImpl#handleFileNotFoundException may lead to 
deadlock when trying to obtain write lock on updatesLock

{code}

Looking at my review of HBASE-16304, my last remark was: "I'm not sure I follow 
the dropMemstoreContents(); bits. Some more commentary on interrelation might 
help" ... to which the response was that there was explanation (I don't see 
it...). Ram also asks what it is about later. It doesn't look like he got a 
straight response.

Can you help here [~tedyu]?


[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-01 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889918#comment-15889918
 ] 

Duo Zhang commented on HBASE-17712:
---

{quote}
Want to give an illustration of what in particular is driving you crazy Duo 
Zhang?
{quote}
In HBASE-17633, I want to update the lowestUnflushedSequenceId in 
internalFlushCacheAndCommit using the memstore's minSequenceId. Then I found 
that we may modify the memstore content in refreshStoreFiles, which is not part 
of the flush processing. After reading the code related to region replicas, I 
found this is easy to handle there, as a secondary replica does not handle 
writes and the replay is single threaded, so there is no race condition. But 
finally I found that we even call dropMemstoreContents in doDelta! This is a 
total mess. I cannot find a safe way to update the lowestUnflushedSequenceId if 
the minSequenceId changes because we drop some contents in the memstore. What 
happens if there is a flush ongoing at the same time?
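
A minimal sketch of the hazard described above, using a toy memstore rather than HBase's real classes. ToyMemstore, dropUpTo and the sequence ids are all hypothetical and only model the interleaving: a flush reads the minimum sequence id, and a drop outside the flush path then changes it.

```java
import java.util.TreeMap;

/** Toy model only: none of these types exist in HBase. */
public class SeqIdSketch {

    /** Hypothetical memstore keyed by sequence id. */
    static final class ToyMemstore {
        private final TreeMap<Long, String> cells = new TreeMap<>();

        void add(long seqId, String value) { cells.put(seqId, value); }

        /** Smallest sequence id still held in memory, or -1 if empty. */
        long minSequenceId() { return cells.isEmpty() ? -1L : cells.firstKey(); }

        /** Drops every cell with sequence id <= upTo, outside any flush. */
        void dropUpTo(long upTo) { cells.headMap(upTo, true).clear(); }
    }

    public static void main(String[] args) {
        ToyMemstore memstore = new ToyMemstore();
        memstore.add(5L, "a");
        memstore.add(7L, "b");
        memstore.add(9L, "c");

        // Step 1: the flush reads the minimum and plans to record it.
        long seenByFlush = memstore.minSequenceId();

        // Step 2: a refresh drops cells without going through the flush.
        memstore.dropUpTo(7L);

        // Step 3: the value held by the flush no longer matches the memstore.
        System.out.println("flush saw " + seenByFlush
            + ", memstore now has " + memstore.minSequenceId());
    }
}
```

The model is single threaded on purpose, so the stale value is reproducible; in a real region server the same interleaving would depend on timing between the flush and the refresh.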

{quote}
Do we have tests that prove the latter assertion?
{quote}
I could try to add one.

Thanks.


[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-03-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889894#comment-15889894
 ] 

stack commented on HBASE-17712:
---

bq.  I think sequence id accounting is your favorite part in HBase.

That's funny. 

I keep promising a write-up on the life of a sequenceid but its processing is in 
eternal flux. It is an afterthought on our base type, the KeyValue/Cell. It is 
not always present, being cleared by compaction as an optimization after a 
near-arbitrary amount of time has elapsed, hence a reluctance to lean on it in 
logic. This lack of clarity around the fate of the sequenceid is probably the 
root cause of why sometimes it is treated with kid gloves while at other times 
it is used without locking (if we had hybrid logical clocks, sequenceid would 
be inherent to timestamp, it would be always 'present', and it would be 
integral to Cell .. TODO).

Want to give an illustration of what in particular is driving you crazy 
[~Apache9]?

bq.  I do not think it is a good idea to just eat the exception and refresh 
store files. 

Agree... especially given  "... in 1.x release, the problem described in 
HBASE-13651 is gone." as Matteo says up in HBASE-13651.

Do we have tests that prove the latter assertion?

Thanks [~Apache9]







[jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound

2017-02-28 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15889676#comment-15889676
 ] 

Duo Zhang commented on HBASE-17712:
---

This drove me crazy when implementing HBASE-17633, as the sequence id is used 
everywhere, with or without a lock, even in the primary replica... So finally I 
decided to open an issue to address this first.

[~stack] What's your opinion sir? I think sequence id accounting is your 
favorite part in HBase.

Thanks.
