[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-11-29 Thread Bo Cui (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985239#comment-16985239
 ] 

Bo Cui commented on HBASE-22075:


Is it better to add the hbase-23353 solution?

pls check 

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.7, 2.2.3, 2.1.9
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-10-27 Thread HBase QA (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960770#comment-16960770
 ] 

HBase QA commented on HBASE-22075:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  4m 
46s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
15m 28s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.8.5 2.9.2 or 3.1.2. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
51s{color} | {color:green} hbase-it in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 base: 
https://builds.apache.org/job/PreCommit-HBASE-Build/976/artifact/patchprocess/Dockerfile
 |
| JIRA Issue | HBASE-22075 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972474/HBASE-22075.test-only.2.patch
 |
| Optional Tests | dupname asflicense javac javadoc unit shadedjars hadoopcheck 
xml compile spotbugs findbugs hbaseanti checkstyle |
| uname | Linux badf90de922d 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / d7deafa120 |
| Default Java | 1.8.0_181 |
|  Test Results | 

[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-07-10 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882070#comment-16882070
 ] 

Sean Busbey commented on HBASE-22075:
-

bq. As for HBASE-16812, I do not think it is the only patch which affected MOB 
in a bad way - there should be others. I am saying that, because my own test 
failed with HBASE-16812 reverted (on HDP-2.6.5).

yeah, I agree here. I finished backporting HBASE-16812 onto CDH5.13.3 last 
night and the IT still shows no dataloss.

bq. To prevent non-atomic failures we will need acid txs? No?

To prevent cross-region non-atomic failures yes. But I don't think we need to 
prevent that; we just need to update the logic of committing the updated refs 
to handle non-atomic failure. the bulkload code assumes someone will work it 
out handling the failure externally and then we don't. I think instead we 
should use the per-region atomic commit of bulk loaded files to track when 
we've successfully made use of our newly compacted files. if we can't succeed 
on retry for a given region we can just keep around the old mob files for 
references in that region and then clean things up on the next mob compaction.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-07-08 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880569#comment-16880569
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

To prevent non-atomic failures we will need acid txs? No? As for HBASE-16812, I 
do not think it is the only patch which affected MOB in a bad way - there 
should be others. I am saying that, because my own test failed with HBASE-16812 
reverted (on HDP-2.6.5). 

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-07-08 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880489#comment-16880489
 ] 

Sean Busbey commented on HBASE-22075:
-

My current opinion is that there are a couple of different issues to solve here.

1) I found that all of the places we see this particular dataloss test show a 
problem include HBASE-16812. Before that change there's a lock preventing 
overlaps between compaction and mob compaction. CDH5's backport of the MOB 
feature does not include this change.

Since that change is too far back to easily revert in master or branches-2 I'm 
going to test this theory by backporting it on top of CDH5 and see if the IT 
then shows the dataloss. will report back.

2) Independent of the problem with races between compaction and mob compaction, 
I think the use of bulk load to commit the updated ref files is subject to 
non-atomic failure. We should either confirm that it isn't or rework how we 
commit the updated mob references. My intuition is that we should be able to do 
this region-by-region using the building blocks that bulk loading is based on 
without needing to completely overhaul mob accounting or mob compaction (e.g. 
we shouldn't need something like the distributed procedure based mob compaction 
from HBASE-15381)

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-07-02 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877463#comment-16877463
 ] 

Sean Busbey commented on HBASE-22075:
-

in CDH5 by default it's {{Max(10, # regions +1)}}, which is the same as master. 
Master gets the region count from the new async region locator, but presumably 
it gives the same answer.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-07-02 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877433#comment-16877433
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

[~busbey], what is the value of *hbase.bulkload.retries.number* in mob 
reference file loading in CDH5? Is it Integer.MAX_INT? 

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-07-02 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877429#comment-16877429
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

I think, *hbase.bulkload.retries.number* in a master branch is equals to # of 
regions in a table + 1. This should work in any case? The original (first one) 
patch to this ticket handles bulkhead failures due to insufficient # of 
retries, but can not handle race conditions created during  
*ReproMOBDataLoss.java* run. This can be not a MOB issue at all, my guess. 
Something not MOB related is different in CDH 5? May be general compaction code?





> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-07-02 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877397#comment-16877397
 ] 

Sean Busbey commented on HBASE-22075:
-

I've been having a lot of trouble trying to reproduce this failure using the IT 
on the CDH5 backport of the MOB feature. It's been confusing since the code 
that I think is responsible is essentially the same - bulk loading the updated 
references into the non-MOB regions.

At the end of last week I started looking at the bulk load code to see if that 
was the difference. [~npopa] suggested it was a difference in how much failures 
are retried by default. It doesn't look obviously different. However, if I take 
the master branch and update the {{PartitionedMobCompactor}} to set 
{{hbase.bulkload.retries.number}} to {{MAX_INT}} on the conf it uses when 
calling bulk load, then the IT no longer reproduces the failure. So even if it 
isn't specifically the bulkload retries that are different I think we're on the 
right track that currently the "if it fails just bail" approach is the source 
of the inconsistency.

I'm going to see if I can rework how we do committing the post-compaction 
references in the non-MOB regions to expressly handle error conditions.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-06-21 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870019#comment-16870019
 ] 

Sean Busbey commented on HBASE-22075:
-

v2 also reproduced the issue on branch-2.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-06-21 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869948#comment-16869948
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

So, basically what we have discovered and confirmed is : *MOB feature is not 
safe to use*. We have a patch, which fixes race condition issue and related 
data loss, but the patch is quite large and a major rework of a MOB feature 
itself. It has both: pluses and minuses: On a plus side - MOB compaction is not 
Master orchestrated anymore and can be run in parallel, on a minus side - MOB 
data is compacted during normal regular major compactions and it imposes 
additional I/O load. Kind of pro- and contra- thing.

Meanwhile, the simplest mitigation/solution to a data loss is conversion MOB 
table back to a regular one by setting *MOB_THRESHOLD* to a very large value, 
which must to be large than any potential MOB value size:

# Alter MOB table
{code}
hbase shell> disable ‘table’
hbase shell> alter ‘table’, {NAME => ‘column-family’, MOB_THRESHOLD => 
100}
hbase shell> enable ‘table’
{code}

# Run major compaction on this table
# Clean up MOB directory data for the table (its under 
/hbase/data/mobdir/data/'table')


> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-06-21 Thread HBase QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869860#comment-16869860
 ] 

HBase QA commented on HBASE-22075:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
33s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
37s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
12m 49s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.8.5 2.9.2 or 3.1.2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
44s{color} | {color:green} hbase-it in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 8s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 38m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HBASE-Build/567/artifact/patchprocess/Dockerfile
 |
| JIRA Issue | HBASE-22075 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972474/HBASE-22075.test-only.2.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  shadedjars  
hadoopcheck  xml  compile  findbugs  hbaseanti  checkstyle  |
| uname | Linux f886a245da36 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 
13 15:00:41 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 6d08ffcfc6 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/567/testReport/ |
| Max. process+thread count | 389 (vs. ulimit of 1) |
| modules | C: hbase-it U: hbase-it |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/567/console |
| Powered 

[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-06-21 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869837#comment-16869837
 ] 

Sean Busbey commented on HBASE-22075:
-

I had a reproduction attempt timeout because something went wonky that bubbled 
up to compactions failing. in v1 when that happens we immediately repeat the 
request instead of waiting. that eventually put enough pressure on things that 
my attempts to write failed, which just exited the data writing thread and left 
the test to wait for the full timeout.

test-only v2
 - move to java.util.concurrent Executor to handle periodic tasks and waiting 
for an async task to be done
 - if the data writing thread gets an error end the test as a failure right away

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> HBASE-22075.test-only.2.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-06-21 Thread HBase QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869589#comment-16869589
 ] 

HBase QA commented on HBASE-22075:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
37s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
39s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
13m  1s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.8.5 2.9.2 or 3.1.2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
50s{color} | {color:green} hbase-it in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 38m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HBASE-Build/566/artifact/patchprocess/Dockerfile
 |
| JIRA Issue | HBASE-22075 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972450/HBASE-22075.test-only.1.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  shadedjars  
hadoopcheck  xml  compile  findbugs  hbaseanti  checkstyle  |
| uname | Linux e98896e11ba7 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 
13 15:00:41 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 6d08ffcfc6 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/566/testReport/ |
| Max. process+thread count | 393 (vs. ulimit of 1) |
| modules | C: hbase-it U: hbase-it |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/566/console |
| Powered 

[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-06-21 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869545#comment-16869545
 ] 

Sean Busbey commented on HBASE-22075:
-

test-only v1

- fixed checkstyle complaints.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, HBASE-22075.test-only.1.patch, 
> ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-06-21 Thread HBase QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869261#comment-16869261
 ] 

HBase QA commented on HBASE-22075:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
38s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} hbase-it: The patch generated 17 new + 0 unchanged - 0 
fixed = 17 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
41s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
13m 32s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.8.5 2.9.2 or 3.1.2. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
51s{color} | {color:green} hbase-it in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HBASE-Build/565/artifact/patchprocess/Dockerfile
 |
| JIRA Issue | HBASE-22075 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12972408/HBASE-22075.test-only.0.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  shadedjars  
hadoopcheck  xml  compile  findbugs  hbaseanti  checkstyle  |
| uname | Linux 008288261fc5 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 
13 15:00:41 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 6d08ffcfc6 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/565/artifact/patchprocess/diff-checkstyle-hbase-it.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/565/testReport/ |
| Max. 

[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-06-21 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869245#comment-16869245
 ] 

Sean Busbey commented on HBASE-22075:
-

the attached {{HBASE-22075.test-only.0.patch}} adapts [~kpalanisamy]'s 
reproduction into an integration test. If I let this new test run on my local 
laptop it currently produces a failure due to missing MOB files after about 1.5 
hours.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> HBASE-22075.test-only.0.patch, ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-06-20 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868918#comment-16868918
 ] 

Sean Busbey commented on HBASE-22075:
-

I think I have ReproMobDataLoss converted into an integration test. couple of 
changes and I should be able to post it.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-05-17 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842504#comment-16842504
 ] 

stack commented on HBASE-22075:
---

Moving out. Still WIP it seems.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.2.1, 2.1.6
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-05-03 Thread Karthik Palanisamy (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832954#comment-16832954
 ] 

Karthik Palanisamy commented on HBASE-22075:


[~elserj] Tested [~vrodionov] patch. This fix handled MOB data well safe. No 
data loss is noticed during MOB compaction failure, RS crash, IO issue, and any 
infra issues.

But majorly race condition is noticed w/ or w/o patch in the MOB. We see data 
loss again during MOB-compaction and Major-compaction while those are running 
together. 

As [~vrodionov] already mentioned, there will be a race condition in this case. 
 I think he already working on a new patch.

I have attached a small repro code (ReproMOBDataLoss.java) for this race 
condition. This is an aggressive test.  Test duration is nearly an hour.
 # Settings: Region Size 200 MB,  Flush threshold 800 KB.
 # Insert 10 Million records
 # MOB Compaction and Archiver
        a) Trigger MOB Compaction (every 2 minutes)
        b) Trigger major compaction (every 2 minutes)
        c) Trigger archive cleaner (every 3 minutes)
 # Validate MOB data after complete data load.

I ran this repro code on branch-2.2. The issue is reproduced. 

Also, ran this repro code after disabling MOB compaction. No data loss is 
noticed.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: compaction, mob
> Fix For: 2.0.6, 2.1.5, 2.2.1
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch, 
> ReproMOBDataLoss.java
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803562#comment-16803562
 ] 

Hadoop QA commented on HBASE-22075:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 7s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m 29s{color} 
| {color:red} hbase-server generated 3 new + 191 unchanged - 3 fixed = 194 
total (was 194) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
12s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m  3s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}262m 29s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}311m 39s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.quotas.TestSpaceQuotas |
|   | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
|   | hadoop.hbase.master.procedure.TestSCPWithReplicas |
|   | 
hadoop.hbase.replication.multiwal.TestReplicationSyncUpToolWithMultipleWAL |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-22075 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12963957/HBASE-22075-v2.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux f3692fd68972 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 59406c44e3 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| 

[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-27 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803402#comment-16803402
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

Patch v2, simplified. The only change in the code now: we do not delete 
compacted MOB file due to bulk load phase failure anymore. The reason is 
explained in previous comments. There is no need anymore to artificially 
inflate number of retries, as since post 2.0 LoadIncrementalHFiles tool 
calculates number of attempts, based on current number of regions in a table.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.6, 2.1.5
>
> Attachments: HBASE-22075-v1.patch, HBASE-22075-v2.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-26 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801941#comment-16801941
 ] 

Josh Elser commented on HBASE-22075:


Sorry! I think we were just talking past each other. Want to include your test 
in a second revision of your patch? Someone else can run your test to validate 
the issue being reproduced, and then with QA passing, we can call that a 
regression test :). I think that's sufficient for me.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.6, 2.1.5
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-26 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801925#comment-16801925
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

{quote}
 What about your sketch of a test doesn't work with your fix?
{quote}
Not sure, I am following you here, [~elserj]. Without fix, the issue can be 
reproduced - I have already working unit test. To show that the patch fixes the 
issue, we need to run this test twice: with patch disabled and enabled - this 
is what I call - hack. 

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.6, 2.1.5
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-26 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801730#comment-16801730
 ] 

Josh Elser commented on HBASE-22075:


bq. To reproduce this with the patch we will need some special test-related 
hacks in the code.

I'm curious: what you describe sounds like an effective test. What about your 
sketch of a test doesn't work with your fix?

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.1.4, 2.0.6
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-22 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799257#comment-16799257
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

{quote}
Is this something that we can show in a unit test?
{quote}

I have one, but it works only without patch, [~elserj]. 
Scenario is the following:
0. set "hbase.bulkload.retries.number" to 2
1. Create MOB table (1 region)
2. Load some of MOB data
3. Flush
4. Load MOB data again
5. Flush
6. Now we have 2 store files and 2 MOB file
7. Split table up to at least 4 regions
8. Trigger MOB compaction. Compaction should fail with some data loaded 
partially
9. Verify, that we have data missing.

To reproduce this with the patch we will need some special test-related hacks 
in the code. 

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.5, 2.1.4
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-22 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799031#comment-16799031
 ] 

Josh Elser commented on HBASE-22075:


{quote}Forced splits during bulkload due to changed region boundaries, 
[~elserj].
{quote}
Ah, the import of the ref file itself being split (even though it has one 
entry). I'm with you now. I'll have to look at how we find MOB files work on 
split (there must be some logic for the daughter regions to find the MOB files 
from the parent region, right?).

So, we end up trying to bulk-load a ref file into two regions (after a split), 
one of them succeeds and one of them doesn't. Thus, that new ref file 
overwrites the old refs, and if we clean up the MOB files, we lose data. Ok, I 
can see that now :)

Is this something that we can show in a unit test? I feel like that would go a 
long way to help prevent data loss in the future.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.5, 2.1.4
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-21 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798500#comment-16798500
 ] 

Hadoop QA commented on HBASE-22075:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
30s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
15s{color} | {color:red} hbase-server: The patch generated 1 new + 10 unchanged 
- 0 fixed = 11 total (was 10) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
33s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 36s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}146m 46s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}186m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncTableGetMultiThreaded |
|   | hadoop.hbase.master.TestMasterAbortAndRSGotKilled |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-22075 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12963194/HBASE-22075-v1.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux a4f6d54c7996 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 
31 10:55:11 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / cbd9c9b2e1 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| 

[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-21 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798418#comment-16798418
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

{quote}
What is creating multiple files? 
{quote}

Forced splits during bulkload due to changed region boundaries, [~elserj].

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.5, 2.1.4
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-21 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798396#comment-16798396
 ] 

Josh Elser commented on HBASE-22075:


{quote}some files (or parts of files after splitting) can be loaded, some may 
fail
{quote}
What is creating multiple files? Doesn't this compact create a single MOB file 
and then one reference file to that file?

It looks like we write the MOB file in a tmp dir and then directly mv it into 
the mobFamily for the Region whose MOB files we're compacting. Maybe I need to 
look farther up the call stack to get it?
{quote}These are artifacts of some code cleaning/modifications. The fileName 
argument is not used at all in *bulkloadRefFile* method, but only bulkloadDir, 
where ref file is located.
{quote}
Ok, looks like {{bulkloadPathOfPartition}} is an ancestor of 
{{bulkloadColumnPath}} which also makes this harder to follow. Thanks for 
clarifying that.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.5, 2.1.4
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-21 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798385#comment-16798385
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

{quote}
This looks to me that we end up bulk-loading the MOB file that we created, not 
the ref file? It's not clear to me what this is supposed to be doing.
{quote}
This is artifacts of some code cleaning/modifications. The fileName argument is 
not used at all in *bulkloadRefFile*method, but only bulkloadDir, where ref 
file is located.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.5, 2.1.4
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-21 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798374#comment-16798374
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

{quote}
Another question: In the case where we successfully created the new MOB file 
but the bulk loading of the ref file failed, how is deleting the new mob file 
causing data loss? The bulk load should be atomic, right? (Either all files are 
loaded or no files are loaded) I would think that if the bulk load of the ref 
file failed, we would be safe to get rid of that new MOB file because we still 
have all of the old MOB files.
{quote}

Bulkload is atomic per region only. If you have file(s) which span(s) several 
regions - no atomicity for you. In the latter case, some files (or parts of 
files after splitting) can be loaded, some may fail. You get partially 
successful operation and, as a result, - data loss (MOB was deleted bc of a 
compaction failure), but some parts of a new reference file(s) were loaded.  

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.5, 2.1.4
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-21 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798366#comment-16798366
 ] 

Josh Elser commented on HBASE-22075:


{quote}This sounds like a case where we ought to be using procedures?
{quote}
Yeah, sounds like this should eventually be rewritten.

I'm having a hard time wrapping my head around this code, [~vrodionov]. I'm 
hoping you can clarify some of what this is doing... Just looking at 
{{PartitionedMobCompactor#compactMobFilesInBatch}}, we open a writer for the 
MOB file and then the reference file for that MOB file. The thing I don't 
understand is that we open a writer for that ref file, but we don't appear to 
_do_ anythign with it:
{code:java}
    writer = MobUtils
    .createWriter(conf, fs, column, 
partition.getPartitionId().getLatestDate(), tempPath,
    Long.MAX_VALUE, column.getCompactionCompressionType(),
    partition.getPartitionId().getStartKey(), 
compactionCacheConfig, cryptoContext,
    true);
    cleanupTmpMobFile = true;
    filePath = writer.getPath();
...
    // create a temp file and open a writer for it in the bulkloadPath
    refFileWriter = MobUtils.createRefFileWriter(conf, fs, column, 
bulkloadColumnPath,
    fileInfo.getSecond().longValue(), compactionCacheConfig, 
cryptoContext, true);
...
  for (Cell cell : cells) {
    // write the mob cell to the mob file.
    writer.append(cell);
    // write the new reference cell to the store file.
    Cell reference = MobUtils.createMobRefCell(cell, fileName, 
this.refCellTags);
    refFileWriter.append(reference);
    mobCells++;
  }
...
    if (cleanupBulkloadDirOfPartition) {
  // append metadata and bulkload info to the ref mob file, and close 
the writer.
  closeRefFileWriter(refFileWriter, fileInfo.getFirst(), 
request.selectionTime);
    }
...
    // bulkload the ref file
    bulkloadRefFile(connection, table, bulkloadPathOfPartition, 
filePath.getName()){code}
This looks to me that we end up bulk-loading the MOB file that we created, not 
the ref file? It's not clear to me what this is supposed to be doing.

Another question: In the case where we successfully created the new MOB file 
but the bulk loading of the ref file failed, how is deleting the new mob file 
causing data loss? The bulk load should be atomic, right? (Either all files are 
loaded or no files are loaded) I would think that if the bulk load of the ref 
file failed, we would be safe to get rid of that new MOB file because we still 
have all of the old MOB files.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>  Components: mob
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.5, 2.1.4
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-20 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797710#comment-16797710
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

Marked as Critical. All 2.0, 2.1, 2.2 version are affected. I will double check 
code again, but at least, Master orchestrated MOB compaction code is still in 
2.x. Not sure, if it is still used.

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.0, 2.0.1, 2.1.1, 2.0.2, 2.0.3, 2.1.2, 2.0.4, 
> 2.1.3
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Critical
>  Labels: mob
> Fix For: 2.2.0, 2.0.5, 2.1.4
>
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-20 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797704#comment-16797704
 ] 

Sean Busbey commented on HBASE-22075:
-

This sounds like a case where we ought to be using procedures?

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-20 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797701#comment-16797701
 ] 

Sean Busbey commented on HBASE-22075:
-

What are the affected version(s)? Sounds like this should be marked Critical?

> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22075) Potential data loss when MOB compaction fails

2019-03-20 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797644#comment-16797644
 ] 

Vladimir Rodionov commented on HBASE-22075:
---

The patch v1. The patch do the following:
# Increases number of bulk load retries (for MOB compaction only) to 1000, 
which can be overwritten
# Keeps newly created MOB file even if compaction fails. This will be taken 
care of next time MOB compaction will run.



> Potential data loss when MOB compaction fails
> -
>
> Key: HBASE-22075
> URL: https://issues.apache.org/jira/browse/HBASE-22075
> Project: HBase
>  Issue Type: Bug
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
>Priority: Major
> Attachments: HBASE-22075-v1.patch
>
>
> When MOB compaction fails during last step (bulk load of a newly created 
> reference file) there is a high chance of a data loss due to partially loaded 
> reference file, cells of which refer to (now) non-existent MOB file. The 
> newly created MOB file is deleted automatically in case of a MOB compaction 
> failure, but some cells with the references to this file might be loaded to 
> HBase. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)