[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2019-10-15 Thread HBase QA (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952464#comment-16952464
 ] 

HBase QA commented on HBASE-12125:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} | {color:red} HBASE-12125 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/in-progress/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-12125 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12895999/HBASE-12125.v4.master.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/957/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.11.0 https://yetus.apache.org |


This message was automatically generated.



> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
> Issue Type: Bug
> Components: hbck, hbck2, Replication
> Affects Versions: 3.0.0, 2.3.0, 1.6.0, hbase-operator-tools-1.1.0
> Reporter: Virag Kothari
> Assignee: Vincent Poon
> Priority: Critical
> Attachments: HBASE-12125.v1.master.patch, HBASE-12125.v2.master.patch,
> HBASE-12125.v3.master.patch, HBASE-12125.v4.master.patch
>
>
> The replication source discards a WAL file in many cases when it
> encounters an exception while reading it. This can cause data loss,
> and the underlying reason for the failed read remains hidden. The replication
> source should dump the current WAL and move to the next one only in certain
> scenarios.
> This JIRA aims to add an hbck option to check the WAL files of replication
> queues for inconsistencies, and an option to fix them.
> The fix can be to remove the file from the replication queue in zk and from
> the memory of the replication source manager and replication sources.
> A region server endpoint call from the hbck client to the region server can be
> used to achieve this.
> Hbck can be configured with the following options:
> -softCheckReplicationWAL: Tries to open only the oldest WAL (the WAL
> currently read by the replication source) in the replication queue. If there
> is an associated position, it also seeks to that position and reads an entry
> from there.
> -hardCheckReplicationWAL: Checks all WAL paths in the replication queues by
> reading them completely to make sure they are ok.
> -fixMissingReplicationWAL: Removes from the replication queues the WALs that
> are not present on hdfs.
> -fixCorruptedReplicationWAL: Removes from the replication queues the WALs that
> are corrupted (based on the findings from softCheck/hardCheck). The WALs are
> also moved to a quarantine dir.
> -rollAndFixCorruptedReplicationWAL: If the current WAL is corrupted, it is
> first rolled over and then handled in the same way as with the
> -fixCorruptedReplicationWAL option.
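
The check and fix options described above might be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: the class ReplicationWalCheckSketch, the WalVerifier interface, the Status enum and both method names are hypothetical, the local filesystem (java.nio.file) stands in for hdfs, and the replication queue is a plain in-memory list rather than state in zk.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the soft/hard check idea; not taken from the HBASE-12125 patches. */
class ReplicationWalCheckSketch {

  /** Outcome of checking a single WAL from a replication queue. */
  public enum Status { OK, MISSING, CORRUPTED }

  /** Assumed stand-in for whatever opens and reads a WAL; throws IOException on a bad file. */
  @FunctionalInterface
  public interface WalVerifier {
    void verify(Path wal, boolean readFully) throws IOException;
  }

  /**
   * Soft check (hard == false): only the oldest WAL, i.e. the head of the queue, is examined.
   * Hard check (hard == true): every WAL in the queue is examined and read completely.
   */
  public static Map<Path, Status> check(List<Path> queue, boolean hard, WalVerifier verifier) {
    Map<Path, Status> findings = new LinkedHashMap<>();
    List<Path> targets = hard ? queue : queue.subList(0, Math.min(1, queue.size()));
    for (Path wal : targets) {
      if (!Files.exists(wal)) {
        // Candidate for the "fix missing" option.
        findings.put(wal, Status.MISSING);
        continue;
      }
      try {
        verifier.verify(wal, hard);
        findings.put(wal, Status.OK);
      } catch (IOException e) {
        // Candidate for the "fix corrupted" option.
        findings.put(wal, Status.CORRUPTED);
      }
    }
    return findings;
  }

  /** Returns a copy of the queue without the WALs that were reported with the given status. */
  public static List<Path> dropFromQueue(List<Path> queue, Map<Path, Status> findings,
      Status status) {
    List<Path> kept = new ArrayList<>();
    for (Path wal : queue) {
      if (findings.get(wal) != status) {
        kept.add(wal);
      }
    }
    return kept;
  }
}
```

In this sketch a fix is just a filtered copy of the queue. The description additionally calls for updating zk and the in-memory state of the replication source manager, and for moving corrupted WALs to a quarantine dir; none of that is modelled here.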



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-29 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271568#comment-16271568
 ] 

Vincent Poon commented on HBASE-12125:
--

This patch only adds replication fixes to hbck, which are independent of AMv2.
So I think this part of hbck should still be valid.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238756#comment-16238756
 ] 

Hadoop QA commented on HBASE-12125:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 33s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m  4s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 11s{color} | {color:green} hbase-replication: The patch generated 0 new + 8 unchanged - 4 fixed = 8 total (was 12) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  6s{color} | {color:green} hbase-server: The patch generated 0 new + 189 unchanged - 2 fixed = 189 total (was 191) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 41s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 47m 12s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 13s{color} | {color:green} hbase-replication in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}119m  2s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m 18s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-12125 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12895999/HBASE-12125.v4.master.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 5a1b808294bb 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238284#comment-16238284
 ] 

stack commented on HBASE-12125:
---

[~churromorales] HBCK presumes how assignment works. It also messes w/ hbase
privates. In hbase2, assignment has been redone such that the Master's
in-memory view is definitive -- no more state distributed over fs, zk, and
master -- and Master effects any and all changes. Master internals have also
changed. At a minimum HBCK no longer works, and at worst it can actually do
damage. TODO is an HBCK2. Shout if you need more detail sir.



[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-03 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238272#comment-16238272
 ] 

Andrew Purtell commented on HBASE-12125:


bq. Is there a reason all the fsck tests are ignored after the ProcedureV2 
patch went in?

They were disabled for AMv2 actually. A lot of HBCK actions are not appropriate 
for AMv2. [~stack] can say more.



[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-03 Thread churro morales (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16238224#comment-16238224
 ] 

churro morales commented on HBASE-12125:


Is there a reason all the fsck tests are ignored after the ProcedureV2 patch 
went in? 



[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236993#comment-16236993
 ] 

Ted Yu commented on HBASE-12125:


From the QA run:
{code}
[ERROR] testFixMissingReplicationWAL(org.apache.hadoop.hbase.util.TestHBaseFsckReplication)  Time elapsed: 54.85 s  <<< ERROR!
java.lang.NullPointerException
	at org.apache.hadoop.hbase.util.TestHBaseFsckReplication.testFixMissingReplicationWAL(TestHBaseFsckReplication.java:184)
{code}
which was almost identical to the error I reported yesterday.



[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236990#comment-16236990
 ] 

Hadoop QA commented on HBASE-12125:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m  9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m 44s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 12s{color} | {color:green} hbase-replication: The patch generated 0 new + 8 unchanged - 4 fixed = 8 total (was 12) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 11s{color} | {color:red} hbase-server: The patch generated 1 new + 189 unchanged - 2 fixed = 190 total (was 191) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 12s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 52m 45s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 14s{color} | {color:green} hbase-replication in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}126m 19s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}203m  2s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.util.TestHBaseFsckReplication |
|   | hadoop.hbase.security.access.TestCoprocessorWhitelistMasterObserver |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-12125 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12895502/HBASE-12125.v3.master.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 207597ce9dda 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 

[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-02 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236859#comment-16236859
 ] 

Vincent Poon commented on HBASE-12125:
--

[~tedyu] I get the "java.lang.NoSuchMethodError:
org.eclipse.jetty.server.session.SessionHandler.getSessionManager()" from
HADOOP-14930, how did you get around that?

I used your command options and added "-Djetty.version=9.3.19.v20170502" and 
the test passed for me.  Thanks!



[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16236662#comment-16236662
 ] 

Ted Yu commented on HBASE-12125:


I use the following command options:
{code}
-Phadoop-3.0 -Dhadoop-three.version=3.0.0-beta1 -Dhadoop-two.version=3.0.0-beta1
{code}



[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-02 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236644#comment-16236644
 ] 

Mike Drob commented on HBASE-12125:
---

You can build against hadoop3 by specifying {{-Dhadoop.profile=3}} in your maven 
command line. I think that will get you 3.0.0-alpha4.

You can also specify {{-Dhadoop-three.version=3.0.0-beta1}} if you want.

> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 3.0.0
>Reporter: Virag Kothari
>Assignee: Vincent Poon
>Priority: Major
> Attachments: HBASE-12125.v1.master.patch, 
> HBASE-12125.v2.master.patch, HBASE-12125.v3.master.patch
>


[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-02 Thread Vincent Poon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236634#comment-16236634
 ] 

Vincent Poon commented on HBASE-12125:
--

[~tedyu] How do I test against hadoop3? Do I just change "" in pom.xml 
to "${hadoop-three.version}"? That passed for me.

> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 3.0.0
>Reporter: Virag Kothari
>Assignee: Vincent Poon
>Priority: Major
> Attachments: HBASE-12125.v1.master.patch, HBASE-12125.v2.master.patch
>


[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234982#comment-16234982
 ] 

Ted Yu commented on HBASE-12125:


Running new test against hadoop3 beta1, I got:
{code}
testFixMissingReplicationWAL(org.apache.hadoop.hbase.util.TestHBaseFsckReplication)  Time elapsed: 49.211 sec  <<< ERROR!
java.lang.NullPointerException
    at org.apache.hadoop.hbase.util.TestHBaseFsckReplication.testFixMissingReplicationWAL(TestHBaseFsckReplication.java:184)
{code}
Please see whether the above can be reproduced on hadoop2.

> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 3.0.0
>Reporter: Virag Kothari
>Assignee: Vincent Poon
>Priority: Major
> Attachments: HBASE-12125.v1.master.patch, HBASE-12125.v2.master.patch
>


[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-11-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16233773#comment-16233773
 ] 

Hadoop QA commented on HBASE-12125:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 4m 35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 14s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 10s{color} | {color:red} hbase-replication: The patch generated 5 new + 9 unchanged - 3 fixed = 14 total (was 12) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 3s{color} | {color:red} hbase-server: The patch generated 23 new + 189 unchanged - 2 fixed = 212 total (was 191) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red} 2m 59s{color} | {color:red} patch has 10 errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 42m 27s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 15s{color} | {color:green} hbase-replication in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 88m 10s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 37s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m 52s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-12125 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12895082/HBASE-12125.v1.master.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 9f0908b24d09 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git 

[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-10-31 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227696#comment-16227696
 ] 

Ted Yu commented on HBASE-12125:


Can you put the patch on Review Board?

> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 3.0.0
>Reporter: Virag Kothari
>Assignee: Vincent Poon
> Attachments: HBASE-12125.v1.master.patch
>


[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2015-09-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732053#comment-14732053
 ] 

Andrew Purtell commented on HBASE-12125:


For 0.98 you mean?
If so, yes that branch is accepting improvements. 

> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Virag Kothari
>Assignee: Virag Kothari
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2015-09-04 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14731652#comment-14731652
 ] 

Mikhail Antonov commented on HBASE-12125:
-

bq. I have a 0.98 patch for this. Will put for review.

Apparently the code base has moved forward since then, but I'm curious whether 
this patch would still be relevant and useful here.


> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Virag Kothari
>Assignee: Virag Kothari
>


[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2014-10-01 Thread Virag Kothari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154441#comment-14154441
 ] 

Virag Kothari commented on HBASE-12125:
---

A WAL roll on region server would be required only if the current WAL (WAL 
being written to) is corrupted. So fixCorruptedReplicationWAL can be useful if 
we know that the current WAL being written to is ok.
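
The distinction between the two fix options can be read as a small decision rule. The sketch below is illustrative only: {{RollDecisionSketch}}, the {{Action}} enum, {{chooseAction}}, and the WAL names are hypothetical, and it assumes the set of corrupted WALs has already been determined by one of the checks.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical decision rule for picking a fix option; not HBase code. */
final class RollDecisionSketch {

    enum Action { NOTHING_TO_FIX, FIX_CORRUPTED, ROLL_AND_FIX_CORRUPTED }

    /**
     * A roll is needed only when the WAL currently being written to is itself
     * among the corrupted ones; otherwise removing the corrupted WALs suffices.
     */
    static Action chooseAction(Set<String> corruptedWals, String currentWal) {
        if (corruptedWals.isEmpty()) {
            return Action.NOTHING_TO_FIX;
        }
        if (corruptedWals.contains(currentWal)) {
            return Action.ROLL_AND_FIX_CORRUPTED;
        }
        return Action.FIX_CORRUPTED;
    }

    public static void main(String[] args) {
        Set<String> corrupted = new HashSet<>(Arrays.asList("wal.1"));
        // Current WAL is healthy: a plain fix is enough.
        System.out.println(chooseAction(corrupted, "wal.3"));
        // Current WAL is corrupted: roll first, then fix.
        System.out.println(chooseAction(corrupted, "wal.1"));
        // Nothing flagged by the checks.
        System.out.println(chooseAction(Collections.<String>emptySet(), "wal.3"));
    }
}
```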

 Add Hbck option to check and fix WAL's from replication queue
 -

 Key: HBASE-12125
 URL: https://issues.apache.org/jira/browse/HBASE-12125
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Virag Kothari
Assignee: Virag Kothari



[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2014-09-30 Thread Virag Kothari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153884#comment-14153884
 ] 

Virag Kothari commented on HBASE-12125:
---

I have a 0.98 patch for this. Will put for review.

 Add Hbck option to check and fix WAL's from replication queue
 -

 Key: HBASE-12125
 URL: https://issues.apache.org/jira/browse/HBASE-12125
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Virag Kothari
Assignee: Virag Kothari



[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2014-09-30 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153936#comment-14153936
 ] 

Andrew Purtell commented on HBASE-12125:


{quote}
-fixCorruptedReplicationWAL: Remove the WAL's from replication queues which are 
corrupted (based on the findings from softCheck/hardCheck). Also the WAL's are 
moved to a quarantine dir
-rollAndFixCorruptedReplicationWAL - If the current WAL is corrupted, it is 
first rolled over and then deals with it in the same way as 
-fixCorruptedReplicationWAL option
{quote}
When would we want fixCorruptedReplicationWAL instead of 
rollAndFixCorruptedReplicationWAL?

 Add Hbck option to check and fix WAL's from replication queue
 -

 Key: HBASE-12125
 URL: https://issues.apache.org/jira/browse/HBASE-12125
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Virag Kothari
Assignee: Virag Kothari
