[jira] [Commented] (HBASE-25053) WAL replay should ignore 0-length files

2021-11-27 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449922#comment-17449922
 ] 

Hudson commented on HBASE-25053:


Results for branch branch-2.3
[build #317 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/317/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/317/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/317/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/317/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2.3/317/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> WAL replay should ignore 0-length files
> ---
>
> Key: HBASE-25053
> URL: https://issues.apache.org/jira/browse/HBASE-25053
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Affects Versions: 2.3.1
>Reporter: Nick Dimiduk
>Assignee: Yulin Niu
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0, 2.3.8
>
>
> I overdrove a small testing cluster, filling HDFS. After cleaning up data to 
> bring HBase back up, I noticed all masters -refused to start- abort. Logs 
> complain of seeking past EOF. Indeed the last wal file name logged is a 
> 0-length file. WAL replay should gracefully skip and clean up such an empty 
> file.
> {noformat}
> 2020-09-16 19:51:30,297 ERROR org.apache.hadoop.hbase.master.HMaster: Failed 
> to become active master
> java.io.EOFException: Cannot seek after EOF
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1448)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4859)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4765)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1014)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:956)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7496)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7454)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:949)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2240)
> at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:622)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-25053) WAL replay should ignore 0-length files

2021-11-27 Thread Yulin Niu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449876#comment-17449876
 ] 

Yulin Niu commented on HBASE-25053:
---

[~zhangduo] Thanks for tips

> WAL replay should ignore 0-length files
> ---
>
> Key: HBASE-25053
> URL: https://issues.apache.org/jira/browse/HBASE-25053
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Affects Versions: 2.3.1
>Reporter: Nick Dimiduk
>Assignee: Yulin Niu
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0, 2.3.8
>
>
> I overdrove a small testing cluster, filling HDFS. After cleaning up data to 
> bring HBase back up, I noticed all masters -refused to start- abort. Logs 
> complain of seeking past EOF. Indeed the last wal file name logged is a 
> 0-length file. WAL replay should gracefully skip and clean up such an empty 
> file.
> {noformat}
> 2020-09-16 19:51:30,297 ERROR org.apache.hadoop.hbase.master.HMaster: Failed 
> to become active master
> java.io.EOFException: Cannot seek after EOF
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1448)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4859)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4765)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1014)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:956)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7496)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7454)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:949)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2240)
> at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:622)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-25053) WAL replay should ignore 0-length files

2021-11-27 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449840#comment-17449840
 ] 

Duo Zhang commented on HBASE-25053:
---

I think branch-2.3 is already EOL? There will be no 2.3.8 release...

> WAL replay should ignore 0-length files
> ---
>
> Key: HBASE-25053
> URL: https://issues.apache.org/jira/browse/HBASE-25053
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Affects Versions: 2.3.1
>Reporter: Nick Dimiduk
>Assignee: Yulin Niu
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0, 2.3.8
>
>
> I overdrove a small testing cluster, filling HDFS. After cleaning up data to 
> bring HBase back up, I noticed all masters -refused to start- abort. Logs 
> complain of seeking past EOF. Indeed the last wal file name logged is a 
> 0-length file. WAL replay should gracefully skip and clean up such an empty 
> file.
> {noformat}
> 2020-09-16 19:51:30,297 ERROR org.apache.hadoop.hbase.master.HMaster: Failed 
> to become active master
> java.io.EOFException: Cannot seek after EOF
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1448)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4859)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4765)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1014)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:956)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7496)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7454)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:949)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2240)
> at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:622)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-25053) WAL replay should ignore 0-length files

2021-11-27 Thread Yulin Niu (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449827#comment-17449827
 ] 

Yulin Niu commented on HBASE-25053:
---

Push to branch-2.3

> WAL replay should ignore 0-length files
> ---
>
> Key: HBASE-25053
> URL: https://issues.apache.org/jira/browse/HBASE-25053
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Affects Versions: 2.3.1
>Reporter: Nick Dimiduk
>Assignee: Yulin Niu
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0, 2.3.8
>
>
> I overdrove a small testing cluster, filling HDFS. After cleaning up data to 
> bring HBase back up, I noticed all masters -refused to start- abort. Logs 
> complain of seeking past EOF. Indeed the last wal file name logged is a 
> 0-length file. WAL replay should gracefully skip and clean up such an empty 
> file.
> {noformat}
> 2020-09-16 19:51:30,297 ERROR org.apache.hadoop.hbase.master.HMaster: Failed 
> to become active master
> java.io.EOFException: Cannot seek after EOF
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1448)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4859)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4765)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1014)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:956)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7496)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7454)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:949)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2240)
> at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:622)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-25053) WAL replay should ignore 0-length files

2020-11-05 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226894#comment-17226894
 ] 

Hudson commented on HBASE-25053:


Results for branch branch-2
[build #94 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/94/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/94/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/94/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/94/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/94/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> WAL replay should ignore 0-length files
> ---
>
> Key: HBASE-25053
> URL: https://issues.apache.org/jira/browse/HBASE-25053
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Affects Versions: 2.3.1
>Reporter: Nick Dimiduk
>Assignee: niuyulin
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> I overdrove a small testing cluster, filling HDFS. After cleaning up data to 
> bring HBase back up, I noticed all masters -refused to start- abort. Logs 
> complain of seeking past EOF. Indeed the last wal file name logged is a 
> 0-length file. WAL replay should gracefully skip and clean up such an empty 
> file.
> {noformat}
> 2020-09-16 19:51:30,297 ERROR org.apache.hadoop.hbase.master.HMaster: Failed 
> to become active master
> java.io.EOFException: Cannot seek after EOF
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1448)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4859)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4765)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1014)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:956)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7496)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7454)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:949)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2240)
> at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:622)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25053) WAL replay should ignore 0-length files

2020-11-05 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226708#comment-17226708
 ] 

Hudson commented on HBASE-25053:


Results for branch master
[build #117 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/117/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/117/General_20Nightly_20Build_20Report/]






(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/117/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/117/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> WAL replay should ignore 0-length files
> ---
>
> Key: HBASE-25053
> URL: https://issues.apache.org/jira/browse/HBASE-25053
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Affects Versions: 2.3.1
>Reporter: Nick Dimiduk
>Assignee: niuyulin
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> I overdrove a small testing cluster, filling HDFS. After cleaning up data to 
> bring HBase back up, I noticed all masters -refused to start- abort. Logs 
> complain of seeking past EOF. Indeed the last wal file name logged is a 
> 0-length file. WAL replay should gracefully skip and clean up such an empty 
> file.
> {noformat}
> 2020-09-16 19:51:30,297 ERROR org.apache.hadoop.hbase.master.HMaster: Failed 
> to become active master
> java.io.EOFException: Cannot seek after EOF
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1448)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4859)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4765)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1014)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:956)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7496)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7454)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:949)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2240)
> at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:622)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25053) WAL replay should ignore 0-length files

2020-11-03 Thread niuyulin (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225808#comment-17225808
 ] 

niuyulin commented on HBASE-25053:
--

[~stack] mind help review this pr and see if this issue can be closed

> WAL replay should ignore 0-length files
> ---
>
> Key: HBASE-25053
> URL: https://issues.apache.org/jira/browse/HBASE-25053
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Affects Versions: 2.3.1
>Reporter: Nick Dimiduk
>Assignee: niuyulin
>Priority: Major
>
> I overdrove a small testing cluster, filling HDFS. After cleaning up data to 
> bring HBase back up, I noticed all masters -refused to start- abort. Logs 
> complain of seeking past EOF. Indeed the last wal file name logged is a 
> 0-length file. WAL replay should gracefully skip and clean up such an empty 
> file.
> {noformat}
> 2020-09-16 19:51:30,297 ERROR org.apache.hadoop.hbase.master.HMaster: Failed 
> to become active master
> java.io.EOFException: Cannot seek after EOF
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1448)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
> at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
> at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4859)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4765)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1014)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:956)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7496)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7454)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
> at 
> org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:949)
> at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2240)
> at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:622)
> at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)