[ https://issues.apache.org/jira/browse/HBASE-1436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-1436.
--------------------------
Resolution: Fixed
Committed the below:
{code}
Index: src/java/org/apache/hadoop/hbase/regionserver/Store.java
===================================================================
--- src/java/org/apache/hadoop/hbase/regionserver/Store.java (revision 777167)
+++ src/java/org/apache/hadoop/hbase/regionserver/Store.java (working copy)
@@ -356,7 +356,15 @@
LOG.warn("Skipping " + p + " because its empty. HBASE-646 DATA LOSS?");
continue;
}
- StoreFile curfile = new StoreFile(fs, p);
+ StoreFile curfile = null;
+ try {
+ curfile = new StoreFile(fs, p);
+ } catch (IOException ioe) {
+ LOG.warn("Failed open of " + p + "; presumption is that file was " +
+ "corrupted at flush and lost edits picked up by commit log replay. " +
+ "Verify!", ioe);
+ continue;
+ }
long storeSeqId = curfile.getMaxSequenceId();
if (storeSeqId > this.maxSeqId) {
this.maxSeqId = storeSeqId;
{code}
With this change we log the corrupted file and keep going rather than failing the region open.
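As a rough illustration of the behaviour in the patch above, here is a self-contained sketch; it is not the HBase code. `FakeStoreFile`, `Opener`, and `LoadResult` are hypothetical stand-ins invented for the example, and `java.util.logging` replaces the real logger. A file whose open throws `IOException` is logged and skipped, and because the `continue` runs before the sequence-id comparison, a skipped file never raises `maxSeqId`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SkipCorruptFiles {
  private static final Logger LOG = Logger.getLogger("SkipCorruptFiles");

  /** Hypothetical stand-in for StoreFile: a path plus the highest sequence id it holds. */
  static final class FakeStoreFile {
    final String path;
    final long maxSequenceId;

    FakeStoreFile(String path, long maxSequenceId) {
      this.path = path;
      this.maxSequenceId = maxSequenceId;
    }
  }

  /** Hypothetical opener; an implementation throws IOException to simulate a bad trailer. */
  interface Opener {
    FakeStoreFile open(String path) throws IOException;
  }

  /** Files that opened cleanly, and the highest sequence id seen among them. */
  static final class LoadResult {
    final List<FakeStoreFile> files;
    final long maxSeqId;

    LoadResult(List<FakeStoreFile> files, long maxSeqId) {
      this.files = files;
      this.maxSeqId = maxSeqId;
    }
  }

  static LoadResult loadStoreFiles(List<String> paths, Opener opener) {
    List<FakeStoreFile> opened = new ArrayList<>();
    long maxSeqId = -1;
    for (String p : paths) {
      FakeStoreFile curfile;
      try {
        curfile = opener.open(p);
      } catch (IOException ioe) {
        // Same shape as the patch: warn, then move on to the next file.
        LOG.log(Level.WARNING, "Failed open of " + p + "; skipping", ioe);
        continue;
      }
      // Only files that opened successfully contribute to maxSeqId.
      if (curfile.maxSequenceId > maxSeqId) {
        maxSeqId = curfile.maxSequenceId;
      }
      opened.add(curfile);
    }
    return new LoadResult(opened, maxSeqId);
  }

  public static void main(String[] args) {
    // Paths and sequence ids here are made up for the demonstration.
    List<String> paths = Arrays.asList("good-1", "corrupt-2", "good-3");
    LoadResult r = loadStoreFiles(paths, p -> {
      if (p.startsWith("corrupt")) {
        throw new IOException("Trailer 'header' is wrong");
      }
      return new FakeStoreFile(p, p.equals("good-1") ? 5L : 9L);
    });
    System.out.println("opened=" + r.files.size() + " maxSeqId=" + r.maxSeqId);
  }
}
```

The design point is that one unreadable file no longer aborts the whole load; whether the skipped edits are actually recovered still has to be verified, as the log message in the patch says.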
> Killing regionserver can make corrupted hfile
> ---------------------------------------------
>
> Key: HBASE-1436
> URL: https://issues.apache.org/jira/browse/HBASE-1436
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Fix For: 0.20.0
>
>
> While testing the sync patch I've been killing the HRS. It's pretty easy to make a corrupt
> hfile doing this:
> {code}
> 2009-05-18 23:00:42,889 [regionserver/0:0:0:0:0:0:0:0:60021.worker] ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening TestTable,0651512447,1242687355411
> java.io.IOException: Trailer 'header' is wrong; does the trailer size match content?
> at org.apache.hadoop.hbase.io.hfile.HFile$FixedFileTrailer.deserialize(HFile.java:1289)
> at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:799)
> at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:744)
> at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:217)
> at org.apache.hadoop.hbase.regionserver.StoreFile.<init>(StoreFile.java:107)
> at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:359)
> at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:206)
> at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1839)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:290)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1556)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1527)
> at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1442)
> at java.lang.Thread.run(Thread.java:619)
> {code}
> This issue is about just removing the corrupted store file and moving on.
> Currently the region can't open because we keep getting the above exception. We should
> also make sure that it's safe to just remove the file, i.e. that the replay of the HRS log
> files will have the content of the memcache that failed to persist.