[
https://issues.apache.org/jira/browse/HDDS-11012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei-Chiu Chuang resolved HDDS-11012.
------------------------------------
Resolution: Duplicate
Fixed by HDDS-11014
> [Hbase-Ozone] HMaster down with NO_REPLICA_FOUND causing
> "CorruptHFileException: Problem reading HFile Trailer"
> ---------------------------------------------------------------------------------------------------------------
>
> Key: HDDS-11012
> URL: https://issues.apache.org/jira/browse/HDDS-11012
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM
> Reporter: Pratyush Bhatt
> Priority: Blocker
>
> Both HMasters went down abruptly with {_}IllegalArgumentException:
> NO_REPLICA_FOUND{_}, which caused
> _"CorruptHFileException: Problem reading HFile Trailer from file"_.
> *Stack Trace:*
> {code:java}
> 2024-06-13 02:57:51,744 ERROR org.apache.hadoop.hbase.master.HMaster: Failed to become active master
> java.io.IOException: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file ofs://ozone1717496222/volhbase-new07062024/buckethbase-1717572506/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/proc/91207977e6d74ba2ba6a564570832563
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1144)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1087)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:990)
>     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:940)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7904)
>     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7861)
>     at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:307)
>     at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:424)
>     at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:122)
>     at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:848)
>     at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2216)
>     at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file ofs://ozone1717496222/volhbase-new07062024/buckethbase-1717572506/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/proc/91207977e6d74ba2ba6a564570832563
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.openStoreFiles(StoreEngine.java:284)
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.initialize(StoreEngine.java:334)
>     at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:306)
>     at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:6365)
>     at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1110)
>     at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:1107)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     ... 1 more
> Caused by: org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file ofs://ozone1717496222/volhbase-new07062024/buckethbase-1717572506/hbase/MasterData/data/master/store/1595e783b53d99cd5eef43b6debb2682/proc/91207977e6d74ba2ba6a564570832563
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:349)
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.<init>(HFileInfo.java:123)
>     at org.apache.hadoop.hbase.regionserver.StoreFileInfo.initHFileInfo(StoreFileInfo.java:706)
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.open(HStoreFile.java:364)
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.initReader(HStoreFile.java:485)
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.createStoreFileAndReader(StoreEngine.java:224)
>     at org.apache.hadoop.hbase.regionserver.StoreEngine.lambda$openStoreFiles$0(StoreEngine.java:262)
>     ... 6 more
> Caused by: java.lang.IllegalArgumentException: NO_REPLICA_FOUND
>     at org.apache.hadoop.ozone.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
>     at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:180)
>     at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClientForReadData(XceiverClientManager.java:161)
>     at org.apache.hadoop.hdds.scm.storage.BlockInputStream.acquireClient(BlockInputStream.java:342)
>     at org.apache.hadoop.hdds.scm.storage.BlockInputStream.getBlockData(BlockInputStream.java:258)
>     at org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:164)
>     at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readWithStrategy(BlockInputStream.java:370)
>     at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
>     at org.apache.hadoop.hdds.scm.storage.ByteArrayReader.readFromBlock(ByteArrayReader.java:54)
>     at org.apache.hadoop.hdds.scm.storage.MultipartInputStream.readWithStrategy(MultipartInputStream.java:96)
>     at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
>     at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:81)
>     at java.io.DataInputStream.readFully(DataInputStream.java:195)
>     at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:394)
>     at org.apache.hadoop.hbase.io.hfile.HFileInfo.initTrailerAndContext(HFileInfo.java:339)
>     ... 12 more
> 2024-06-13 02:57:51,745 ERROR org.apache.hadoop.hbase.master.HMaster: ***** ABORTING master vc0121.xyz.com,22001,1718272586518: Unhandled exception. Starting shutdown. *****
> {code}
> cc: [~ashishk] [~Sammi] [~weichiu]
>
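
The trace in the quoted report nests an IllegalArgumentException raised by a Preconditions.checkArgument call inside several IOException layers, so the top-level message hides the actual trigger. The following is a minimal, self-contained sketch of that wrapping and of walking the cause chain back to the innermost exception. It is illustrative only and is not Ozone or HBase code: the class RootCauseSketch, the methods readTrailer and rootCause, and the local checkArgument stand-in are all hypothetical names introduced here.

```java
import java.io.IOException;

public class RootCauseSketch {

    // Local stand-in for a Guava-style precondition check (assumption: not the real
    // shaded Preconditions class). Throws IllegalArgumentException when the condition fails.
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    // Hypothetical reader: fails when no replica is available, and reports the
    // storage-level failure as a file-level IOException, as the trace does.
    static void readTrailer(int availableReplicas) throws IOException {
        try {
            checkArgument(availableReplicas > 0, "NO_REPLICA_FOUND");
        } catch (IllegalArgumentException e) {
            throw new IOException("Problem reading HFile Trailer from file", e);
        }
    }

    // Follow getCause() until the innermost exception is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        try {
            readTrailer(0);
        } catch (IOException e) {
            Throwable root = rootCause(e);
            // Outer message, then the innermost exception class and message.
            System.out.println(e.getMessage());
            System.out.println(root.getClass().getName() + ": " + root.getMessage());
        }
    }
}
```

Read this way, the "corrupt HFile" message in the report is a symptom of the failed read; the innermost exception points at replica lookup rather than at the file contents.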