[ https://issues.apache.org/jira/browse/HDDS-9326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDDS-9326:
----------------------------------
    Description: 
Ran YCSB for an extended period of time, then restarted the cluster.

First of all, I am aware that HMaster doesn't start while the Ozone DN takes a 
long time to replay Ratis transaction logs (due to the redundant metadata in 
PutBlock) and SCM is in safe mode.

But even after SCM gets out of safe mode, HMaster still fails with 
IllegalArgumentException: NO_REPLICA_FOUND
{noformat}
11:43:34.199 PM ERROR HMaster Failed to become active master
java.lang.IllegalArgumentException: NO_REPLICA_FOUND
        at org.apache.hadoop.ozone.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
        at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:170)
        at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClientForReadData(XceiverClientManager.java:163)
        at org.apache.hadoop.hdds.scm.storage.BlockInputStream.acquireClient(BlockInputStream.java:285)
        at org.apache.hadoop.hdds.scm.storage.BlockInputStream.getChunkInfos(BlockInputStream.java:238)
        at org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:146)
        at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readWithStrategy(BlockInputStream.java:308)
        at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
        at org.apache.hadoop.hdds.scm.storage.ByteArrayReader.readFromBlock(ByteArrayReader.java:57)
        at org.apache.hadoop.hdds.scm.storage.MultipartInputStream.readWithStrategy(MultipartInputStream.java:96)
        at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
        at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:64)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at java.io.DataInputStream.readFully(DataInputStream.java:169)
        at org.apache.hadoop.hbase.util.FSUtils.getClusterId(FSUtils.java:526)
        at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:267)
        at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:135)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:814)
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2216)
        at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528)
        at java.lang.Thread.run(Thread.java:748)
{noformat}

  was:
Ran YCSB for an extended period of time, then restarted the cluster.

First of all, I am aware that HMaster doesn't start while the Ozone DN takes a 
long time to replay Ratis transaction logs (due to the redundant metadata in 
PutBlock) and SCM is in safe mode.

But even after SCM gets out of safe mode, HMaster still fails with 
IllegalArgumentException: NO_REPLICA_FOUND
{noformat}
11:43:34.199 PMERRORHMasterFailed to become active master
java.lang.IllegalArgumentException: NO_REPLICA_FOUND
        at org.apache.hadoop.ozone.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
        at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:170)
        at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClientForReadData(XceiverClientManager.java:163)
        at org.apache.hadoop.hdds.scm.storage.BlockInputStream.acquireClient(BlockInputStream.java:285)
        at org.apache.hadoop.hdds.scm.storage.BlockInputStream.getChunkInfos(BlockInputStream.java:238)
        at org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:146)
        at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readWithStrategy(BlockInputStream.java:308)
        at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
        at org.apache.hadoop.hdds.scm.storage.ByteArrayReader.readFromBlock(ByteArrayReader.java:57)
        at org.apache.hadoop.hdds.scm.storage.MultipartInputStream.readWithStrategy(MultipartInputStream.java:96)
        at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
        at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:64)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at java.io.DataInputStream.readFully(DataInputStream.java:169)
        at org.apache.hadoop.hbase.util.FSUtils.getClusterId(FSUtils.java:526)
        at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:267)
        at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:135)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:814)
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2216)
        at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528)
        at java.lang.Thread.run(Thread.java:748)
{noformat}


> HBase HMaster unable to start due to IllegalArgumentException: 
> NO_REPLICA_FOUND
> -------------------------------------------------------------------------------
>
>                 Key: HDDS-9326
>                 URL: https://issues.apache.org/jira/browse/HDDS-9326
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>
> Ran YCSB for an extended period of time, then restarted the cluster.
> First of all, I am aware that HMaster doesn't start while the Ozone DN takes a 
> long time to replay Ratis transaction logs (due to the redundant metadata in 
> PutBlock) and SCM is in safe mode.
>
> But even after SCM gets out of safe mode, HMaster still fails with 
> IllegalArgumentException: NO_REPLICA_FOUND
> {noformat}
> 11:43:34.199 PM ERROR HMaster Failed to become active master
> java.lang.IllegalArgumentException: NO_REPLICA_FOUND
>       at org.apache.hadoop.ozone.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
>       at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:170)
>       at org.apache.hadoop.hdds.scm.XceiverClientManager.acquireClientForReadData(XceiverClientManager.java:163)
>       at org.apache.hadoop.hdds.scm.storage.BlockInputStream.acquireClient(BlockInputStream.java:285)
>       at org.apache.hadoop.hdds.scm.storage.BlockInputStream.getChunkInfos(BlockInputStream.java:238)
>       at org.apache.hadoop.hdds.scm.storage.BlockInputStream.initialize(BlockInputStream.java:146)
>       at org.apache.hadoop.hdds.scm.storage.BlockInputStream.readWithStrategy(BlockInputStream.java:308)
>       at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
>       at org.apache.hadoop.hdds.scm.storage.ByteArrayReader.readFromBlock(ByteArrayReader.java:57)
>       at org.apache.hadoop.hdds.scm.storage.MultipartInputStream.readWithStrategy(MultipartInputStream.java:96)
>       at org.apache.hadoop.hdds.scm.storage.ExtendedInputStream.read(ExtendedInputStream.java:56)
>       at org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:64)
>       at java.io.DataInputStream.readFully(DataInputStream.java:195)
>       at java.io.DataInputStream.readFully(DataInputStream.java:169)
>       at org.apache.hadoop.hbase.util.FSUtils.getClusterId(FSUtils.java:526)
>       at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:267)
>       at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:135)
>       at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
>       at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:814)
>       at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2216)
>       at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:528)
>       at java.lang.Thread.run(Thread.java:748)
> {noformat}
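
The top frames of the trace show the exception being raised by Guava's Preconditions.checkArgument, called from XceiverClientManager.acquireClient on the read path. The following is a minimal, self-contained sketch of how such a precondition produces an IllegalArgumentException whose message is NO_REPLICA_FOUND. It is not the Ozone source: the class name, the ResultCode enum, the acquireClientForRead helper, and the assumption that the check fires when the list of replica nodes is empty are all hypothetical and used only for illustration.

```java
import java.util.Collections;
import java.util.List;

public class NoReplicaFoundSketch {
    // Hypothetical stand-in for the result code named in the exception message.
    enum ResultCode { NO_REPLICA_FOUND }

    // Mirrors the contract of Guava's Preconditions.checkArgument(boolean, Object):
    // throw IllegalArgumentException(String.valueOf(errorMessage)) when the check fails.
    static void checkArgument(boolean expression, Object errorMessage) {
        if (!expression) {
            throw new IllegalArgumentException(String.valueOf(errorMessage));
        }
    }

    // Hypothetical simplification of a read-path client lookup.
    // Assumption: the precondition rejects an empty (or missing) replica list.
    static String acquireClientForRead(List<String> replicaNodes) {
        checkArgument(replicaNodes != null && !replicaNodes.isEmpty(),
            ResultCode.NO_REPLICA_FOUND);
        // Pick the first known replica; a real client would apply its own selection policy.
        return replicaNodes.get(0);
    }

    public static void main(String[] args) {
        // With at least one replica, the lookup succeeds.
        System.out.println(acquireClientForRead(List.of("dn1")));
        try {
            // With no replicas, the precondition fails before any read is attempted.
            acquireClientForRead(Collections.emptyList());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the real check behaves like this sketch, the failure would indicate that the client was handed no usable replica locations for the block it tried to read. The report does not confirm this; it is one reading of the trace.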



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

