[jira] [Commented] (HDDS-6830) EC: SCMContainerPlacementRackScatter#chooseDatanodes may choose less nodes than required in unknown cases.

Nilotpal Nandi (Jira) Sat, 11 Jun 2022 23:07:06 -0700


    [ 
https://issues.apache.org/jira/browse/HDDS-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17553220#comment-17553220
 ]


Nilotpal Nandi commented on HDDS-6830:
--------------------------------------

[~umamaheswararao] - Please find the stack trace below. Let me know if this is 
the one you were asking for.

 
{noformat}
2022-06-08 19:57:51,643 ERROR org.apache.hadoop.hdds.scm.pipeline.WritableECContainerProvider: Unable to allocate a container for EC/ECReplicationConfig{data=4, parity=2, ecChunkSize=104448, codec=rs} after trying all existing containers
org.apache.hadoop.hdds.scm.exceptions.SCMException: Nodes size= 5, replication factor= 6 do not match
        at org.apache.hadoop.hdds.scm.pipeline.PipelineFactory.checkPipeline(PipelineFactory.java:106)
        at org.apache.hadoop.hdds.scm.pipeline.PipelineFactory.create(PipelineFactory.java:90)
        at org.apache.hadoop.hdds.scm.pipeline.PipelineManagerImpl.createPipeline(PipelineManagerImpl.java:195)
        at org.apache.hadoop.hdds.scm.pipeline.WritableECContainerProvider.allocateContainer(WritableECContainerProvider.java:172)
        at org.apache.hadoop.hdds.scm.pipeline.WritableECContainerProvider.getContainer(WritableECContainerProvider.java:155)
        at org.apache.hadoop.hdds.scm.pipeline.WritableECContainerProvider.getContainer(WritableECContainerProvider.java:51)
        at org.apache.hadoop.hdds.scm.pipeline.WritableContainerFactory.getContainer(WritableContainerFactory.java:59)
        at org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:176)
        at org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:194)
        at org.apache.hadoop.hdds.scm.protocol.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:192)
        at org.apache.hadoop.hdds.scm.protocol.ScmBlockLocationProtocolServerSideTranslatorPB.processMessage(ScmBlockLocationProtocolServerSideTranslatorPB.java:142)
        at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
        at org.apache.hadoop.hdds.scm.protocol.ScmBlockLocationProtocolServerSideTranslatorPB.send(ScmBlockLocationProtocolServerSideTranslatorPB.java:113)
        at org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:14202)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894){noformat}

> EC: SCMContainerPlacementRackScatter#chooseDatanodes may choose less nodes 
> than required in unknown cases.
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-6830
>                 URL: https://issues.apache.org/jira/browse/HDDS-6830
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Nilotpal Nandi
>            Assignee: Attila Doroszlai
>            Priority: Minor
>
> This issue may be masked because we have a sanity check at the end of the 
> chooseDatanodes method that throws an SCMException. However, the method 
> should already fail in the earlier checks if it cannot choose enough nodes. 
> With the sanity check in place we may no longer hit the following condition, 
> but we can still investigate why SCMContainerPlacementRackScatter gives up 
> silently without an exception. 
> {code:java}
> 2022-05-19 20:15:35,536 ERROR org.apache.ratis.statemachine.StateMachine: Terminating with exit status 1: Nodes size=4, replication factor=5 do not match
> java.lang.IllegalArgumentException: Nodes size=4, replication factor=5 do not match
>         at com.google.common.base.Preconditions.checkArgument(Preconditions.java:303)
>         at org.apache.hadoop.hdds.scm.pipeline.PipelineStateMap.addPipeline(PipelineStateMap.java:74)
>         at org.apache.hadoop.hdds.scm.pipeline.PipelineStateManagerImpl.addPipeline(PipelineStateManagerImpl.java:99)
>         at jdk.internal.reflect.GeneratedMethodAccessor222.invoke(Unknown Source)
>         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>         at org.apache.hadoop.hdds.scm.ha.SCMStateMachine.process(SCMStateMachine.java:168)
>         at org.apache.hadoop.hdds.scm.ha.SCMStateMachine.applyTransaction(SCMStateMachine.java:139)
>         at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1723)
>         at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:234)
>         at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:179)
>         at java.base/java.lang.Thread.run(Thread.java:834){code}
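
For readers following the thread, here is a minimal sketch of the kind of sanity check the description refers to. It is not the actual Ozone code: the class PlacementSanityCheck, the exception PlacementException (a stand-in for SCMException) and the method chooseNodes are hypothetical names used only for illustration. It assumes an EC pipeline needs data + parity nodes (6 for data=4, parity=2) and shows a placement step that fails loudly instead of silently returning a short list.

```java
import java.util.ArrayList;
import java.util.List;

public class PlacementSanityCheck {

    /** Hypothetical stand-in for SCMException; not the real Ozone class. */
    public static class PlacementException extends Exception {
        public PlacementException(String message) {
            super(message);
        }
    }

    /** Nodes an EC pipeline needs: one per data block plus one per parity block. */
    public static int requiredNodes(int data, int parity) {
        return data + parity;
    }

    /**
     * Hypothetical placement step. Picks up to `required` nodes from the
     * candidate list, then verifies the count before returning, so a short
     * result surfaces as an exception at the placement layer.
     */
    public static List<String> chooseNodes(List<String> candidates, int required)
            throws PlacementException {
        int take = Math.min(required, candidates.size());
        List<String> chosen = new ArrayList<>(candidates.subList(0, take));
        // Sanity check: never hand back fewer nodes than requested.
        if (chosen.size() != required) {
            throw new PlacementException("Nodes size= " + chosen.size()
                + ", replication factor= " + required + " do not match");
        }
        return chosen;
    }

    public static void main(String[] args) {
        int required = requiredNodes(4, 2);
        List<String> candidates = List.of("dn1", "dn2", "dn3", "dn4", "dn5");
        try {
            chooseNodes(candidates, required);
        } catch (PlacementException e) {
            // Only 5 candidates are available for a request of 6.
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this in the placement step, a shortfall would be reported where the nodes are chosen rather than later in PipelineFactory.checkPipeline or PipelineStateMap.addPipeline, which is where the two traces in this thread surface it.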



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
