[
https://issues.apache.org/jira/browse/HDDS-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890814#comment-16890814
]
Sammi Chen commented on HDDS-1751:
----------------------------------
Yes, it's fixed by HDDS-1713. I ran "src/test/bin/start-chaos.sh" locally with
SCMContainerPlacementRackAware as the placement policy. Here is the log:
2019-07-23 16:57:38,336 INFO container.ReplicationManager (ReplicationManager.java:handleUnderReplicatedContainer(489)) - Container #3 is under replicated. Expected replica count is 3, but found 2.
2019-07-23 16:57:38,336 INFO container.ReplicationManager (ReplicationManager.java:sendReplicateCommand(652)) - Sending replicate container command for container #3 to datanode e4635174-5f4b-4141-aea3-d994486370aa{ip: 127.0.0.1, host: vm-centos, networkLocation: /default-rack, certSerialId: null}
2019-07-23 16:57:38,336 INFO container.ReplicationManager (ReplicationManager.java:handleUnderReplicatedContainer(489)) - Container #9 is under replicated. Expected replica count is 3, but found 2.
2019-07-23 16:57:38,336 INFO container.ReplicationManager (ReplicationManager.java:sendReplicateCommand(652)) - Sending replicate container command for container #9 to datanode 718da402-1433-4b44-8479-0a42e47929fd{ip: 127.0.0.1, host: vm-centos, networkLocation: /default-rack, certSerialId: null}
2019-07-23 16:57:38,336 INFO container.ReplicationManager (ReplicationManager.java:handleUnderReplicatedContainer(489)) - Container #10 is under replicated. Expected replica count is 3, but found 2.
2019-07-23 16:57:38,336 INFO container.ReplicationManager (ReplicationManager.java:sendReplicateCommand(652)) - Sending replicate container command for container #10 to datanode 718da402-1433-4b44-8479-0a42e47929fd{ip: 127.0.0.1, host: vm-centos, networkLocation: /default-rack, certSerialId: null}
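
For anyone reproducing this run, the following is a minimal sketch of how the placement policy might be selected in ozone-site.xml. It assumes SCM reads the policy class from the key ozone.scm.container.placement.impl; that key name is an assumption and should be verified against ozone-default.xml for the version in use. The class name is the one that appears in the stack trace of the original report. How start-chaos.sh itself picks up the policy is not stated in this comment.

{code}
<!-- ozone-site.xml fragment (sketch). The key name is an assumption; verify it in ozone-default.xml. -->
<property>
  <name>ozone.scm.container.placement.impl</name>
  <value>org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware</value>
</property>
{code}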
> replication of underReplicated container fails with
> SCMContainerPlacementRackAware policy
> -----------------------------------------------------------------------------------------
>
> Key: HDDS-1751
> URL: https://issues.apache.org/jira/browse/HDDS-1751
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Affects Versions: 0.4.0
> Reporter: Mukul Kumar Singh
> Assignee: Sammi Chen
> Priority: Major
> Labels: MiniOzoneChaosCluster
>
> SCM container replication fails with
> {code}
> 2019-07-02 18:26:41,564 WARN container.ReplicationManager (ReplicationManager.java:handleUnderReplicatedContainer(501)) - Exception while replicating container 18.
> org.apache.hadoop.hdds.scm.exceptions.SCMException: No enough datanodes to choose.
> at org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseDatanodes(SCMContainerPlacementRackAware.java:100)
> at org.apache.hadoop.hdds.scm.container.ReplicationManager.handleUnderReplicatedContainer(ReplicationManager.java:487)
> at org.apache.hadoop.hdds.scm.container.ReplicationManager.processContainer(ReplicationManager.java:293)
> at java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4649)
> at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
> at org.apache.hadoop.hdds.scm.container.ReplicationManager.run(ReplicationManager.java:205)
> at java.lang.Thread.run(Thread.java:748)
> {code}