[
https://issues.apache.org/jira/browse/HDDS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aravindan Vijayan updated HDDS-1625:
------------------------------------
Description:
While performing a scale test, we had to start a new OM instance (with a new
UUID) to talk to an existing SCM that had 15 pipelines and many existing
containers. When we tried to write through the new OM instance, all writes
failed with the following error:
{code}
2019-05-31 13:53:16,808 WARN org.apache.hadoop.hdds.scm.container.SCMContainerManager: Container allocation failed for pipeline=Pipeline[ Id: 3e0eec4d-67d1-4582-a9e9-e68b0a340de6, Nodes: abaea3d2-a8c1-47de-8cdb-7cc5ed8f23a6{ip: 10.17.219.50, host: vc1340.halxg.cloudera.com, certSerialId: null}, Type:RATIS, Factor:ONE, State:OPEN] requiredSize=268435456 {}
java.util.ConcurrentModificationException
    at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1211)
    at java.util.TreeMap$KeyIterator.next(TreeMap.java:1265)
    at org.apache.hadoop.hdds.scm.container.SCMContainerManager.getContainersForOwner(SCMContainerManager.java:473)
    at org.apache.hadoop.hdds.scm.container.SCMContainerManager.getMatchingContainer(SCMContainerManager.java:394)
    at org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:203)
    at org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:172)
{code}
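The top frames of the trace (TreeMap$KeyIterator.next called from
getContainersForOwner) are what a fail-fast TreeMap key iterator produces when
the map is structurally modified while it is being iterated. The sketch below
is not SCM code and is not the fix for this issue; the class and method names
are hypothetical. It only illustrates the failure mode with a plain TreeSet
(which is backed by a TreeMap), using a single-threaded modification in place
of a concurrent writer, and contrasts it with a ConcurrentSkipListSet, whose
iterators are weakly consistent and do not throw.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical illustration only; none of these names exist in Ozone/HDDS.
final class ContainerIdIterationSketch {

  // Iterates the set and adds an element part-way through, standing in for
  // a second thread that allocates a container during the scan.
  // Returns true if the iterator threw ConcurrentModificationException.
  static boolean failsWhenModifiedDuringIteration(NavigableSet<Long> ids) {
    try {
      for (Long id : ids) {
        if (id == 1L) {
          ids.add(100L); // structural modification while iterating
        }
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // TreeSet iterators are fail-fast: the next call to next() throws.
    NavigableSet<Long> plain = new TreeSet<>(Arrays.asList(1L, 2L, 3L));
    System.out.println("TreeSet throws: "
        + failsWhenModifiedDuringIteration(plain));

    // ConcurrentSkipListSet iterators tolerate concurrent modification.
    NavigableSet<Long> concurrent =
        new ConcurrentSkipListSet<>(Arrays.asList(1L, 2L, 3L));
    System.out.println("ConcurrentSkipListSet throws: "
        + failsWhenModifiedDuringIteration(concurrent));
  }
}
```

Guarding both the readers and the writers of the set with the same lock, or
iterating over a copy taken under that lock, are other common ways to avoid
the exception; which of these (if any) applies to SCMContainerManager is not
stated in this report.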
was:
While performing a scale test, we had to start a new OM instance (with new
UUID) to talk to an existing SCM with 15 pipelines and lots of existing
containers. When we tried to write to the new OM instance, all writes failed
with the following error.
{code}
{code}
> Writes fail when a new OM instance (namespace) is configured to work with an
> existing SCM.
> ------------------------------------------------------------------------------------------
>
> Key: HDDS-1625
> URL: https://issues.apache.org/jira/browse/HDDS-1625
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Aravindan Vijayan
> Priority: Critical
> Fix For: 0.5.0