[
https://issues.apache.org/jira/browse/HDDS-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834389#comment-16834389
]
Mukul Kumar Singh commented on HDDS-1451:
-----------------------------------------
[~avijayan], yes, you are absolutely correct. The problem is exactly the one you
have described. Pipelines can be created between the check for an existing
pipeline and the allocation of a new one, and the pipeline creation then
fails.
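
To make the race concrete, here is a minimal sketch of the check-then-create
step done under a single lock. It is a simplified, hypothetical model and not
the actual SCM code: the class PipelineAllocationSketch, its allocate() and
run() methods, and the node accounting are all assumptions made for
illustration. The numbers are chosen to mirror the log quoted below, where six
pipelines of factor THREE are created out of 20 nodes and the next creation
fails with only 2 nodes left.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified model of pipeline allocation; not the real SCM classes. */
public class PipelineAllocationSketch {
  static final int FACTOR = 3;

  private final ReentrantLock lock = new ReentrantLock();
  private final List<String> openPipelines = new ArrayList<>();
  private int freeNodes;

  PipelineAllocationSketch(int nodes) {
    this.freeNodes = nodes;
  }

  /** Check and create under one lock, so no other caller can interleave. */
  String allocate() {
    lock.lock();
    try {
      if (!openPipelines.isEmpty()) {
        return openPipelines.get(0); // reuse the existing pipeline
      }
      if (freeNodes < FACTOR) {
        throw new IllegalStateException(
            "Cannot create pipeline of factor " + FACTOR + " using " + freeNodes + " nodes");
      }
      freeNodes -= FACTOR;
      String id = "pipeline-" + (openPipelines.size() + 1);
      openPipelines.add(id);
      return id;
    } finally {
      lock.unlock();
    }
  }

  int pipelineCount() {
    lock.lock();
    try {
      return openPipelines.size();
    } finally {
      lock.unlock();
    }
  }

  /** Runs concurrent allocations; returns {pipelines created, failed allocations}. */
  static int[] run(int nodes, int callers) {
    PipelineAllocationSketch scm = new PipelineAllocationSketch(nodes);
    AtomicInteger failures = new AtomicInteger();
    CountDownLatch start = new CountDownLatch(1);
    ExecutorService pool = Executors.newFixedThreadPool(callers);
    for (int i = 0; i < callers; i++) {
      pool.execute(() -> {
        try {
          start.await(); // release all callers at once to provoke contention
          scm.allocate();
        } catch (IllegalStateException e) {
          failures.incrementAndGet();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return new int[] {scm.pipelineCount(), failures.get()};
  }

  public static void main(String[] args) {
    int[] result = run(20, 50);
    System.out.println("pipelines=" + result[0] + " failures=" + result[1]);
  }
}
```

With the lock held across both steps, the first caller creates the pipeline
and every later caller reuses it, so the sketch ends with one pipeline and no
failed allocations. If the check and the create were done without the lock,
several callers could each see "no pipeline" and each consume three nodes,
which is the pattern the log suggests. Whether one coarse lock is acceptable
for the real allocateBlock path is a separate question that this sketch does
not answer.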
> SCMBlockManager findPipeline and createPipeline are not lock protected
> ----------------------------------------------------------------------
>
> Key: HDDS-1451
> URL: https://issues.apache.org/jira/browse/HDDS-1451
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Affects Versions: 0.3.0
> Reporter: Mukul Kumar Singh
> Assignee: Aravindan Vijayan
> Priority: Major
> Labels: MiniOzoneChaosCluster
>
> SCM BlockManager may try to allocate pipelines in cases where it is not
> needed. This happens because BlockManagerImpl#allocateBlock is not lock
> protected, so multiple pipelines can be allocated from it. One of the
> pipeline allocations can fail even though a usable pipeline already
> exists.
> {code}
> 2019-04-22 22:34:14,336 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$create$1(103)) - pipeline Pipeline[ Id:
> 6f4bb2d7-d660-4f9f-bc06-72b10f9a738e, Nodes: 76e1a493-fd55-4d67-9f5
> 5-c04fd6bd3a33{ip: 192.168.0.104, host: 192.168.0.104, certSerialId:
> null}2b9850b2-aed3-4a40-91b5-2447dc5246bf{ip: 192.168.0.104, host:
> 192.168.0.104, certSerialId: null}12248721-ea6a-453f-8dad-fc7fbe692f
> d2{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS,
> Factor:THREE, State:OPEN]
> 2019-04-22 22:34:14,386 INFO impl.RoleInfo
> (RoleInfo.java:shutdownLeaderElection(134)) -
> e17b7852-4691-40c7-8791-ad0b0da5201f: shutdown LeaderElection
> 2019-04-22 22:34:14,388 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$create$1(103)) - pipeline Pipeline[ Id:
> 552e28f3-98d9-41f3-86e0-c1b9494838a5, Nodes: e17b7852-4691-40c7-879
> 1-ad0b0da5201f{ip: 192.168.0.104, host: 192.168.0.104, certSerialId:
> null}fd365bac-e26e-4b11-afd8-9d08cd1b0521{ip: 192.168.0.104, host:
> 192.168.0.104, certSerialId: null}9583a007-7f02-4074-9e26-19bc18e29e
> c5{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS,
> Factor:THREE, State:OPEN]
> 2019-04-22 22:34:14,388 INFO impl.RoleInfo (RoleInfo.java:updateAndGet(143))
> - e17b7852-4691-40c7-8791-ad0b0da5201f: start FollowerState
> 2019-04-22 22:34:14,388 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$create$1(103)) - pipeline Pipeline[ Id:
> 5383151b-d625-4362-a7dd-c0d353acaf76, Nodes: 80f16ad6-3879-4a64-a3c
> 7-7719813cc139{ip: 192.168.0.104, host: 192.168.0.104, certSerialId:
> null}082ce481-7fb0-4f88-ac21-82609290a6a2{ip: 192.168.0.104, host:
> 192.168.0.104, certSerialId: null}dd5f5a70-0217-4577-b7a2-c42aa139d1
> 8a{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS,
> Factor:THREE, State:OPEN]
> 2019-04-22 22:34:14,389 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$create$1(103)) - pipeline Pipeline[ Id:
> be4854e5-7933-4caa-b32e-f482cf500247, Nodes: 6e2356f1-479d-498b-876
> a-1c90623c498b{ip: 192.168.0.104, host: 192.168.0.104, certSerialId:
> null}8ac46d94-9975-4eea-9448-2618c69d7bf3{ip: 192.168.0.104, host:
> 192.168.0.104, certSerialId: null}a3ed36a1-44ca-47b2-b9b3-5aeef04595
> 18{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS,
> Factor:THREE, State:OPEN]
> 2019-04-22 22:34:14,390 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$create$1(103)) - pipeline Pipeline[ Id:
> 21e368e2-f82a-4c61-9cc3-06e8de22ea6b, Nodes:
> 82632040-5754-4122-b187-331879586842{ip: 192.168.0.104, host: 192.168.0.104,
> certSerialId: null}923c8537-b869-4085-adcb-0a9accdcd089{ip: 192.168.0.104,
> host: 192.168.0.104, certSerialId:
> null}c6d790bf-e3a6-4064-acb5-f74796cd38a9{ip: 192.168.0.104, host:
> 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
> 2019-04-22 22:34:14,390 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$create$1(103)) - pipeline Pipeline[ Id:
> cccbc2ed-e0e2-4578-a8a2-94f4b645be52, Nodes:
> 91ae6848-a778-43be-a4a1-5855f7adc0d8{ip: 192.168.0.104, host: 192.168.0.104,
> certSerialId: null}8f330a03-40e2-4bd1-9b43-5e05b13d89f0{ip: 192.168.0.104,
> host: 192.168.0.104, certSerialId:
> null}4f3070dc-650b-48d7-87b5-d2076104e7b4{ip: 192.168.0.104, host:
> 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
> 2019-04-22 22:34:14,392 ERROR block.BlockManagerImpl
> (BlockManagerImpl.java:allocateBlock(192)) - Pipeline creation failed for
> type:RATIS factor:THREE
> org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot
> create pipeline of factor 3 using 2 nodes 20 healthy nodes 20 all nodes.
> at
> org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider.create(RatisPipelineProvider.java:122)
> at
> org.apache.hadoop.hdds.scm.pipeline.PipelineFactory.create(PipelineFactory.java:57)
> at
> org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.createPipeline(SCMPipelineManager.java:148)
> at
> org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:190)
> at
> org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:172)
> at
> org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:82)
> at
> org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:7533)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> 2019-04-22 22:34:14,395 ERROR block.BlockManagerImpl
> (BlockManagerImpl.java:allocateBlock(213)) - Unable to allocate a block for
> the size: 16384, type: RATIS, factor: THREE
> {code}