[
https://issues.apache.org/jira/browse/HDDS-5785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mukul Kumar Singh resolved HDDS-5785.
-------------------------------------
Fix Version/s: 1.2.0
Resolution: Fixed
> containerCreationLock is a severe bottleneck when there are many containers
> to create
> -------------------------------------------------------------------------------------
>
> Key: HDDS-5785
> URL: https://issues.apache.org/jira/browse/HDDS-5785
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Sammi Chen
> Assignee: Sammi Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.2.0
>
>
> Here is the trace log of the tool Arthas. Most of the execution time is
> contributed by lock aquire. Please be noted that these logs are captured from
> different call stacks, the data only shows the trend.
> `---ts=2021-09-24
> 21:41:05;thread_name=201a016a-2966-40b3-b289-12413f6197b4@group-1998503DB7A6-SegmentedRaftLogWorker;id=157;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@6659c656
> `---[3193.757119ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$WriteLog:execute()
> +---[0.00279ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$800()
> #505
> +---[0.001601ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$StateMachineDataPolicy:isSync()
> #505
> +---[0.002194ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$800()
> #506
> +---[3193.626701ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker$StateMachineDataPolicy:getFromFuture()
> #506
> +---[0.003975ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$200()
> #509
> +---[0.004119ms]
> org.apache.ratis.server.metrics.SegmentedRaftLogMetrics:onRaftLogAppendEntry()
> #509
> +---[0.002571ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$900()
> #510
> +---[0.001707ms] org.apache.ratis.util.Preconditions:assertTrue() #510
> +---[0.002315ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$1000()
> #511
> +---[0.002079ms] org.apache.ratis.proto.RaftProtos$LogEntryProto:getIndex()
> #511
> +---[0.001338ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$1000()
> #512
> +---[0.00187ms] org.apache.ratis.util.Preconditions:assertTrue() #511
> +---[0.003476ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$900()
> #513
> +---[0.014489ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogOutputStream:write()
> #513
> +---[0.001518ms] org.apache.ratis.proto.RaftProtos$LogEntryProto:getIndex()
> #514
> +---[0.001904ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$1002()
> #514
> +---[0.001927ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$1108()
> #515
> `---[0.002058ms]
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:access$1200()
> #516
> `---ts=2021-09-26
> 10:47:31;thread_name=ChunkWriter-80-0;id=d7;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@6659c656
> `---[61594.581776ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:dispatchRequest()
> +---[0.001458ms] com.google.common.base.Preconditions:checkNotNull() #180
> +---[8.67E-4ms] org.slf4j.Logger:isTraceEnabled() #181
> +---[9.84E-4ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getCmdType()
> #187
> +---[0.001111ms]
> org.apache.hadoop.ozone.container.common.helpers.ContainerCommandRequestPBHelper:getAuditAction()
> #186
> +---[0.001595ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:getEventType()
> #188
> +---[0.002115ms]
> org.apache.hadoop.ozone.container.common.helpers.ContainerCommandRequestPBHelper:getAuditParams()
> #190
> +---[8.22E-4ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getCmdType()
> #195
> +---[9.09E-4ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getContainerID()
> #196
> +---[0.002376ms]
> org.apache.hadoop.ozone.container.common.helpers.ContainerMetrics:incContainerOpsMetrics()
> #197
> +---[0.003211ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:getContainer()
> #198
> +---[0.001122ms]
> org.apache.hadoop.ozone.container.common.transport.server.ratis.DispatcherContext:getStage()
> #201
> +---[8.56E-4ms]
> org.apache.hadoop.ozone.container.common.transport.server.ratis.DispatcherContext:getStage()
> #205
> +---[0.010324ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:validateToken()
> #209
> +---[8.47E-4ms]
> org.apache.hadoop.ozone.container.common.transport.server.ratis.DispatcherContext:getStage()
> #220
> +---[8.62E-4ms]
> org.apache.hadoop.ozone.container.common.transport.server.ratis.DispatcherContext:getContainer2BCSIDMap()
> #224
> +---[0.00112ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:getMissingContainerSet()
> #240
> +---[61593.306048ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:createContainer()
> #258
> +---[0.001499ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandResponseProto:getResult()
> #259
> +---[0.001059ms] com.google.common.base.Preconditions:checkArgument() #266
> +---[0.002173ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:getContainer()
> #273
> +---[0.003009ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:getContainerType()
> #284
> +---[0.001432ms] org.apache.hadoop.hdds.HddsUtils:isReadOnly() #295
> +---[0.002478ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:sendCloseContainerActionIfNeeded()
> #296
> +---[0.001382ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:getHandler() #298
> +---[1.169763ms]
> org.apache.hadoop.ozone.container.common.interfaces.Handler:handle() #307
> +---[0.001698ms]
> org.apache.hadoop.ozone.container.common.helpers.ContainerMetrics:incContainerOpsLatencies()
> #309
> +---[0.00119ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandResponseProto:getResult()
> #323
> +---[0.001039ms] org.apache.hadoop.hdds.HddsUtils:isReadOnly() #329
> +---[0.001083ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:canIgnoreException()
> #329
> +---[0.001669ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:updateBCSID()
> #364
> `---[0.001459ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:audit() #365
> `---ts=2021-09-26
> 10:57:09;thread_name=ChunkWriter-63-0;id=c6;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@6659c656
> `---[40513.739685ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:createContainer()
> +---[0.001661ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$CreateContainerRequestProto:newBuilder()
> #406
> +---[0.001753ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$CreateContainerRequestProto$Builder:setContainerType()
> #409
> +---[0.002695ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:newBuilder()
> #412
> +---[0.002142ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto$Builder:setCmdType()
> #413
> +---[0.001441ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getContainerID()
> #414
> +---[0.001299ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto$Builder:setContainerID()
> #414
> +---[0.003278ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$CreateContainerRequestProto$Builder:build()
> #415
> +---[0.001737ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto$Builder:setCreateContainer()
> #415
> +---[0.001148ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getPipelineID()
> #416
> +---[0.001226ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto$Builder:setPipelineID()
> #416
> +---[0.002452ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getDatanodeUuid()
> #417
> +---[0.001146ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto$Builder:setDatanodeUuid()
> #417
> +---[0.001105ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getTraceID()
> #418
> +---[0.001183ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto$Builder:setTraceID()
> #418
> +---[0.00189ms]
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher:getHandler() #422
> +---[0.001769ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto$Builder:build()
> #423
> `---[40513.674318ms]
> org.apache.hadoop.ozone.container.common.interfaces.Handler:handle() #423
> `---ts=2021-09-26
> 11:00:18;thread_name=ChunkWriter-10-0;id=91;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@6659c656
> `---[52072.335521ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler:handle()
> `---[52072.328455ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler:dispatchRequest()
> #170
> `---ts=2021-09-26
> 11:02:09;thread_name=ChunkWriter-60-0;id=c3;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@6659c656
> `---[51435.670861ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler:dispatchRequest()
> +---[8.21E-4ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getCmdType()
> #178
> +---[4.32E-4ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$Type:ordinal()
> #180
> `---[51435.663218ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler:handleCreateContainer()
> #182
> `---ts=2021-09-26
> 11:03:50;thread_name=ChunkWriter-33-0;id=a8;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@6659c656
> `---[13057.606294ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler:handleCreateContainer()
> +---[0.001429ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:hasCreateContainer()
> #239
> +---[0.001533ms] com.google.common.base.Preconditions:checkArgument() #248
> +---[8.95E-4ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getContainerID()
> #250
> +---[0.002883ms]
> org.apache.hadoop.ozone.container.common.impl.ChunkLayOutVersion:getConfiguredVersion()
> #253
> +---[8.89E-4ms]
> org.apache.hadoop.hdds.protocol.datanode.proto.ContainerProtos$ContainerCommandRequestProto:getPipelineID()
> #255
> +---[0.001117ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler:getDatanodeId()
> #256
> +---[0.002143ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueContainerData:<init>() #256
> +---[0.002526ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer:<init>() #259
> +---[12661.234468ms] org.apache.hadoop.util.AutoCloseableLock:acquire() #263
> +---[0.004718ms]
> org.apache.hadoop.ozone.container.common.impl.ContainerSet:getContainer() #264
> +---[396.290691ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer:create() #265
> +---[0.006224ms]
> org.apache.hadoop.ozone.container.common.impl.ContainerSet:addContainer() #266
> +---[0.006072ms] org.apache.hadoop.util.AutoCloseableLock:close() #273
> +---[0.009352ms]
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler:sendICR() #279
> `---[0.002687ms]
> org.apache.hadoop.hdds.scm.protocolPB.ContainerCommandResponseBuilders:getSuccessResponse()
> #284
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]