[ https://issues.apache.org/jira/browse/HDDS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16852264#comment-16852264 ]
Mukul Kumar Singh commented on HDDS-1613:
-----------------------------------------
This problem occurs because, in the current container cache, when eviction is
requested for an entry that still has an outstanding reference, the entry is
removed from the map while the reference to the RocksDB instance is still
held. That reference also still holds the RocksDB lock.
When another consumer then tries to fetch the RocksDB instance from the cache,
it does not find the entry, so it tries to acquire the lock itself. This step
fails because the other reference is still holding the lock.
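
The sequence above can be reproduced in a small, single-threaded sketch. This
is not the Ozone ContainerCache code: the class names, the LOCKED set standing
in for the RocksDB lock, and the evictReferenced flag are all hypothetical and
exist only for illustration. The guarded variant (refusing to evict an entry
that is still referenced) is one possible remedy, not necessarily the fix that
was applied for this issue.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ContainerCacheSketch {
  // Hypothetical stand-in for the exclusive lock RocksDB takes when a DB is opened.
  static final Set<String> LOCKED = new HashSet<>();

  // Hypothetical DB handle: takes the lock on open, releases it on close.
  static final class DbHandle {
    final String path;
    int refCount;

    DbHandle(String path) {
      if (!LOCKED.add(path)) {
        throw new IllegalStateException("No locks available: " + path);
      }
      this.path = path;
    }

    void close() {
      LOCKED.remove(path);
    }
  }

  static final class Cache {
    private final Map<String, DbHandle> map = new HashMap<>();
    // true models the behaviour described above; false is the guarded variant.
    private final boolean evictReferenced;

    Cache(boolean evictReferenced) {
      this.evictReferenced = evictReferenced;
    }

    DbHandle acquire(String path) {
      DbHandle handle = map.get(path);
      if (handle == null) {
        handle = new DbHandle(path); // cache miss: open the DB, which takes the lock
        map.put(path, handle);
      }
      handle.refCount++;
      return handle;
    }

    void evict(String path) {
      DbHandle handle = map.get(path);
      if (handle == null) {
        return;
      }
      if (handle.refCount > 0) {
        if (evictReferenced) {
          map.remove(path); // entry dropped, but the lock is still held
        }
        return;
      }
      map.remove(path);
      handle.close(); // no references left: safe to release the lock
    }
  }

  // Returns true if a second consumer can still get the DB after an eviction
  // request that arrives while the first consumer holds a reference.
  static boolean reopenAfterEviction(boolean evictReferenced) {
    LOCKED.clear();
    Cache cache = new Cache(evictReferenced);
    cache.acquire("container-68"); // first consumer keeps its reference
    cache.evict("container-68");
    try {
      cache.acquire("container-68"); // second consumer
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("evicting referenced entry, reopen ok: " + reopenAfterEviction(true));
    System.out.println("keeping referenced entry, reopen ok: " + reopenAfterEviction(false));
  }
}
```

With evictReferenced set to true the second acquire misses the cache and fails
on the lock, matching the failure described above; with it set to false the
entry stays in the map and the second consumer shares the existing handle.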
> Opening of rocksDB in datanode fails with "No locks available"
> --------------------------------------------------------------
>
> Key: HDDS-1613
> URL: https://issues.apache.org/jira/browse/HDDS-1613
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client
> Affects Versions: 0.4.0
> Reporter: Mukul Kumar Singh
> Assignee: Mukul Kumar Singh
> Priority: Major
> Labels: MiniOzoneChaosCluster
>
> Block read fails with
> {code}
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: Unable to find the block with bcsID 11777 .Container 68 bcsId is 0.
> at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:573)
> at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.getBlock(ContainerProtocolCalls.java:120)
> at org.apache.hadoop.ozone.client.io.KeyInputStream$ChunkInputStreamEntry.initializeBlockInputStream(KeyInputStream.java:295)
> at org.apache.hadoop.ozone.client.io.KeyInputStream$ChunkInputStreamEntry.getStream(KeyInputStream.java:265)
> at org.apache.hadoop.ozone.client.io.KeyInputStream$ChunkInputStreamEntry.access$000(KeyInputStream.java:229)
> at org.apache.hadoop.ozone.client.io.KeyInputStream.getStreamEntry(KeyInputStream.java:107)
> at org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:140)
> at org.apache.hadoop.ozone.client.io.OzoneInputStream.read(OzoneInputStream.java:47)
> at java.io.InputStream.read(InputStream.java:101)
> at org.apache.hadoop.ozone.MiniOzoneLoadGenerator.load(MiniOzoneLoadGenerator.java:114)
> at org.apache.hadoop.ozone.MiniOzoneLoadGenerator.lambda$startIO$0(MiniOzoneLoadGenerator.java:147)
> at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Looking at the 3 datanodes, the bcsId values for the container are 11748,
> 11748 and 0.
> {code}
> 2019-05-30 08:28:05,348 INFO keyvalue.KeyValueHandler
> (ContainerUtils.java:logAndReturnError(146)) - Operation: GetBlock : Trace
> ID: 93a2a596076d2ee4:93a2a596076d2ee4:0:0 : Message: Unable to find the block
> with bcsID 11777 .Container 68 bcsId is 11748. : Result: UNKNOWN_BCSID
> 2019-05-30 08:28:05,363 INFO keyvalue.KeyValueHandler
> (ContainerUtils.java:logAndReturnError(146)) - Operation: GetBlock : Trace
> ID: 93a2a596076d2ee4:93a2a596076d2ee4:0:0 : Message: Unable to find the block
> with bcsID 11777 .Container 68 bcsId is 11748. : Result: UNKNOWN_BCSID
> 2019-05-30 08:28:05,377 INFO keyvalue.KeyValueHandler
> (ContainerUtils.java:logAndReturnError(146)) - Operation: GetBlock : Trace
> ID: 93a2a596076d2ee4:93a2a596076d2ee4:0:0 : Message: Unable to find the block
> with bcsID 11777 .Container 68 bcsId is 0. : Result: UNKNOWN_BCSID
> {code}