[ 
https://issues.apache.org/jira/browse/HDDS-11308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Juncevich reassigned HDDS-11308:
-------------------------------------

    Assignee:     (was: Alex Juncevich)

> ClassCastException during simultaneous putBlock operations on an EC container
> -----------------------------------------------------------------------------
>
>                 Key: HDDS-11308
>                 URL: https://issues.apache.org/jira/browse/HDDS-11308
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: EC, Ozone Datanode
>            Reporter: Ethan Rose
>            Priority: Major
>
> h2. Issue
> While writing EC data the following exception was observed:
> {code:java}
> 2024-07-30 10:42:22,547 WARN 
> [ChunkReader-8]-org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler: 
> Operation: PutBlock , Trace ID:  , Message: java.lang.ClassCastException: 
> java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode , Result: 
> CONTAINER_INTERNAL_ERROR , StorageContainerException Occurred.
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode
>       at 
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:228)
>       at 
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:328)
>       at 
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.lambda$dispatch$0(HddsDispatcher.java:176)
>       at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
>       at 
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:175)
>       at 
> org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:57)
>       at 
> org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:50)
>       at 
> org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:262)
>       at 
> org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
>       at 
> org.apache.hadoop.hdds.tracing.GrpcServerInterceptor$1.onMessage(GrpcServerInterceptor.java:49)
>       at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:333)
>       at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:316)
>       at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:835)
>       at 
> org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>       at 
> org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:750)
> Caused by: java.lang.ClassCastException: java.util.HashMap$Node cannot be 
> cast to java.util.HashMap$TreeNode
>       at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1859)
>       at java.util.HashMap$TreeNode.putTreeVal(HashMap.java:2038)
>       at java.util.HashMap.putVal(HashMap.java:639)
>       at java.util.HashMap.put(HashMap.java:613)
>       at java.util.HashSet.add(HashSet.java:220)
>       at 
> org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.addToPendingPutBlockCache(KeyValueContainer.java:831)
>       at 
> org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.persistPutBlock(BlockManagerImpl.java:209)
>       at 
> org.apache.hadoop.ozone.container.keyvalue.impl.BlockManagerImpl.putBlock(BlockManagerImpl.java:103)
>       at 
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handlePutBlock(KeyValueHandler.java:551)
>       at 
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(KeyValueHandler.java:254)
>       at 
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:225)
>       ... 17 more
> 2024-07-30 10:42:22,550 WARN 
> [ChunkReader-8]-org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer: 
> Moving container /data/...current/containerDir189/227982 to state UNHEALTHY 
> from state:OPEN
> {code}
> Container 227982 is an EC container, and Ozone was later able to restore all 
> healthy replicas using offline reconstruction.
> h2. Cause
> There is a putBlock cache that tracks updates to the block count metadata of 
> an open container. The cache is not thread safe, so when two threads access 
> it at the same time we may get undefined behavior, such as the exception 
> shown above.
> h3. Explanation
> The cache is implemented as a non-concurrent {{{}HashSet{}}}. The [code 
> comments acknowledge 
> this|https://github.com/apache/ozone/blob/98369a8343f12479af99db4c83696945d40c1e3d/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/KeyValueContainer.java#L112]
>  and assume that only one putBlock per container runs at a time. The 
> metadata updates done by putBlock rely on the same assumption: they 
> increment counters in RocksDB with a read-modify-write pattern, so two 
> simultaneous updates can leave the counters with incorrect values (a 
> time-of-check/time-of-use race; see HDDS-8129, which causes issues like 
> HDDS-5359).
> These assumptions were probably in place before EC was added. For Ratis, the 
> {{ContainerStateMachine}} enforces that only one putBlock per container will 
> run at a time. For EC, however, there is no such enforcement. EC write 
> requests go through {{XceiverServerGrpc}} to {{GrpcXceiverService}} where 
> they are handled by a [thread 
> pool|https://github.com/apache/ozone/blob/9ba4a73f46156cae31c6c074290549fdb95e6f2e/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/XceiverServerGrpc.java#L149]
>  that assumes it is only handling reads. If two EC clients issue putBlock for 
> different blocks in the same container, we may see:
>  # Incorrect metadata counters.
>  # Exceptions like the one above that mark the container unhealthy and fail 
> the write.
> Based on inspection of the {{HashSet}} code, I believe this specific 
> exception will only occur if the two concurrent putBlock IDs hash to the 
> same bin in the set. This explains why the issue has existed for about two 
> years but has only now been observed for the first time.
>  
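The non-thread-safe HashSet cache described in the Cause section above could be replaced with a concurrent set. The sketch below is illustrative only, not actual Ozone code (the class and method names are hypothetical); it shows that ConcurrentHashMap.newKeySet() provides a drop-in thread-safe Set that tolerates simultaneous adds from multiple handler threads, which a plain HashSet does not:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class PendingPutBlockCacheSketch {
    // Hypothetical stand-in for the pending putBlock cache. Unlike a plain
    // HashSet, ConcurrentHashMap.newKeySet() is safe for concurrent add/remove.
    private static final Set<Long> pendingPutBlockCache =
        ConcurrentHashMap.newKeySet();

    // Simulate nThreads handler threads each recording idsPerThread distinct
    // block IDs, as if concurrent putBlock calls hit the same container.
    static int concurrentAdds(int nThreads, int idsPerThread) {
        Thread[] threads = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int base = t * idsPerThread;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    // Safe under concurrency; no internal-structure corruption.
                    pendingPutBlockCache.add((long) (base + i));
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return pendingPutBlockCache.size();
    }

    public static void main(String[] args) {
        // All adds land; no ClassCastException from a corrupted hash bin.
        System.out.println("cache size = " + concurrentAdds(8, 1000));
    }
}
```

With the original HashSet, the same concurrent-add pattern can corrupt the bin structure and surface as the HashMap$TreeNode ClassCastException in the trace above.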


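The time-of-check/time-of-use problem with the per-container counters can be sketched similarly. Again this is an illustrative example with hypothetical names, not Ozone's actual RocksDB update path: a plain read-increment-write on a shared counter loses updates under concurrency, whereas an AtomicLong per container makes each increment a single atomic operation:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class BlockCountSketch {
    // Hypothetical in-memory block counter per container ID. incrementAndGet()
    // is atomic, so concurrent putBlock handlers cannot lose updates the way a
    // separate "read count, add one, write count back" sequence can.
    private static final ConcurrentHashMap<Long, AtomicLong> blockCounts =
        new ConcurrentHashMap<>();

    static long incrementBlockCount(long containerId) {
        return blockCounts
            .computeIfAbsent(containerId, id -> new AtomicLong())
            .incrementAndGet();
    }

    // Simulate nThreads concurrent writers each doing perThread putBlock-style
    // increments against the same container.
    static long concurrentIncrements(long containerId, int nThreads, int perThread) {
        Thread[] threads = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    incrementBlockCount(containerId);
                }
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return blockCounts.get(containerId).get();
    }

    public static void main(String[] args) {
        // Every increment is retained; no lost updates.
        System.out.println(concurrentIncrements(227982L, 4, 1000));
    }
}
```

An alternative fix, closer to what ContainerStateMachine already gives Ratis containers, would be to serialize putBlock per container with a lock rather than make each structure individually thread safe.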

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
