[ https://issues.apache.org/jira/browse/HDDS-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HDDS-6614:
---------------------------------
Labels: pull-request-available (was: )
> EC: Fix Datanode block file INCONSISTENCY during heavy load.
> ------------------------------------------------------------
>
> Key: HDDS-6614
> URL: https://issues.apache.org/jira/browse/HDDS-6614
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Mark Gui
> Assignee: Mark Gui
> Priority: Major
> Labels: pull-request-available
> Attachments: FileChannelSize.java
>
>
> We are stressing an Ozone cluster with heavy EC writes and reads, and we got
> exactly one exception, from a single DN:
> {code:java}
> 2022-04-14 15:02:04,422 [ChunkReader-46] INFO org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler: Operation: WriteChunk , Trace ID: , Message: Chunk file offset 1048576 does not match blockFile length 0 , Result: CHUNK_FILE_INCONSISTENCY , StorageContainerException Occurred.
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: Chunk file offset 1048576 does not match blockFile length 0
>     at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.validateChunkSize(ChunkUtils.java:391)
>     at org.apache.hadoop.ozone.container.keyvalue.impl.FilePerBlockStrategy.writeChunk(FilePerBlockStrategy.java:140)
>     at org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerDispatcher.writeChunk(ChunkManagerDispatcher.java:73)
>     at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleWriteChunk(KeyValueHandler.java:705)
>     at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(KeyValueHandler.java:224)
>     at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:188)
>     at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:307)
>     at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.lambda$dispatch$14(HddsDispatcher.java:169)
>     at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
>     at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:168)
>     at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:57)
>     at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:50)
>     at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:255)
>     at org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
>     at org.apache.hadoop.hdds.tracing.GrpcServerInterceptor$1.onMessage(GrpcServerInterceptor.java:49)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
>     at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>     at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2022-04-14 15:02:04,441 [ChunkReader-46] WARN org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer: Moving container /data20/ozone/hdds/CID-ea6bec06-3f65-4721-a884-b3b9293f6a55/current/containerDir1/614 to state UNHEALTHY from state:UNHEALTHY
> Trace:java.lang.Thread.getStackTrace(Thread.java:1559) {code}
> And this results in an UNHEALTHY container replica:
> {code:java}
> [ozoneadmin@TENCENT64 ~/ozone-1.3.0-SNAPSHOT]$ ./bin/ozone admin container info 614
> Container id: 614
> Pipeline id: bddd5494-af98-46b2-a4c3-6757f129d3d2
> Container State: CLOSED
> Datanodes: [...]
> Replicas: [State: CLOSED; ReplicaIndex: 7; Origin: d4a6d2fb-3cc3-4e31-a9a6-7c3cda56c7ad; Location: d4a6d2fb-3cc3-4e31-a9a6-7c3cda56c7ad/9.27.141.78,
> State: CLOSED; ReplicaIndex: 10; Origin: 45784b6a-06d7-4d76-b53e-1bc10a7e22ea; Location: 45784b6a-06d7-4d76-b53e-1bc10a7e22ea/9.27.142.97,
> State: CLOSED; ReplicaIndex: 9; Origin: 65024339-4831-43c0-9fc0-752f8999d9c4; Location: 65024339-4831-43c0-9fc0-752f8999d9c4/9.27.140.135,
> State: CLOSED; ReplicaIndex: 8; Origin: e963db9e-44ea-425c-92e3-bdb5305c3049; Location: e963db9e-44ea-425c-92e3-bdb5305c3049/9.27.142.51,
> State: CLOSED; ReplicaIndex: 6; Origin: b6115b17-827c-484f-b61a-5f5849c5dbdb; Location: b6115b17-827c-484f-b61a-5f5849c5dbdb/9.27.141.121,
> State: UNHEALTHY; ReplicaIndex: 12; Origin: ae31f944-b6c9-4e18-b677-0b0edb2a8e13; Location: ae31f944-b6c9-4e18-b677-0b0edb2a8e13/9.27.141.78,
> State: CLOSED; ReplicaIndex: 2; Origin: 4a6af5ae-097a-4daa-9390-b0fbe87446f1; Location: 4a6af5ae-097a-4daa-9390-b0fbe87446f1/9.27.144.159,
> State: CLOSED; ReplicaIndex: 3; Origin: 3c3bc79e-3b5f-4f73-b502-7778a7993db6; Location: 3c3bc79e-3b5f-4f73-b502-7778a7993db6/9.27.140.38,
> State: CLOSED; ReplicaIndex: 14; Origin: 37db964c-0663-4b7c-915c-eaffccc32f55; Location: 37db964c-0663-4b7c-915c-eaffccc32f55/9.27.142.51,
> State: CLOSED; ReplicaIndex: 11; Origin: dc262c7c-184e-4f66-af98-2ece21deeab1; Location: dc262c7c-184e-4f66-af98-2ece21deeab1/9.27.140.222,
> State: CLOSED; ReplicaIndex: 13; Origin: e7ea47c7-6d33-490a-9d3a-3c8231c81e30; Location: e7ea47c7-6d33-490a-9d3a-3c8231c81e30/9.27.144.169,
> State: CLOSED; ReplicaIndex: 1; Origin: 8333d554-2e39-4ef6-b73a-01539e82a9d1; Location: 8333d554-2e39-4ef6-b73a-01539e82a9d1/9.27.144.226,
> State: CLOSED; ReplicaIndex: 4; Origin: 674aa6fb-4f27-4358-b7a7-af2bbf2683ae; Location: 674aa6fb-4f27-4358-b7a7-af2bbf2683ae/9.27.141.230,
> State: CLOSED; ReplicaIndex: 5; Origin: 0d989a0e-3bb4-49ea-bacc-d468729bd9fa; Location: 0d989a0e-3bb4-49ea-bacc-d468729bd9fa/9.27.140.135] {code}
> From the log, we can see that a block file received a write request at
> offset 1048576 while the block file was of length 0.
> This should be impossible, because the EC client writes stripes one by one:
> it does not send the next stripe's chunks to the DNs before it gets success
> responses from the DNs for the previous stripe.
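> A rough sketch of that client-side invariant (illustrative only; the types
> and method names below are mine, not the real Ozone EC client internals):
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import java.util.concurrent.CompletableFuture;
>
> // Hypothetical stand-in for a per-datanode write client.
> interface DatanodeClient {
>   CompletableFuture<Void> writeChunkAsync(byte[] cell, long offset);
> }
>
> public class StripeWriterSketch {
>   // Send one stripe's cells to all DNs, then wait for every ack before
>   // advancing the offset: under this invariant a DN should never see a
>   // WriteChunk at offset N while its block file is still shorter than N.
>   static void writeStripes(List<DatanodeClient> dns, List<byte[][]> stripes,
>       int cellSize) {
>     long offset = 0;
>     for (byte[][] stripe : stripes) {
>       List<CompletableFuture<Void>> acks = new ArrayList<>();
>       for (int i = 0; i < dns.size(); i++) {
>         acks.add(dns.get(i).writeChunkAsync(stripe[i], offset));
>       }
>       CompletableFuture.allOf(acks.toArray(new CompletableFuture[0])).join();
>       offset += cellSize; // only start the next stripe after all acks
>     }
>   }
> } {code}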
> And I checked the container directory for the block file: it has length
> 1048576 exactly, not 0:
> {code:java}
> [ozoneadmin@TENCENT64 ~]$ ls -t /data20/ozone/hdds/CID-ea6bec06-3f65-4721-a884-b3b9293f6a55/current/containerDir1/614/chunks/ -l
> total 1468508
> -rw-rw-r-- 1 ozoneadmin ozoneadmin 40894464 Apr 14 15:02 109611004723334158.block
> -rw-rw-r-- 1 ozoneadmin ozoneadmin 41943040 Apr 14 15:02 109611004723334160.block
> -rw-rw-r-- 1 ozoneadmin ozoneadmin 39845888 Apr 14 15:02 109611004723334157.block
> -rw-rw-r-- 1 ozoneadmin ozoneadmin 1048576 Apr 14 15:02 109611004723334165.block <-- the target block file {code}
> So I suspect that the problem lies in the DN-side check itself.
> {code:java}
> public static void validateChunkSize(File chunkFile, ChunkInfo chunkInfo)
>     throws StorageContainerException {
>   long offset = chunkInfo.getOffset();
>   long len = chunkFile.length();
>   if (chunkFile.length() != offset) {
>     throw new StorageContainerException(
>         "Chunk file offset " + offset + " does not match blockFile length " +
>             len, CHUNK_FILE_INCONSISTENCY);
>   }
> } {code}
> Here we get the length of the existing block file with `chunkFile.length()`,
> which is suspicious: we write to block files through FileChannels, so I think
> we should get the file size from the channel too. `java.io.File.length()` is
> documented to return 0 (rather than throw) when the file "does not exist", so
> the check may see 0 even though data has already been written through the
> channel, e.g. when the channel hasn't created the block file on disk yet at
> the moment we call `chunkFile.length()`. This may be the cause of the
> problem.
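> One possible shape of the fix is to pass the already-open FileChannel into
> the check and use `channel.size()`, which reflects writes done through that
> channel and throws an IOException on failure instead of silently returning
> 0. A minimal sketch (assuming the caller can hand over the channel; this is
> not necessarily the final patch):
> {code:java}
> public static void validateChunkSize(FileChannel channel, ChunkInfo chunkInfo,
>     String fileName) throws StorageContainerException {
>   long offset = chunkInfo.getOffset();
>   long fileLen;
>   try {
>     // Size as seen through the open channel, instead of a fresh
>     // File.length() lookup that can spuriously report 0.
>     fileLen = channel.size();
>   } catch (IOException e) {
>     throw new StorageContainerException(
>         "Cannot get the file size of " + fileName, CHUNK_FILE_INCONSISTENCY);
>   }
>   if (fileLen != offset) {
>     throw new StorageContainerException(
>         "Chunk file offset " + offset + " does not match blockFile length " +
>             fileLen, CHUNK_FILE_INCONSISTENCY);
>   }
> } {code}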
> It is hard to reproduce this problem with simple tests; it only shows up
> under heavy load. But I did a small experiment to demonstrate the difference
> between `fileChannel.size()` and `file.length()`; the small program can be
> found in the attachments.
> From the experiment results, I can see that when `fileChannel.size()` = 1MB,
> we get `file.length()` = 0, which is exactly the case we hit here.
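> For reference, a minimal sketch along the same lines (my own reconstruction,
> not the attached FileChannelSize.java; the demo.block file name and the
> 1 MiB write are illustrative):
> {code:java}
> import java.io.File;
> import java.io.IOException;
> import java.nio.ByteBuffer;
> import java.nio.channels.FileChannel;
> import java.nio.file.StandardOpenOption;
>
> public class FileChannelSizeDemo {
>   public static void main(String[] args) throws IOException {
>     File file = new File("demo.block");
>     try (FileChannel channel = FileChannel.open(file.toPath(),
>         StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
>       channel.write(ByteBuffer.allocate(1024 * 1024)); // write 1 MiB
>       // Size as seen through the open channel; throws on failure.
>       System.out.println("channel.size() = " + channel.size());
>       // Fresh filesystem lookup; documented to return 0 if the file
>       // "does not exist", and it never throws.
>       System.out.println("file.length()  = " + file.length());
>     } finally {
>       file.delete();
>     }
>   }
> } {code}
> Under normal conditions the two values agree; per the results above, the
> 1MB-vs-0 divergence only shows up under heavy load.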
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]