[ 
https://issues.apache.org/jira/browse/HDDS-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18047593#comment-18047593
 ] 

Ivan Andika edited comment on HDDS-12007 at 12/26/25 9:52 AM:
--------------------------------------------------------------

> link(..) should still be called for recording an entry into the RaftLog. It 
> just that link() should not trigger PutBlock.

Got it. I tested it again and `sendForward` invokes the following StateMachine 
methods
 * ContainerStateMachine#link: This currently set the linked flag, which seems 
to prevent LocalStream#cleanUp to close the file, since it's already closed 
successfully through DataStreamManagement#close. However, this does not seem to 
protect the channel entirely since if the LocalStream#cleanUp is triggered 
between the StandardWriteOption.CLOSE and the ContainerStateMachine#link, the 
issue might still happen. To be honest, I don't understand the original intent 
of StateMachine#link and how it's supposed to be used so any feedback is 
appreciated.
 * ContainerStateMachine#applyTransaction: This will trigger 
KeyValueHandler#streamInit since the BlockDataStreamOutput#setupStream calls 
DataStreamApi#stream with ContainerCommandRequestProto with type StreamInit
 ** The KeyValueStreamHandler#streamInit only returns the chunk file path and 
does not seem to be used by the client
 ** Since isWriteStage, isWriteCommitStage, isCombinedStage are all false, the 
applyTransaction does not seem to do anything useful in Ozone context

sendForward will create a new Ratis transaction, but the following 
putBlockAsync also triggers a Ratis transaction.

IMO StreamInit does not seem to be useful and we can deprecate / remove it. If 
we want to send the last PutBlock in streaming and remove the async 
putBlockAsync, one idea is to pass a PutBlockRequest in the closeAsync instead 
of in the writeAsync in executePutBlockClose, so it will trigger 
StateMachine#link and also do useful works by executing PutBlock in 
ContaineStateMachine#applyTransaction (unlike StreamInit). Then we can remove 
the following putBlockAsync.

 

Attached is the debug logs I did
{code:java}
# Before closeAsync
2025-12-26 16:53:33,439 [Time-limited test] INFO  storage.BlockDataStreamOutput 
(BlockDataStreamOutput.java:executePutBlock(431)) - Close sent

# ContainerStateMachine#link
2025-12-26 16:53:33,475 [f9293f83-71ef-41e5-8f1b-a7e20c7ec39f-client-thread1] 
INFO  ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link 
detected
2025-12-26 16:53:33,479 [f5e6f34c-de7d-407c-9836-080132f32b53-server-thread1] 
INFO  ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link 
detected
2025-12-26 16:53:33,479 [331f2e49-0fad-4dea-99ac-336bd58aa221-server-thread1] 
INFO  ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link 
detected

# ContainerStateMachine#applyTransaction (cmdType: StreamInit)
2025-12-26 16:53:33,508 
[f9293f83-71ef-41e5-8f1b-a7e20c7ec39f@group-FF20505199F1-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(972)) - Container command request 
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
  blockID {
    containerID: 1
    localID: 115816896921600001
    blockCommitSequenceId: 0
    storageTypeID: 1
  }
}
version: 3

2025-12-26 16:53:33,508 
[331f2e49-0fad-4dea-99ac-336bd58aa221@group-FF20505199F1-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(972)) - Container command request 
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
  blockID {
    containerID: 1
    localID: 115816896921600001
    blockCommitSequenceId: 0
    storageTypeID: 1
  }
}
version: 3

2025-12-26 16:53:33,508 
[f5e6f34c-de7d-407c-9836-080132f32b53@group-FF20505199F1-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(972)) - Container command request 
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
  blockID {
    containerID: 1
    localID: 115816896921600001
    blockCommitSequenceId: 0
    storageTypeID: 1
  }
}
version: 3

# StreamInit stage is COMBINED with BCSID specified
2025-12-26 17:02:33,428 
[f791e008-7ea6-4821-abe3-88dcd27fbd45@group-19E545098233-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage 
COMBINED, BCSID map {1=0}
2025-12-26 17:02:33,428 
[34b47435-72e5-4a18-b028-9a1e90fe013d@group-19E545098233-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage 
COMBINED, BCSID map {1=0}
2025-12-26 17:02:33,428 
[20762959-d534-45ba-8915-0a4f356341dc@group-19E545098233-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage 
COMBINED, BCSID map {1=0}

2025-12-26 17:06:15,664 
[20fb6836-36e5-4eb4-9ea8-14707e46b1fb-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
 INFO  impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) - 
isWriteStage false, isWriteCommitStage false, isCombinedStage false
2025-12-26 17:06:15,665 
[7b96b7d4-dc30-4528-b9b7-a49c322fb44f-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
 INFO  impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) - 
isWriteStage false, isWriteCommitStage false, isCombinedStage false
2025-12-26 17:06:15,666 
[aa25337d-dcca-4a62-a657-031efd99e3f3-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
 INFO  impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) - 
isWriteStage false, isWriteCommitStage false, isCombinedStage false {code}
cc: [~XiChen] 


was (Author: JIRAUSER298977):
> link(..) should still be called for recording an entry into the RaftLog. It 
> just that link() should not trigger PutBlock.

Got it. I tested it again and `sendForward` invokes the following StateMachine 
methods
 * ContainerStateMachine#link: This currently set the linked flag, which seems 
to prevent LocalStream#cleanUp to close the file, since it's already closed 
successfully through DataStreamManagement#close. However, this does not seem to 
protect the channel entirely since if the LocalStream#cleanUp is triggered 
between the StandardWriteOption.CLOSE and the ContainerStateMachine#link, the 
issue might still happen.
 * ContainerStateMachine#applyTransaction: This will trigger 
KeyValueHandler#streamInit since the BlockDataStreamOutput#setupStream calls 
DataStreamApi#stream with ContainerCommandRequestProto with type StreamInit
 ** The KeyValueStreamHandler#streamInit only returns the chunk file path and 
does not seem to be used by the client
 ** Since isWriteStage, isWriteCommitStage, isCombinedStage are all false, the 
applyTransaction does not seem to do anything useful in Ozone context

sendForward will create a new Ratis transaction, but the following 
putBlockAsync also triggers a Ratis transaction.

IMO StreamInit does not seem to be useful and we can deprecate / remove it. If 
we want to send the last PutBlock in streaming and remove the async 
putBlockAsync, one idea is to pass a PutBlockRequest in the closeAsync instead 
of in the writeAsync in executePutBlockClose, so it will trigger 
StateMachine#link and also do useful works by executing PutBlock in 
ContaineStateMachine#applyTransaction (unlike StreamInit). Then we can remove 
the following putBlockAsync.

 

Attached is the debug logs I did
{code:java}
# Before closeAsync
2025-12-26 16:53:33,439 [Time-limited test] INFO  storage.BlockDataStreamOutput 
(BlockDataStreamOutput.java:executePutBlock(431)) - Close sent

# ContainerStateMachine#link
2025-12-26 16:53:33,475 [f9293f83-71ef-41e5-8f1b-a7e20c7ec39f-client-thread1] 
INFO  ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link 
detected
2025-12-26 16:53:33,479 [f5e6f34c-de7d-407c-9836-080132f32b53-server-thread1] 
INFO  ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link 
detected
2025-12-26 16:53:33,479 [331f2e49-0fad-4dea-99ac-336bd58aa221-server-thread1] 
INFO  ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link 
detected

# ContainerStateMachine#applyTransaction (cmdType: StreamInit)
2025-12-26 16:53:33,508 
[f9293f83-71ef-41e5-8f1b-a7e20c7ec39f@group-FF20505199F1-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(972)) - Container command request 
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
  blockID {
    containerID: 1
    localID: 115816896921600001
    blockCommitSequenceId: 0
    storageTypeID: 1
  }
}
version: 3

2025-12-26 16:53:33,508 
[331f2e49-0fad-4dea-99ac-336bd58aa221@group-FF20505199F1-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(972)) - Container command request 
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
  blockID {
    containerID: 1
    localID: 115816896921600001
    blockCommitSequenceId: 0
    storageTypeID: 1
  }
}
version: 3

2025-12-26 16:53:33,508 
[f5e6f34c-de7d-407c-9836-080132f32b53@group-FF20505199F1-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(972)) - Container command request 
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
  blockID {
    containerID: 1
    localID: 115816896921600001
    blockCommitSequenceId: 0
    storageTypeID: 1
  }
}
version: 3

# StreamInit stage is COMBINED with BCSID specified
2025-12-26 17:02:33,428 
[f791e008-7ea6-4821-abe3-88dcd27fbd45@group-19E545098233-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage 
COMBINED, BCSID map {1=0}
2025-12-26 17:02:33,428 
[34b47435-72e5-4a18-b028-9a1e90fe013d@group-19E545098233-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage 
COMBINED, BCSID map {1=0}
2025-12-26 17:02:33,428 
[20762959-d534-45ba-8915-0a4f356341dc@group-19E545098233-StateMachineUpdater] 
INFO  ratis.ContainerStateMachine 
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage 
COMBINED, BCSID map {1=0}

2025-12-26 17:06:15,664 
[20fb6836-36e5-4eb4-9ea8-14707e46b1fb-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
 INFO  impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) - 
isWriteStage false, isWriteCommitStage false, isCombinedStage false
2025-12-26 17:06:15,665 
[7b96b7d4-dc30-4528-b9b7-a49c322fb44f-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
 INFO  impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) - 
isWriteStage false, isWriteCommitStage false, isCombinedStage false
2025-12-26 17:06:15,666 
[aa25337d-dcca-4a62-a657-031efd99e3f3-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
 INFO  impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) - 
isWriteStage false, isWriteCommitStage false, isCombinedStage false {code}
cc: [~XiChen] 

> BlockDataStreamOutput should only send one PutBlock during close
> ----------------------------------------------------------------
>
>                 Key: HDDS-12007
>                 URL: https://issues.apache.org/jira/browse/HDDS-12007
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: Ozone Datanode
>            Reporter: Ivan Andika
>            Assignee: Tsz-wo Sze
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0.0
>
>
> Currently, during close two PutBlock request will be sent
>  * executePutBlockClose: This uses the DataStreamOutput#writeAsync with 
> StandardWriteOption.CLOSE as part of the HDDS-6500 improvements
>  ** This will call sendForward which will trigger ContainerStateMachine#link 
> which will be processed like PutBlock
>  * putBlockAsync: This is a normal PutBlock request which is executed per 
> block boundary (similar to Write Pipeline V1)
> We should only call executePutBlockClose during close. We can use 
> ClientProtoUtils#getRaftClientReply to convert from DataStreamReply to 
> RaftClientReply which we can use to derive the PutBlock response.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to