[
https://issues.apache.org/jira/browse/HDDS-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18047593#comment-18047593
]
Ivan Andika edited comment on HDDS-12007 at 12/26/25 9:52 AM:
--------------------------------------------------------------
> link(..) should still be called for recording an entry into the RaftLog. It
> just that link() should not trigger PutBlock.
Got it. I tested it again and `sendForward` invokes the following StateMachine
methods
* ContainerStateMachine#link: This currently set the linked flag, which seems
to prevent LocalStream#cleanUp to close the file, since it's already closed
successfully through DataStreamManagement#close. However, this does not seem to
protect the channel entirely since if the LocalStream#cleanUp is triggered
between the StandardWriteOption.CLOSE and the ContainerStateMachine#link, the
issue might still happen. To be honest, I don't understand the original intent
of StateMachine#link and how it's supposed to be used so any feedback is
appreciated.
* ContainerStateMachine#applyTransaction: This will trigger
KeyValueHandler#streamInit since the BlockDataStreamOutput#setupStream calls
DataStreamApi#stream with ContainerCommandRequestProto with type StreamInit
** The KeyValueStreamHandler#streamInit only returns the chunk file path and
does not seem to be used by the client
** Since isWriteStage, isWriteCommitStage, isCombinedStage are all false, the
applyTransaction does not seem to do anything useful in Ozone context
sendForward will create a new Ratis transaction, but the following
putBlockAsync also triggers a Ratis transaction.
IMO StreamInit does not seem to be useful and we can deprecate / remove it. If
we want to send the last PutBlock in streaming and remove the async
putBlockAsync, one idea is to pass a PutBlockRequest in the closeAsync instead
of in the writeAsync in executePutBlockClose, so it will trigger
StateMachine#link and also do useful works by executing PutBlock in
ContaineStateMachine#applyTransaction (unlike StreamInit). Then we can remove
the following putBlockAsync.
Attached is the debug logs I did
{code:java}
# Before closeAsync
2025-12-26 16:53:33,439 [Time-limited test] INFO storage.BlockDataStreamOutput
(BlockDataStreamOutput.java:executePutBlock(431)) - Close sent
# ContainerStateMachine#link
2025-12-26 16:53:33,475 [f9293f83-71ef-41e5-8f1b-a7e20c7ec39f-client-thread1]
INFO ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link
detected
2025-12-26 16:53:33,479 [f5e6f34c-de7d-407c-9836-080132f32b53-server-thread1]
INFO ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link
detected
2025-12-26 16:53:33,479 [331f2e49-0fad-4dea-99ac-336bd58aa221-server-thread1]
INFO ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link
detected
# ContainerStateMachine#applyTransaction (cmdType: StreamInit)
2025-12-26 16:53:33,508
[f9293f83-71ef-41e5-8f1b-a7e20c7ec39f@group-FF20505199F1-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(972)) - Container command request
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
blockID {
containerID: 1
localID: 115816896921600001
blockCommitSequenceId: 0
storageTypeID: 1
}
}
version: 3
2025-12-26 16:53:33,508
[331f2e49-0fad-4dea-99ac-336bd58aa221@group-FF20505199F1-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(972)) - Container command request
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
blockID {
containerID: 1
localID: 115816896921600001
blockCommitSequenceId: 0
storageTypeID: 1
}
}
version: 3
2025-12-26 16:53:33,508
[f5e6f34c-de7d-407c-9836-080132f32b53@group-FF20505199F1-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(972)) - Container command request
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
blockID {
containerID: 1
localID: 115816896921600001
blockCommitSequenceId: 0
storageTypeID: 1
}
}
version: 3
# StreamInit stage is COMBINED with BCSID specified
2025-12-26 17:02:33,428
[f791e008-7ea6-4821-abe3-88dcd27fbd45@group-19E545098233-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage
COMBINED, BCSID map {1=0}
2025-12-26 17:02:33,428
[34b47435-72e5-4a18-b028-9a1e90fe013d@group-19E545098233-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage
COMBINED, BCSID map {1=0}
2025-12-26 17:02:33,428
[20762959-d534-45ba-8915-0a4f356341dc@group-19E545098233-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage
COMBINED, BCSID map {1=0}
2025-12-26 17:06:15,664
[20fb6836-36e5-4eb4-9ea8-14707e46b1fb-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
INFO impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) -
isWriteStage false, isWriteCommitStage false, isCombinedStage false
2025-12-26 17:06:15,665
[7b96b7d4-dc30-4528-b9b7-a49c322fb44f-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
INFO impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) -
isWriteStage false, isWriteCommitStage false, isCombinedStage false
2025-12-26 17:06:15,666
[aa25337d-dcca-4a62-a657-031efd99e3f3-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
INFO impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) -
isWriteStage false, isWriteCommitStage false, isCombinedStage false {code}
cc: [~XiChen]
was (Author: JIRAUSER298977):
> link(..) should still be called for recording an entry into the RaftLog. It
> just that link() should not trigger PutBlock.
Got it. I tested it again and `sendForward` invokes the following StateMachine
methods
* ContainerStateMachine#link: This currently set the linked flag, which seems
to prevent LocalStream#cleanUp to close the file, since it's already closed
successfully through DataStreamManagement#close. However, this does not seem to
protect the channel entirely since if the LocalStream#cleanUp is triggered
between the StandardWriteOption.CLOSE and the ContainerStateMachine#link, the
issue might still happen.
* ContainerStateMachine#applyTransaction: This will trigger
KeyValueHandler#streamInit since the BlockDataStreamOutput#setupStream calls
DataStreamApi#stream with ContainerCommandRequestProto with type StreamInit
** The KeyValueStreamHandler#streamInit only returns the chunk file path and
does not seem to be used by the client
** Since isWriteStage, isWriteCommitStage, isCombinedStage are all false, the
applyTransaction does not seem to do anything useful in Ozone context
sendForward will create a new Ratis transaction, but the following
putBlockAsync also triggers a Ratis transaction.
IMO StreamInit does not seem to be useful and we can deprecate / remove it. If
we want to send the last PutBlock in streaming and remove the async
putBlockAsync, one idea is to pass a PutBlockRequest in the closeAsync instead
of in the writeAsync in executePutBlockClose, so it will trigger
StateMachine#link and also do useful works by executing PutBlock in
ContaineStateMachine#applyTransaction (unlike StreamInit). Then we can remove
the following putBlockAsync.
Attached is the debug logs I did
{code:java}
# Before closeAsync
2025-12-26 16:53:33,439 [Time-limited test] INFO storage.BlockDataStreamOutput
(BlockDataStreamOutput.java:executePutBlock(431)) - Close sent
# ContainerStateMachine#link
2025-12-26 16:53:33,475 [f9293f83-71ef-41e5-8f1b-a7e20c7ec39f-client-thread1]
INFO ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link
detected
2025-12-26 16:53:33,479 [f5e6f34c-de7d-407c-9836-080132f32b53-server-thread1]
INFO ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link
detected
2025-12-26 16:53:33,479 [331f2e49-0fad-4dea-99ac-336bd58aa221-server-thread1]
INFO ratis.ContainerStateMachine (ContainerStateMachine.java:link(643)) - Link
detected
# ContainerStateMachine#applyTransaction (cmdType: StreamInit)
2025-12-26 16:53:33,508
[f9293f83-71ef-41e5-8f1b-a7e20c7ec39f@group-FF20505199F1-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(972)) - Container command request
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
blockID {
containerID: 1
localID: 115816896921600001
blockCommitSequenceId: 0
storageTypeID: 1
}
}
version: 3
2025-12-26 16:53:33,508
[331f2e49-0fad-4dea-99ac-336bd58aa221@group-FF20505199F1-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(972)) - Container command request
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
blockID {
containerID: 1
localID: 115816896921600001
blockCommitSequenceId: 0
storageTypeID: 1
}
}
version: 3
2025-12-26 16:53:33,508
[f5e6f34c-de7d-407c-9836-080132f32b53@group-FF20505199F1-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(972)) - Container command request
proto cmdType: StreamInit
containerID: 1
datanodeUuid: "f9293f83-71ef-41e5-8f1b-a7e20c7ec39f"
pipelineID: "59f8a665-3a1d-48b8-91e9-ff20505199f1"
writeChunk {
blockID {
containerID: 1
localID: 115816896921600001
blockCommitSequenceId: 0
storageTypeID: 1
}
}
version: 3
# StreamInit stage is COMBINED with BCSID specified
2025-12-26 17:02:33,428
[f791e008-7ea6-4821-abe3-88dcd27fbd45@group-19E545098233-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage
COMBINED, BCSID map {1=0}
2025-12-26 17:02:33,428
[34b47435-72e5-4a18-b028-9a1e90fe013d@group-19E545098233-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage
COMBINED, BCSID map {1=0}
2025-12-26 17:02:33,428
[20762959-d534-45ba-8915-0a4f356341dc@group-19E545098233-StateMachineUpdater]
INFO ratis.ContainerStateMachine
(ContainerStateMachine.java:applyTransaction(914)) - DispatcherContext stage
COMBINED, BCSID map {1=0}
2025-12-26 17:06:15,664
[20fb6836-36e5-4eb4-9ea8-14707e46b1fb-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
INFO impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) -
isWriteStage false, isWriteCommitStage false, isCombinedStage false
2025-12-26 17:06:15,665
[7b96b7d4-dc30-4528-b9b7-a49c322fb44f-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
INFO impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) -
isWriteStage false, isWriteCommitStage false, isCombinedStage false
2025-12-26 17:06:15,666
[aa25337d-dcca-4a62-a657-031efd99e3f3-ContainerOp-6639a349-32f6-452f-9533-8c49537ceeef-2]
INFO impl.HddsDispatcher (HddsDispatcher.java:dispatchRequest(257)) -
isWriteStage false, isWriteCommitStage false, isCombinedStage false {code}
cc: [~XiChen]
> BlockDataStreamOutput should only send one PutBlock during close
> ----------------------------------------------------------------
>
> Key: HDDS-12007
> URL: https://issues.apache.org/jira/browse/HDDS-12007
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: Ozone Datanode
> Reporter: Ivan Andika
> Assignee: Tsz-wo Sze
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
>
> Currently, during close two PutBlock request will be sent
> * executePutBlockClose: This uses the DataStreamOutput#writeAsync with
> StandardWriteOption.CLOSE as part of the HDDS-6500 improvements
> ** This will call sendForward which will trigger ContainerStateMachine#link
> which will be processed like PutBlock
> * putBlockAsync: This is a normal PutBlock request which is executed per
> block boundary (similar to Write Pipeline V1)
> We should only call executePutBlockClose during close. We can use
> ClientProtoUtils#getRaftClientReply to convert from DataStreamReply to
> RaftClientReply which we can use to derive the PutBlock response.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]