[ https://issues.apache.org/jira/browse/HDDS-10821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duong updated HDDS-10821:
-------------------------
Description:
It's very rare, but I got the following error in the Datanode's handling of writeChunk.
{code:java}
org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: Unexpected write size. expected: 4194304, actual: 4145507
    at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.checkSize(ChunkUtils.java:453)
    at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.validateWriteSize(ChunkUtils.java:438)
    at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.writeData(ChunkUtils.java:153)
    at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.writeData(ChunkUtils.java:121)
    at org.apache.hadoop.ozone.container.keyvalue.impl.FilePerBlockStrategy.writeChunk(FilePerBlockStrategy.java:167)
    at org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerDispatcher.writeChunk(ChunkManagerDispatcher.java:75)
    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleWriteChunk(KeyValueHandler.java:802)
    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.dispatchRequest(KeyValueHandler.java:263)
    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:222)
    at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:345)
    at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.lambda$dispatch$0(HddsDispatcher.java:193)
    at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:89)
    at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:192)
    at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:486)
    at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$writeStateMachineData$3(ContainerStateMachine.java:542)
{code}
The error comes from the following code in [ChunkUtils.writeData|https://github.com/apache/ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/ChunkUtils.java#L125-L125]:
{code:java}
private static void writeData(ChunkBuffer data, String filename,
    long offset, long len, HddsVolume volume,
    ToLongFunction<ChunkBuffer> writer) throws StorageContainerException {
  // Verify the incoming buffer actually holds len bytes.
  validateBufferSize(len, data.remaining());
  ....
  final long bytesWritten;
  try {
    bytesWritten = writer.applyAsLong(data);
  } catch (UncheckedIOException e) {
    ...
    throw wrapInStorageContainerException(e.getCause());
  }
  ...
  // Verify everything was written; this is the check that fails in the trace above.
  validateWriteSize(len, bytesWritten);
} {code}
The error indicates that the DN received a buffer (data) of sufficient size (it passed validateBufferSize), wrote it using FileChannel.write, and then found that the buffer had not been fully written.
Although the Javadoc is somewhat vague, it's generally agreed that FileChannel.write does not guarantee writing the entire buffer content in one call. A write loop is needed, as sketched after the links below.
[https://stackoverflow.com/questions/29945685/will-filechannelwrite-always-write-the-whole-buffer]
[https://stackoverflow.com/questions/29002366/filechannel-write-incomplete]
[https://github.com/elastic/elasticsearch/blob/75b5efede488f130e5fafc25bae7a648772ffdc4/server/src/main/java/org/elasticsearch/common/io/Channels.java#L178]
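For illustration, here is a minimal sketch of such a write loop, in the spirit of the Elasticsearch helper linked above. The class and method names (WriteUtils.writeFully) are hypothetical and not the actual Ozone patch:
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public final class WriteUtils {

  private WriteUtils() {
  }

  /**
   * Writes all remaining bytes of {@code buffer} to {@code channel} starting
   * at {@code offset}. A single FileChannel.write call may write fewer bytes
   * than buffer.remaining(), so we loop until the buffer is drained.
   *
   * @return the total number of bytes written
   */
  public static long writeFully(FileChannel channel, ByteBuffer buffer,
      long offset) throws IOException {
    long written = 0;
    while (buffer.hasRemaining()) {
      // The positional write leaves the channel's own position untouched but
      // advances the buffer's position by the number of bytes written.
      written += channel.write(buffer, offset + written);
    }
    return written;
  }
}
{code}
With a helper like this, the bytesWritten value returned to writeData would always equal the requested length, and validateWriteSize would no longer trip on a short write.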
> Ensure Ozone writes all chunk buffer content to FileChannel
> ------------------------------------------------------------
>
> Key: HDDS-10821
> URL: https://issues.apache.org/jira/browse/HDDS-10821
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Duong
> Assignee: Duong
> Priority: Major
>