Siyao Meng created HDDS-10411:
---------------------------------
Summary: Support incremental ChunkBuffer checksum calculation
Key: HDDS-10411
URL: https://issues.apache.org/jira/browse/HDDS-10411
Project: Apache Ozone
Issue Type: Sub-task
Reporter: Siyao Meng
Assignee: Siyao Meng
h3. Goal
Calculate the ChunkBuffer (ByteBuffer) checksum incrementally rather than
recalculating it from scratch on every call to {{writeChunkToContainer}}.
---
h3. Background
Currently, the ChunkBuffer (ByteBuffer) checksum is always calculated from scratch.
As can be seen in {{newChecksumByteBufferFunction}}, it always calls
{{reset()}} before feeding data with {{update()}}:
{code:title=[newChecksumByteBufferFunction|https://github.com/apache/ozone/blob/5f0925e190f1dbf2d4617daad8ad42401d5079e1/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/Checksum.java#L67-L68]}
private static Function<ByteBuffer, ByteString> newChecksumByteBufferFunction(
    Supplier<ChecksumByteBuffer> constructor) {
  final ChecksumByteBuffer algorithm = constructor.get();
  return data -> {
    algorithm.reset();
    algorithm.update(data);
    return int2ByteString((int)algorithm.getValue());
  };
}
{code}
Each ByteBuffer (4 MB by default) inside a block's ChunkBuffer gets its
checksum calculated here:
{code:title=https://github.com/apache/ozone/blob/5f0925e190f1dbf2d4617daad8ad42401d5079e1/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/Checksum.java#L171-L177}
// Checksum is computed for each bytesPerChecksum number of bytes of data
// starting at offset 0. The last checksum might be computed for the
// remaining data with length less than bytesPerChecksum.
final List<ByteString> checksumList = new ArrayList<>();
for (ByteBuffer b : data.iterate(bytesPerChecksum)) {
  checksumList.add(computeChecksum(b, function, bytesPerChecksum));
}
{code}
which is called from
[{{BlockOutputStream#writeChunkToContainer}}|https://github.com/apache/ozone/blob/f0b75b7e4ee93e89f9e4fc96cb30d59f78746eb5/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockOutputStream.java#L697].
When the function is applied in the inner {{computeChecksum}}, it always
calls {{reset()}} first, so the checksum is recalculated over the whole
ByteBuffer from offset 0, even if only a few bytes were appended since the
previous call.
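To illustrate the idea, here is a minimal sketch (not the Ozone implementation) of why the reset is avoidable for an append-only buffer: a streaming CRC fed only the newly appended bytes yields the same value as one recomputed from offset 0. It uses {{java.util.zip.CRC32}} as a stand-in for Ozone's {{ChecksumByteBuffer}}; the names {{IncrementalCrcSketch}}, {{IncrementalChecksum}}, {{written}} and {{fromScratch}} are hypothetical and exist only for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class IncrementalCrcSketch {

  /**
   * Hypothetical helper: keeps the CRC state between calls and remembers
   * how many bytes of the buffer have already been fed to it.
   * Assumes the buffer is append-only (earlier bytes never change).
   */
  static final class IncrementalChecksum {
    private final CRC32 crc = new CRC32();
    private int bytesConsumed = 0;

    /** Feeds only bytes [bytesConsumed, limit) of the given readable view. */
    long update(ByteBuffer data) {
      ByteBuffer view = data.duplicate(); // leave the caller's position alone
      view.position(bytesConsumed);       // skip what was already checksummed
      bytesConsumed = view.limit();
      crc.update(view);
      return crc.getValue();
    }
  }

  /** Mirrors the current behaviour: reset, then checksum the whole view. */
  static long fromScratch(CRC32 crc, ByteBuffer data) {
    crc.reset();
    crc.update(data.duplicate());
    return crc.getValue();
  }

  /** Readable view (position 0, limit = bytes written so far) of a buffer being filled. */
  static ByteBuffer written(ByteBuffer buf) {
    ByteBuffer view = buf.duplicate();
    view.flip();
    return view;
  }

  public static void main(String[] args) {
    ByteBuffer chunk = ByteBuffer.allocate(32);
    CRC32 scratch = new CRC32();
    IncrementalChecksum incremental = new IncrementalChecksum();

    // First write followed by a checksum (e.g. at an hsync).
    chunk.put("hello ".getBytes(StandardCharsets.US_ASCII));
    long full1 = fromScratch(scratch, written(chunk));
    long inc1 = incremental.update(written(chunk));

    // A few more bytes appended, then another checksum.
    chunk.put("world".getBytes(StandardCharsets.US_ASCII));
    long full2 = fromScratch(scratch, written(chunk)); // re-reads all 11 bytes
    long inc2 = incremental.update(written(chunk));    // reads only the 5 new bytes

    System.out.println("match after first write:  " + (full1 == inc1));
    System.out.println("match after second write: " + (full2 == inc2));
  }
}
```

The sketch ignores everything that makes the real change harder: splitting by {{bytesPerChecksum}}, invalidating the cached state when a buffer is reused or rewritten, and the other checksum types. It only shows that the per-call {{reset()}} is not required for correctness when data is appended.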
h3. Motivation
This may not have been a big issue before Ozone {{hsync()}} was implemented,
and it may not be one in HDFS (where each chunk is much smaller, at 64 KB by
default). With hsync, however, it can contribute ~10% of the hsync latency
between client and DN if the client appends only a few bytes between hsyncs,
as can be seen from [~weichiu]'s flame graph.
The estimated latency improvement from this change is 0% to 20%, depending on
the client write/hsync pattern.