Siyao Meng created HDDS-10411:
---------------------------------

             Summary: Support incremental ChunkBuffer checksum calculation
                 Key: HDDS-10411
                 URL: https://issues.apache.org/jira/browse/HDDS-10411
             Project: Apache Ozone
          Issue Type: Sub-task
            Reporter: Siyao Meng
            Assignee: Siyao Meng


h3. Goal

Calculate ChunkBuffer (ByteBuffer) checksums incrementally rather than 
recalculating them from scratch every time in {{writeChunkToContainer}}.

---

h3. Background

Currently, the ChunkBuffer (ByteBuffer) checksum is always calculated from scratch, 
as can be seen in {{newChecksumByteBufferFunction}}, which always calls 
{{reset()}} before feeding data with {{update()}}:

{code:title=[newChecksumByteBufferFunction|https://github.com/apache/ozone/blob/5f0925e190f1dbf2d4617daad8ad42401d5079e1/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/Checksum.java#L67-L68]}
  private static Function<ByteBuffer, ByteString> newChecksumByteBufferFunction(
      Supplier<ChecksumByteBuffer> constructor) {
    final ChecksumByteBuffer algorithm = constructor.get();
    return  data -> {
      algorithm.reset();
      algorithm.update(data);
      return int2ByteString((int)algorithm.getValue());
    };
  }
{code}

Each ByteBuffer (4 MB by default) inside a block's ChunkBuffer gets its 
checksum calculated here:

{code:title=https://github.com/apache/ozone/blob/5f0925e190f1dbf2d4617daad8ad42401d5079e1/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/Checksum.java#L171-L177}
    // Checksum is computed for each bytesPerChecksum number of bytes of data
    // starting at offset 0. The last checksum might be computed for the
    // remaining data with length less than bytesPerChecksum.
    final List<ByteString> checksumList = new ArrayList<>();
    for (ByteBuffer b : data.iterate(bytesPerChecksum)) {
      checksumList.add(computeChecksum(b, function, bytesPerChecksum));
    }
{code}

which is called from 
[{{BlockOutputStream#writeChunkToContainer}}|https://github.com/apache/ozone/blob/f0b75b7e4ee93e89f9e4fc96cb30d59f78746eb5/hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockOutputStream.java#L697].

When the function is applied in the inner {{computeChecksum}}, it always calls 
{{reset()}} first, so it recalculates the checksum of the whole ByteBuffer from 
offset 0, even if most of that data was already checksummed on a previous call.
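A rough sketch of the incremental alternative is below. It uses 
{{java.util.zip.CRC32}} as a stand-in for Ozone's {{ChecksumByteBuffer}} 
implementations, and the class and method names ({{IncrementalChunkChecksum}}, 
{{snapshot()}}) are hypothetical, not the actual Ozone API. The idea: keep the 
running checksum of the last partial {{bytesPerChecksum}} segment and feed it 
only newly appended bytes; once a segment fills up, its checksum is finalized 
and never recomputed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

public class IncrementalChecksumSketch {

  /** Hypothetical incremental checksum holder; not the actual Ozone API. */
  static class IncrementalChunkChecksum {
    private final int bytesPerChecksum;
    private final List<Long> finalized = new ArrayList<>();
    private CRC32 current = new CRC32();
    private int bytesInCurrent = 0;

    IncrementalChunkChecksum(int bytesPerChecksum) {
      this.bytesPerChecksum = bytesPerChecksum;
    }

    /** Feed only the newly appended bytes; earlier work is never redone. */
    void update(byte[] data, int off, int len) {
      while (len > 0) {
        int n = Math.min(len, bytesPerChecksum - bytesInCurrent);
        current.update(data, off, n);
        bytesInCurrent += n;
        off += n;
        len -= n;
        if (bytesInCurrent == bytesPerChecksum) {
          // Segment is full: its checksum is final from now on.
          finalized.add(current.getValue());
          current = new CRC32();
          bytesInCurrent = 0;
        }
      }
    }

    /** Checksums of all full segments, plus the partial tail if any. */
    List<Long> snapshot() {
      List<Long> out = new ArrayList<>(finalized);
      if (bytesInCurrent > 0) {
        out.add(current.getValue()); // getValue() does not reset the CRC
      }
      return out;
    }
  }

  public static void main(String[] args) {
    byte[] data = new byte[10];
    for (int i = 0; i < data.length; i++) {
      data[i] = (byte) i;
    }

    // Simulate two small appends between hsyncs, bytesPerChecksum = 4.
    IncrementalChunkChecksum inc = new IncrementalChunkChecksum(4);
    inc.update(data, 0, 3);
    inc.update(data, 3, 7);

    // Recompute every segment from scratch (the current behavior) to compare.
    List<Long> fromScratch = new ArrayList<>();
    for (int off = 0; off < data.length; off += 4) {
      CRC32 c = new CRC32();
      c.update(data, off, Math.min(4, data.length - off));
      fromScratch.add(c.getValue());
    }

    System.out.println(inc.snapshot().equals(fromScratch)); // prints "true"
  }
}
```

With this shape, an hsync that appends a few bytes only pushes those bytes 
through the checksum, instead of re-hashing up to 4 MB per ByteBuffer.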

h3. Motivation

While this was not a big issue before Ozone {{hsync()}} was implemented, or in 
HDFS (where each chunk is much smaller, at 64 KB by default), it can now 
contribute roughly 10% of the client-to-DataNode hsync latency when the client 
appends only a few bytes between hsyncs, as shown in [~weichiu]'s flame graph.

The estimated latency improvement from this change is 0% to 20%, depending on 
the client's write/hsync pattern.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
