sodonnel commented on PR #5364:
URL: https://github.com/apache/ozone/pull/5364#issuecomment-1735849785

   I recall an earlier conversation we had on Slack, where you pointed out that 
the reason the current code does not work well, is because it does this:
   
   ```
     public void update(ByteBuffer buffer) {
       if (buffer.hasArray()) {
         checksum.update(buffer.array(), buffer.position() + 
buffer.arrayOffset(),
             buffer.remaining());
       } else {
         byte[] b = new byte[buffer.remaining()];
         buffer.get(b);
         checksum.update(b, 0, b.length);
       }
     }
   ```
   
   `buffer.hasArray()` always returns false if the buffer is readOnly, and it 
our case, the buffer is created by a call to 
`writeChunk.getData().asReadOnlyByteBufferList()` which is 
`ByteString.asReadOnlyByteBufferList()`.
   
   Looking inside `KeyValueHandler.java` where the chunk writes happen on the 
Datanode and where the checksum validation happens, the code looks like:
   
   ```
         WriteChunkRequestProto writeChunk = request.getWriteChunk();
         BlockID blockID = BlockID.getFromProtobuf(writeChunk.getBlockID());
         ContainerProtos.ChunkInfo chunkInfoProto = writeChunk.getChunkData();
         ChunkInfo chunkInfo = ChunkInfo.getFromProtoBuf(chunkInfoProto);
         Preconditions.checkNotNull(chunkInfo);
   
         ChunkBuffer data = null;
         if (dispatcherContext == null) {
           dispatcherContext = new DispatcherContext.Builder().build();
         }
         WriteChunkStage stage = dispatcherContext.getStage();
         if (stage == WriteChunkStage.WRITE_DATA ||
             stage == WriteChunkStage.COMBINED) {
           data =
               
ChunkBuffer.wrap(writeChunk.getData().asReadOnlyByteBufferList());
           validateChunkChecksumData(data, chunkInfo);
   ```
   
   We can see that pull the data from the proto as a byteString and then create 
the asReadOnlyByteBufferList, which basically copies the proto into a 
ByteBuffer.
   
   Then inside `validateChunkChecksumData`, the code looks like:
   
   ```
     private void validateChunkChecksumData(ChunkBuffer data, ChunkInfo info)
         throws StorageContainerException {
       if (validateChunkChecksumData) {
         try {
           Checksum.verifyChecksum(data.toByteString(byteBufferToByteString),
               info.getChecksumData(), 0);
         } catch (OzoneChecksumException ex) {
           throw ChunkUtils.wrapInStorageContainerException(ex);
         }
       }
     }
   ```
   
   And then:
   
   ```
     public static boolean verifyChecksum(ByteString byteString,
         ChecksumData checksumData, int startIndex) throws 
OzoneChecksumException {
       final ByteBuffer buffer = byteString.asReadOnlyByteBuffer();
       return verifyChecksum(buffer, checksumData, startIndex);
   }
   ```
   
   So we start with a byteString in the protobuf. Then we convert that into a 
ByteBuffer, back to a byteString and back to a ByteBuffer and then in this PR a 
final copy from ByteBuffer to ByteBuffer!
   
   Were you seeing excessive memory usage on the DN side when writing chunks 
and with chunk.data.validation.check=true (the default is false, which is 
frankly crazy and will result in dataloss some day).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to