the-other-tim-brown commented on code in PR #13977:
URL: https://github.com/apache/hudi/pull/13977#discussion_r2377001956


##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileWriterImpl.java:
##########
@@ -92,16 +92,17 @@ public void append(String key, byte[] value) throws IOException {
     totalKeyLength += keyBytes.length;
     totalValueLength += value.length;
     // Records with the same key must be put into the same block.
-    // Here 9 = 4 bytes of key length + 4 bytes of value length + 1 byte MVCC.
+    // Here 9 = 4 bytes of key length + 4 bytes of value length + 1 byte MVCC

Review Comment:
   ```suggestion
    // Here 19 = 4 bytes of key length + 4 bytes of value length + 1 byte MVCC
   ```



##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileWriterImpl.java:
##########
@@ -92,16 +92,17 @@ public void append(String key, byte[] value) throws IOException {
     totalKeyLength += keyBytes.length;
     totalValueLength += value.length;
     // Records with the same key must be put into the same block.
-    // Here 9 = 4 bytes of key length + 4 bytes of value length + 1 byte MVCC.
+    // Here 9 = 4 bytes of key length + 4 bytes of value length + 1 byte MVCC
+    //          + 10 bytes (check sum, timestamp, key type).
     if (!Arrays.equals(currentDataBlock.getLastKeyContent(), keyBytes)
-        && uncompressedDataBlockBytes + keyBytes.length + value.length + 9 > blockSize) {
+        && uncompressedDataBlockBytes + keyBytes.length + value.length + 19 > blockSize) {
       flushCurrentDataBlock();
       uncompressedDataBlockBytes = 0;
     }
     currentDataBlock.add(keyBytes, value);
     int uncompressedKeyValueSize = keyBytes.length + value.length;
-    uncompressedDataBlockBytes += uncompressedKeyValueSize + 9;
-    totalUncompressedDataBlockBytes += uncompressedKeyValueSize + 9;
+    uncompressedDataBlockBytes += uncompressedKeyValueSize + 19;

Review Comment:
   Should we add `19` as a constant in the file?
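   A minimal sketch of what the reviewer is asking for: pull the magic number `19` into a named constant so the comment and the arithmetic stay in sync. The class and constant names below are hypothetical, not from the PR; the byte breakdown is taken from the diff's comment.

   ```java
   public class OverheadConstantSketch {
     // Per key-value record overhead, per the diff's comment:
     // 4 bytes key length + 4 bytes value length + 1 byte MVCC
     // + 10 bytes (checksum, timestamp, key type)
     static final int PER_KEY_VALUE_OVERHEAD = 4 + 4 + 1 + 10; // = 19

     // Hypothetical helper mirroring the size check in append()
     static int uncompressedSize(byte[] keyBytes, byte[] value) {
       return keyBytes.length + value.length + PER_KEY_VALUE_OVERHEAD;
     }

     public static void main(String[] args) {
       System.out.println(uncompressedSize(new byte[3], new byte[5])); // 3 + 5 + 19 = 27
     }
   }
   ```

   With a constant, a future change to the overhead only touches one place instead of every comparison and accumulator.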



##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileDataBlock.java:
##########
@@ -207,40 +208,35 @@ byte[] getLastKeyContent() {
   }
 
   @Override
-  protected ByteBuffer getUncompressedBlockDataToWrite() {
-    ByteArrayOutputStream baos = new ByteArrayOutputStream();
-    ByteBuffer dataBuf = ByteBuffer.allocate(context.getBlockSize());
+  protected ByteBuffer getUncompressedBlockDataToWrite() throws IOException {
+    ByteArrayOutputStream dataBuf = new ByteArrayOutputStream();

Review Comment:
   can we initialize this with an initial buffer size?
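   For context, `ByteArrayOutputStream` accepts an initial capacity, which avoids repeated internal array copies as the block fills. A small sketch, assuming the block size would come from `context.getBlockSize()` (the `64 * 1024` value here is just a placeholder):

   ```java
   import java.io.ByteArrayOutputStream;

   public class InitialSizeSketch {
     public static void main(String[] args) {
       int blockSize = 64 * 1024; // placeholder for context.getBlockSize()
       // Pre-sizing the buffer avoids grow-and-copy cycles while the block
       // fills; the stream still grows automatically if the estimate is low.
       ByteArrayOutputStream dataBuf = new ByteArrayOutputStream(blockSize);
       dataBuf.write(1);
       System.out.println(dataBuf.size()); // 1
     }
   }
   ```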



##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileFileInfoBlock.java:
##########
@@ -87,8 +88,8 @@ public boolean containsKey(String name) {
   }
 
   @Override
-  public ByteBuffer getUncompressedBlockDataToWrite() {
-    ByteBuffer buff = ByteBuffer.allocate(context.getBlockSize() * 2);
+  public ByteBuffer getUncompressedBlockDataToWrite() throws IOException {
+    ByteArrayOutputStream buff = new ByteArrayOutputStream();

Review Comment:
   Similarly here



##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileMetaBlock.java:
##########
@@ -56,12 +58,11 @@ public byte[] getFirstKey() {
   }
 
   @Override
-  public ByteBuffer getUncompressedBlockDataToWrite() {
-    ByteBuffer dataBuf = ByteBuffer.allocate(context.getBlockSize());
+  public ByteBuffer getUncompressedBlockDataToWrite() throws IOException {
+    ByteArrayOutputStream dataBuf = new ByteArrayOutputStream();

Review Comment:
   Can we just simplify this whole method to `return ByteBuffer.wrap(entryToWrite.value);`?
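   For reference, `ByteBuffer.wrap` creates a buffer backed by the existing array with no copy, positioned at 0 with the limit set to the array length, so no manual allocation or `put` calls are needed. A standalone illustration (the array stands in for `entryToWrite.value`):

   ```java
   import java.nio.ByteBuffer;

   public class WrapSketch {
     public static void main(String[] args) {
       byte[] value = {10, 20, 30}; // stands in for entryToWrite.value
       // wrap() shares the backing array: zero-copy, position 0, limit 3.
       ByteBuffer buf = ByteBuffer.wrap(value);
       System.out.println(buf.remaining()); // 3
     }
   }
   ```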
   



##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileMetaIndexBlock.java:
##########
@@ -33,27 +34,27 @@ public static HFileMetaIndexBlock createMetaIndexBlockToWrite(HFileContext conte
   }
 
   @Override
-  public ByteBuffer getUncompressedBlockDataToWrite() {
-    ByteBuffer buf = ByteBuffer.allocate(context.getBlockSize() * 2);
+  public ByteBuffer getUncompressedBlockDataToWrite() throws IOException {
+    ByteArrayOutputStream buf = new ByteArrayOutputStream();

Review Comment:
   Same here with the initial size



##########
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileBlock.java:
##########
@@ -267,40 +267,40 @@ public ByteBuffer serialize() throws IOException {
     // Compress if specified.
    ByteBuffer compressedBlockData = context.getCompressor().compress(uncompressedBlockData);
     // Buffer for building block.
-    ByteBuffer buf = ByteBuffer.allocate(Math.max(
-        context.getBlockSize() * 2,
+    ByteArrayOutputStream buf = new ByteArrayOutputStream(Math.max(
+        context.getBlockSize(),
         compressedBlockData.limit() + HFILEBLOCK_HEADER_SIZE * 2));
 
     // Block header
     // 1. Magic is always 8 bytes.
-    buf.put(blockType.getMagic(), 0, 8);
+    buf.write(blockType.getMagic(), 0, 8);
     // 2. onDiskSizeWithoutHeader.
     int compressedDataSize = compressedBlockData.limit();
     int onDiskDataSizeWithHeader = HFileBlock.HFILEBLOCK_HEADER_SIZE + compressedDataSize;
     int numChecksumBytes = numChecksumBytes(onDiskDataSizeWithHeader, DEFAULT_BYTES_PER_CHECKSUM);
-    buf.putInt(compressedDataSize + numChecksumBytes);
+    HFileUtils.writeInt(buf, compressedDataSize + numChecksumBytes);

Review Comment:
   If we use `DataOutputStream` to wrap the `ByteArrayOutputStream` you will get all these utilities instead of writing your own
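   A sketch of the suggested pattern: `DataOutputStream` already provides `writeInt`, `writeLong`, etc., writing big-endian bytes just like `ByteBuffer.putInt`, so a hand-rolled `HFileUtils.writeInt` becomes unnecessary. The sizes below are illustrative, not taken from the PR:

   ```java
   import java.io.ByteArrayOutputStream;
   import java.io.DataOutputStream;
   import java.io.IOException;

   public class DataOutputStreamSketch {
     public static void main(String[] args) throws IOException {
       ByteArrayOutputStream baos = new ByteArrayOutputStream(32);
       DataOutputStream out = new DataOutputStream(baos);
       out.writeInt(42);               // 4 bytes, big-endian, like ByteBuffer.putInt
       out.writeLong(7L);              // 8 bytes
       out.write(new byte[] {1, 2});   // raw bytes pass straight through
       out.flush();
       System.out.println(baos.size()); // 4 + 8 + 2 = 14
     }
   }
   ```

   `DataOutputStream` just decorates the underlying stream, so `baos.toByteArray()` still yields the accumulated block bytes.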



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to