wwj6591812 commented on code in PR #8219:
URL: https://github.com/apache/paimon/pull/8219#discussion_r3410638645


##########
paimon-format/src/main/java/org/apache/paimon/format/blob/BlobFormatWriter.java:
##########
@@ -96,22 +112,35 @@ public void addElement(InternalRow element) throws IOException {
             return;
         }
 
-        long previousPos = out.getPos();
-        crc32.reset();
+        SeekableInputStream in;
+        try {
+            in = blob.newInputStream();
+        } catch (IOException | RuntimeException e) {
+            if (writeNullOnMissingFile && isNotFoundError(e)) {
+                LOG.warn(
+                        "Failed to open blob from {}, writing NULL for BLOB field {}.",
+                        blobUri(blob),
+                        blobFieldName,
+                        e);
+                writeNullElement();
+                return;
+            }
+            throw e;
+        }
 
         write(MAGIC_NUMBER_BYTES);

Review Comment:
   @JingsongLi 
   
   Good catch, you're right. The crc32.reset() call was accidentally dropped
   when I removed the in-memory staging path.
   
   I've restored crc32.reset() before writing MAGIC_NUMBER_BYTES for each
   non-null blob, so each record's CRC again covers only that record's bytes
   (matching the old streaming behavior). I also added
   testTwoConsecutiveBlobsPreserveReadback to cover writing two consecutive
   blobs to the same file.
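
The effect of the restored reset can be illustrated with a minimal, standalone sketch. It uses only `java.util.zip.CRC32`; the class `PerRecordCrcSketch` and its method `crcPerRecord` are hypothetical names chosen for illustration and are not part of `BlobFormatWriter`, whose actual record layout (magic number, length fields, and so on) is not reproduced here.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration only; this is not the Paimon implementation.
class PerRecordCrcSketch {

    // Returns one checksum per record while reusing a single CRC32 instance.
    // With resetBetween == false, each value also covers all earlier records.
    static long[] crcPerRecord(byte[][] records, boolean resetBetween) {
        CRC32 crc32 = new CRC32();
        long[] result = new long[records.length];
        for (int i = 0; i < records.length; i++) {
            if (resetBetween) {
                crc32.reset(); // start a fresh checksum for this record
            }
            crc32.update(records[i]);
            result[i] = crc32.getValue();
        }
        return result;
    }

    public static void main(String[] args) {
        byte[][] records = {
            "first blob".getBytes(StandardCharsets.UTF_8),
            "second blob".getBytes(StandardCharsets.UTF_8)
        };
        long[] withReset = crcPerRecord(records, true);
        long[] withoutReset = crcPerRecord(records, false);

        // Checksum of the second record alone, as a reader validating
        // that record in isolation would compute it.
        CRC32 expected = new CRC32();
        expected.update(records[1]);

        System.out.println("with reset matches:    " + (withReset[1] == expected.getValue()));
        System.out.println("without reset matches: " + (withoutReset[1] == expected.getValue()));
    }
}
```

In this sketch the first record's checksum is the same either way, since a new `CRC32` starts from its initial state. Only the second and later records are affected by a missing reset, which is presumably why a test with two consecutive blobs is needed to catch the regression.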
   
   Thanks again for the careful review!


