wchevreuil commented on code in PR #7685:
URL: https://github.com/apache/hbase/pull/7685#discussion_r2741624066


##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java:
##########
@@ -1611,10 +1611,24 @@ private void retrieveFromFile(int[] bucketSizes) throws IOException {
       int pblen = ProtobufMagic.lengthOfPBMagic();
       byte[] pbuf = new byte[pblen];
       IOUtils.readFully(in, pbuf, 0, pblen);
+
+      // HBASE-29857: Validate that the persistence file has data after the magic bytes.
+      // A truncated or corrupted file may only contain magic bytes without actual cache data.
+      if (in.available() == 0) {
+        throw new IOException("Persistence file appears to be truncated or corrupted. "
+          + "File contains only magic bytes without cache data: " + persistencePath);
+      }
+

Review Comment:
   Do we really need this extra check? If there's nothing after the PB magic 
in the stream, we will reach the condition where 
`BucketCacheProtos.BucketCacheEntry.parseDelimitedFrom(in)` returns null, which 
is already validated in both cases.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java:
##########
@@ -1755,6 +1755,13 @@ private void retrieveChunkedBackingMap(FileInputStream in) throws IOException {
     BucketCacheProtos.BucketCacheEntry cacheEntry =
       BucketCacheProtos.BucketCacheEntry.parseDelimitedFrom(in);
 
+    // HBASE-29857: Handle case where persistence file is empty or corrupted.
+    // parseDelimitedFrom() returns null when there's no data to read.
+    if (cacheEntry == null) {

Review Comment:
   We should rephrase the comment, as this only handles the case of empty 
files. Corrupt files would likely cause an InvalidProtocolBufferException, which 
is a subclass of IOException.
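
   To illustrate the distinction, here is a minimal, stdlib-only sketch. It is 
not the protobuf or BucketCache code: `readDelimited` and `classify` are 
hypothetical stand-ins, the length prefix is a single byte rather than a varint, 
and the 4-byte magic is only an example. It shows a clean end-of-stream right 
after the magic surfacing as `null`, while a truncated record surfaces as an 
`IOException`.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class DelimitedReadSketch {

  /**
   * Hypothetical stand-in for a length-delimited parse.
   * Returns null on a clean EOF, throws if the record is cut short.
   */
  static byte[] readDelimited(InputStream in) throws IOException {
    int len = in.read(); // assumption: one-byte length prefix, for simplicity
    if (len == -1) {
      return null; // nothing left in the stream: the "empty file" case
    }
    byte[] payload = new byte[len];
    // readFully throws EOFException (an IOException) if fewer than len bytes remain
    new DataInputStream(in).readFully(payload);
    return payload;
  }

  /** Skips magicLen bytes, then reports how the first record read turned out. */
  static String classify(byte[] fileBytes, int magicLen) {
    try (InputStream in = new ByteArrayInputStream(fileBytes)) {
      new DataInputStream(in).readFully(new byte[magicLen]);
      return readDelimited(in) == null ? "empty" : "ok";
    } catch (IOException e) {
      return "corrupt";
    }
  }

  public static void main(String[] args) {
    byte[] magicOnly = { 'P', 'B', 'U', 'F' };
    byte[] truncated = { 'P', 'B', 'U', 'F', 5, 1 }; // claims 5 bytes, has 1
    System.out.println(classify(magicOnly, 4));
    System.out.println(classify(truncated, 4));
  }
}
```

   Under these assumptions a null check covers only the magic-only file, and a 
cut-short record takes the exception path instead.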



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
