suxiaogang223 opened a new issue, #3313:
URL: https://github.com/apache/paimon/issues/3313

   ### Search before asking
   
   - [X] I searched in the [issues](https://github.com/apache/paimon/issues) 
and found nothing similar.
   
   
   ### Paimon version
   
   0.8-SNAPSHOT
   
   ### Compute Engine
   
   JavaAPI
   
   ### Minimal reproduce step
   
   Nothing to do
   
   ### What doesn't meet your expectations?
   
   I'm trying to support deletion vectors in Doris' PaimonNativeReader. When I 
use the offset and length from DeletionFile to read the content of the HDFS 
file locally, deserializing that content into a RoaringBitmap fails. I found 
that the correct way is to read length + 4 bytes starting at the offset.
   I guess the extra 4 bytes hold the serialized length of the DeletionVector, 
which is written in front of it when the DeletionVector is stored in the index 
file. Paimon's own read path reads that int first and compares it with 
DeletionFile.length():
   ```java
       static DeletionVector read(FileIO fileIO, DeletionFile deletionFile) throws IOException {
           Path path = new Path(deletionFile.path());
           try (SeekableInputStream input = fileIO.newInputStream(path)) {
               input.seek(deletionFile.offset());
               DataInputStream dis = new DataInputStream(input);
               int actualLength = dis.readInt();
               if (actualLength != deletionFile.length()) {
                   throw new RuntimeException(
                           "Size not match, actual size: "
                                   + actualLength
                                   + ", expert size: "
                                   + deletionFile.length()
                                   + ", file path: "
                                   + path);
               }
               int magicNum = dis.readInt();
               if (magicNum == BitmapDeletionVector.MAGIC_NUMBER) {
                   return BitmapDeletionVector.deserializeFromDataInput(dis);
               } else {
                   throw new RuntimeException("Invalid magic number: " + magicNum);
               }
           }
       }
   ```
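   Maybe DeletionFile should either add 4 to length (so that it covers the 
length prefix) or add 4 to offset (so that it points past the prefix), because 
the current values are very confusing for external readers.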
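
   To make the layout concrete, here is a minimal sketch of what an external 
reader has to do today. It is not Paimon code: the class name, method names and 
PLACEHOLDER_MAGIC are hypothetical, and the layout (a 4-byte big-endian length 
prefix, then a 4-byte magic number plus the bitmap bytes, with length covering 
only the part after the prefix) is an assumption inferred from the read method 
quoted above.

   ```java
   import java.nio.ByteBuffer;
   import java.util.Arrays;

   // Hypothetical stand-in for one deletion vector entry in an index file.
   class DeletionVectorEntrySketch {
       // Size of the int written in front of the serialized deletion vector.
       static final int LENGTH_PREFIX_BYTES = Integer.BYTES;
       // Placeholder only; the real value is BitmapDeletionVector.MAGIC_NUMBER.
       static final int PLACEHOLDER_MAGIC = 0x0badc0de;

       // Number of bytes a reader must fetch starting at DeletionFile.offset().
       static int bytesToFetch(int deletionFileLength) {
           return LENGTH_PREFIX_BYTES + deletionFileLength;
       }

       // Writes [length][magic][bitmap], where length = 4 + bitmap.length (assumed).
       static byte[] writeEntry(byte[] bitmapBytes) {
           int length = Integer.BYTES + bitmapBytes.length;
           ByteBuffer buf = ByteBuffer.allocate(LENGTH_PREFIX_BYTES + length);
           buf.putInt(length);            // length prefix, not counted in length
           buf.putInt(PLACEHOLDER_MAGIC); // counted in length
           buf.put(bitmapBytes);          // counted in length
           return buf.array();
       }

       // Reads the bitmap bytes of the entry that starts at offset.
       static byte[] readBitmapBytes(byte[] file, int offset, int length) {
           // Reading only `length` bytes here would cut off the last 4 bitmap bytes.
           ByteBuffer buf = ByteBuffer.wrap(file, offset, bytesToFetch(length));
           int actualLength = buf.getInt();
           if (actualLength != length) {
               throw new IllegalStateException(
                       "Size mismatch, actual: " + actualLength + ", expected: " + length);
           }
           int magic = buf.getInt();
           if (magic != PLACEHOLDER_MAGIC) {
               throw new IllegalStateException("Invalid magic number: " + magic);
           }
           byte[] bitmapBytes = new byte[buf.remaining()];
           buf.get(bitmapBytes);
           return bitmapBytes;
       }

       public static void main(String[] args) {
           byte[] bitmap = {1, 2, 3, 4, 5};
           byte[] file = writeEntry(bitmap);
           int length = Integer.BYTES + bitmap.length; // what DeletionFile.length() would report
           byte[] roundTrip = readBitmapBytes(file, 0, length);
           System.out.println(Arrays.equals(bitmap, roundTrip));
       }
   }
   ```

   If length included the prefix, or offset pointed past it, bytesToFetch would 
simply return length and the reader would not need to know about the extra int.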
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
