rahil-c commented on code in PR #18304:
URL: https://github.com/apache/hudi/pull/18304#discussion_r2955637177


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/HoodieSparkLanceReader.java:
##########
@@ -63,19 +72,51 @@ public class HoodieSparkLanceReader implements 
HoodieSparkFileReader {
   // number of rows to read
   private static final int DEFAULT_BATCH_SIZE = 512;
   private final StoragePath path;
+  private final BufferAllocator metadataAllocator;
+  private final LanceFileReader lanceMetadataReader;
+  private final Schema arrowSchema;
 
   public HoodieSparkLanceReader(StoragePath path) {
     this.path = path;
+    metadataAllocator = HoodieArrowAllocator.newChildAllocator(
+        getClass().getSimpleName() + "-metadata-" + path.getName(), 
LANCE_METADATA_ALLOCATOR_SIZE);
+    try {
+      lanceMetadataReader = LanceFileReader.open(path.toString(), 
metadataAllocator);
+      arrowSchema = lanceMetadataReader.schema();
+    } catch (IOException e) {
+      throw new HoodieException("Failed to create lanceMetadataReader: " + 
path, e);
+    }

Review Comment:
   @wombatu-kun @voonhous 
   
   The issue is, if `LanceFileReader.open` succeeds but `schema()` throws, the 
opened lance file reader is never closed and has a reference to the allocator. 
Also, `non-IOException` exceptions bypass cleanup entirely so thinking we 
should catch a general `Exeception`.
                                                                                
                 
   ```
    try {                                                                       
                             
       lanceMetadataReader = LanceFileReader.open(path.toString(), 
metadataAllocator);
       arrowSchema = lanceMetadataReader.schema();                              
                              
     } catch (Exception e) {
       close();                                                                 
                              
       throw new HoodieException("Failed to create lanceMetadataReader: " + 
path, e);
     }    
   ```
   
   Since `lanceMetadataReader` is assigned before schema() is called, and 
`close()` already handles both        
    resources in the correct order with null checks, we can just call `close()` 
directly. If `open()` itself
    fails, lanceMetadataReader stays null and `close()` skips it, only closing 
the allocator.      



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to