rahil-c commented on code in PR #18304:
URL: https://github.com/apache/hudi/pull/18304#discussion_r2955637177
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/HoodieSparkLanceReader.java:
##########
@@ -63,19 +72,51 @@ public class HoodieSparkLanceReader implements
HoodieSparkFileReader {
// number of rows to read
private static final int DEFAULT_BATCH_SIZE = 512;
private final StoragePath path;
+ private final BufferAllocator metadataAllocator;
+ private final LanceFileReader lanceMetadataReader;
+ private final Schema arrowSchema;
public HoodieSparkLanceReader(StoragePath path) {
this.path = path;
+ metadataAllocator = HoodieArrowAllocator.newChildAllocator(
+ getClass().getSimpleName() + "-metadata-" + path.getName(),
LANCE_METADATA_ALLOCATOR_SIZE);
+ try {
+ lanceMetadataReader = LanceFileReader.open(path.toString(),
metadataAllocator);
+ arrowSchema = lanceMetadataReader.schema();
+ } catch (IOException e) {
+ throw new HoodieException("Failed to create lanceMetadataReader: " +
path, e);
+ }
Review Comment:
@wombatu-kun @voonhous
The issue is, if `LanceFileReader.open` succeeds but `schema()` throws, the
opened lance file reader is never closed and has a reference to the allocator.
Also, `non-IOException` exceptions bypass cleanup entirely so thinking we
should catch a general `Exeception`.
```
try {
lanceMetadataReader = LanceFileReader.open(path.toString(),
metadataAllocator);
arrowSchema = lanceMetadataReader.schema();
} catch (Exception e) {
close();
throw new HoodieException("Failed to create lanceMetadataReader: " +
path, e);
}
```
Since `lanceMetadataReader` is assigned before schema() is called, and
`close()` already handles both
resources in the correct order with null checks, we can just call `close()`
directly. If `open()` itself
fails, lanceMetadataReader stays null and `close()` skips it, only closing
the allocator.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]