nastra opened a new pull request, #15439:
URL: https://github.com/apache/iceberg/pull/15439

   This uses an approach similar to what we do in the `RESTSessionCatalog` with the `FileIOTracker`: it wraps the `RESTTableScan` in a `WeakReference` and closes the attached `FileIO` instance when the `RESTTableScan` object is garbage-collected.
   I have verified that this works in combination with https://github.com/apache/iceberg/pull/15368, where the `fileIOForPlanId` wasn't closed properly without this fix, as can be seen below:
   ```
   6/02/25 09:25:19 WARN ResolvingFileIO: Unclosed ResolvingFileIO instance created by:
        org.apache.iceberg.io.ResolvingFileIO.<init>(ResolvingFileIO.java:85)
        java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
        java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
        java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
        org.apache.iceberg.common.DynConstructors$Ctor.newInstanceChecked(DynConstructors.java:51)
        org.apache.iceberg.common.DynConstructors$Ctor.newInstance(DynConstructors.java:64)
        org.apache.iceberg.CatalogUtil.loadFileIO(CatalogUtil.java:401)
        org.apache.iceberg.rest.RESTTableScan.fileIOForPlanId(RESTTableScan.java:202)
        org.apache.iceberg.rest.RESTTableScan.planTableScan(RESTTableScan.java:180)
        org.apache.iceberg.rest.RESTTableScan.planFiles(RESTTableScan.java:163)
        org.apache.iceberg.BatchScanAdapter.planFiles(BatchScanAdapter.java:125)
        org.apache.iceberg.spark.source.SparkPartitioningAwareScan.tasks(SparkPartitioningAwareScan.java:185)
        org.apache.iceberg.spark.source.SparkPartitioningAwareScan.taskGroups(SparkPartitioningAwareScan.java:213)
        org.apache.iceberg.spark.source.SparkPartitioningAwareScan.outputPartitioning(SparkPartitioningAwareScan.java:115)
        org.apache.spark.sql.execution.datasources.v2.V2ScanPartitioningAndOrdering$$anonfun$partitioning$1.applyOrElse(V2ScanPartitioningAndOrdering.scala:45)
        org.apache.spark.sql.execution.datasources.v2.V2ScanPartitioningAndOrdering$$anonfun$partitioning$1.applyOrElse(V2ScanPartitioningAndOrdering.scala:43)
        org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:491)
   ```
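
   For readers unfamiliar with the pattern, here is a minimal, JDK-only sketch of closing a resource once the object that owns it becomes unreachable. It is not the code in this PR, and Iceberg's `FileIOTracker` may be implemented differently; `OwnerTracker`, `track`, and `closeCollected` are hypothetical names chosen for illustration, with the owner standing in for a `RESTTableScan` and the resource for its `FileIO`.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracker: closes a resource after the object that owns it is garbage-collected. */
final class OwnerTracker<O> implements Closeable {
  // Collected owners have their weak references enqueued here by the JVM.
  private final ReferenceQueue<O> queue = new ReferenceQueue<>();
  // Only the weak reference is held strongly, so the owner itself stays collectable.
  private final Map<Reference<? extends O>, Closeable> resources = new ConcurrentHashMap<>();

  /** Registers a resource to be closed once the owner is no longer strongly reachable. */
  Reference<O> track(O owner, Closeable resource) {
    WeakReference<O> ref = new WeakReference<>(owner, queue);
    resources.put(ref, resource);
    return ref;
  }

  /** Closes the resources of all owners collected so far and returns how many were closed. */
  int closeCollected() {
    int closed = 0;
    Reference<? extends O> ref;
    while ((ref = queue.poll()) != null) {
      Closeable resource = resources.remove(ref);
      if (resource != null) {
        closeUnchecked(resource);
        closed++;
      }
    }
    return closed;
  }

  /** Closes every resource that is still tracked, regardless of owner reachability. */
  @Override
  public void close() {
    resources.values().forEach(OwnerTracker::closeUnchecked);
    resources.clear();
  }

  private static void closeUnchecked(Closeable resource) {
    try {
      resource.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

   Garbage collection timing is not deterministic, so in this sketch something has to call `closeCollected()` periodically, and the tracker's own `close()` remains the fallback for resources whose owners were never collected.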
   
   I'm currently checking how to properly test this in `TestRESTScanPlanning`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]