The GitHub Actions job "Required Checks" on texera.git/fix/5143-iceberg-closeable-leak has succeeded. Run started by GitHub user mengw15 (triggered by mengw15).
Head commit for run:
a6911dac33201f0d62db2915c939fd3429095c98 / mengw15 <[email protected]>
fix: close CloseableIterable owners in Iceberg read paths

`IcebergUtil.readDataFileAsIterator` and five sibling sites in `IcebergDocument` shared the same anti-pattern: `closeableIterable.iterator().asScala` returned a bare Scala iterator with no reference to the parent `CloseableIterable`. Under `S3FileIO` each call leaked one `S3InputStream` (kept open until GC) plus one slot of the AWS SDK's default 50-slot Apache HTTP connection pool; after ~50 reads any new S3 read blocked indefinitely on `acquireConnection` until JVM restart.

Introduce `CloseableScalaIterator[T]` (`Iterator[T] with AutoCloseable`, idempotent `close()`) that wraps a `CloseableIterable[T]` and propagates `close()` to the parent. Change `readDataFileAsIterator` to return this wrapper. Update the `IcebergDocument` read iterator to track the close handle in a sibling `AutoCloseable` field (needed because `Iterator.drop(n)` returns a bare iterator that loses the wrapper type) and close it on file-switch, on exhaustion, and on caller-imposed `until` cap. Wrap the four eagerly-consumed `planFiles()` call sites (`getCount`, `seekToUsableFile`, `getTableStatistics`, `asInputStream`) in `Using.resource` so the metadata-side `CloseableIterable<FileScanTask>` is released promptly.

Known limitation out of scope here: if a caller of `IcebergDocument.get()` / `getRange()` / `getAfter()` stops iterating before `hasNext` returns `false`, the LAST file's wrapper still leaks until GC. Fixing that requires changing the public `Iterator[T]` return type on `VirtualDocument` to `Iterator[T] with AutoCloseable` and updating every caller, best done as a separate refactor.

Closes #5143.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Report URL: https://github.com/apache/texera/actions/runs/26275972362

With regards,
GitHub Actions via GitBox
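For readers of this notification, the wrapper described in the commit message could look roughly like the sketch below. This is not the code from the commit: it only reconstructs the stated contract (`Iterator[T] with AutoCloseable`, idempotent `close()`, `close()` propagated to the parent). `CountingSource` and `CloseableIteratorDemo` are hypothetical names, with `CountingSource` standing in for Iceberg's `CloseableIterable` so the example compiles without the Iceberg dependency.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.jdk.CollectionConverters._
import scala.util.Using

// Sketch of the wrapper: a Scala iterator that keeps a reference to its
// closeable parent and forwards close() to it exactly once.
final class CloseableScalaIterator[T](parent: java.lang.Iterable[T] with AutoCloseable)
    extends Iterator[T]
    with AutoCloseable {

  private val underlying: Iterator[T] = parent.iterator().asScala
  private val closed = new AtomicBoolean(false)

  // After close() the iterator reports itself as exhausted.
  override def hasNext: Boolean = !closed.get() && underlying.hasNext

  override def next(): T = {
    if (!hasNext) throw new NoSuchElementException("iterator is closed or exhausted")
    underlying.next()
  }

  // Idempotent: only the first call reaches the parent.
  override def close(): Unit =
    if (closed.compareAndSet(false, true)) parent.close()
}

// Hypothetical stand-in for a CloseableIterable; it counts close() calls
// so that a leak (zero calls) or a double close would be visible.
final class CountingSource(items: Seq[String])
    extends java.lang.Iterable[String]
    with AutoCloseable {
  var closeCalls: Int = 0
  override def iterator(): java.util.Iterator[String] = items.iterator.asJava
  override def close(): Unit = closeCalls += 1
}

object CloseableIteratorDemo {
  // take(2) returns a bare Iterator that no longer has the wrapper type,
  // but Using.resource still closes the owner when the block exits.
  def run(): (List[String], Int) = {
    val source = new CountingSource(Seq("a", "b", "c"))
    val rows = Using.resource(new CloseableScalaIterator[String](source)) { it =>
      it.take(2).toList
    }
    (rows, source.closeCalls)
  }
}
```

The `take(2)` call mirrors the `Iterator.drop(n)` issue noted in the commit message: derived iterators lose the `AutoCloseable` type, so the close handle has to be held separately from the iterator that is actually consumed.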
