sejal-gupta-ksolves opened a new pull request, #16768: URL: https://github.com/apache/iceberg/pull/16768
### Closes: #16758

### Problem

When downstream query engines (such as StarRocks, Trino, or Spark) cancel or abort a REST table scan early because of client disconnects, timeouts, or query limits, they trigger the cleanup sequence on the outer execution container. In Apache Iceberg, `ScanTaskIterable.close()` was implemented as an empty no-op method. Because this outer `close()` call did not propagate the shutdown signal to the underlying data structures:

- The internal `shutdown` atomic flag remained `false`.
- Background `PlanTaskWorker` threads continued running indefinitely.
- Once the internal `taskQueue` reached its capacity limit of `1000` items, all active worker threads remained permanently stuck inside `offerWithTimeout()`, leading to thread pool exhaustion on the engine coordinator side.

### Solution

1. **Implemented State Tracking and Cleanup:** Added a thread-safe guard inside `ScanTaskIterable.close()` using `shutdown.compareAndSet(false, true)`, so the cleanup runs at most once.
2. **Queue Eviction:** Updated the close block to explicitly clear the `taskQueue`, `planTasks`, and `initialFileScanTasks` collections on termination. This lets background threads stuck waiting in an `offer` call unblock, observe the flipped shutdown state, and exit gracefully.
3. **Unified Iterator Lifecycle:** Refactored the internal `ScanTasksIterator.close()` to remove duplicated cleanup code; it now delegates to `ScanTaskIterable.this.close()`. This ensures consistent thread termination across all entry points.
4. **Regression Test Addition:** Added `TestScanTaskIterableLeak` under the `org.apache.iceberg.rest` test package, which verifies that the number of active planning threads drops back to `0` after premature termination.

### Verification Testing

```bash
# 1. Format the code using Spotless rules
./gradlew spotlessApply

# 2. Run compile-time static checks on the modified module
./gradlew :iceberg-core:compileJava :iceberg-core:compileTestJava

# 3. Verify that the core build passes and run the regression test
./gradlew :iceberg-core:test --tests "org.apache.iceberg.rest.TestScanTaskIterableLeak" --info
```
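
For readers unfamiliar with the shutdown pattern described under Solution, the following is a minimal, self-contained sketch of it. It is not the Iceberg implementation: `ClosableTaskSource`, `startWorker`, `activeWorkers`, and `awaitWorkers` are hypothetical names chosen for illustration, and the queue capacity and the 50 ms offer timeout are assumptions. It only shows how an idempotent `close()` guarded by `compareAndSet(false, true)`, combined with clearing a bounded queue, lets producers that use a timed `offer` notice the flag and exit.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the pattern; NOT Iceberg's ScanTaskIterable. */
final class ClosableTaskSource implements AutoCloseable {
  private final BlockingQueue<Integer> taskQueue;
  private final AtomicBoolean shutdown = new AtomicBoolean(false);
  private final AtomicInteger activeWorkers = new AtomicInteger(0);
  private final ExecutorService pool = Executors.newFixedThreadPool(2);

  ClosableTaskSource(int capacity) {
    this.taskQueue = new ArrayBlockingQueue<>(capacity);
  }

  /** Starts a producer that keeps offering items until shutdown is observed. */
  void startWorker() {
    activeWorkers.incrementAndGet();
    pool.execute(() -> {
      try {
        int next = 0;
        while (!shutdown.get()) {
          // Timed offer: a full queue delays the worker but cannot hide
          // the shutdown flag from it for longer than the timeout.
          if (taskQueue.offer(next, 50, TimeUnit.MILLISECONDS)) {
            next++;
          }
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        activeWorkers.decrementAndGet();
      }
    });
  }

  int activeWorkers() {
    return activeWorkers.get();
  }

  /** Idempotent: only the first caller flips the flag and performs cleanup. */
  @Override
  public void close() {
    if (shutdown.compareAndSet(false, true)) {
      taskQueue.clear(); // free capacity so a waiting offer can return
      pool.shutdown();   // accept no new workers; running ones exit on the flag
    }
  }

  /** Waits for workers to finish; returns false on timeout or interruption. */
  boolean awaitWorkers(long timeoutMillis) {
    try {
      return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    ClosableTaskSource source = new ClosableTaskSource(2);
    source.startWorker();
    source.startWorker();
    source.close(); // simulates an engine aborting the scan early
    boolean exited = source.awaitWorkers(2000);
    System.out.println("workers exited: " + exited
        + ", active: " + source.activeWorkers());
  }
}
```

In this sketch the flag is set before the queue is cleared, so a worker whose `offer` succeeds after the clear still sees `shutdown == true` on its next loop check and stops. A second `close()` call is a no-op because `compareAndSet` fails once the flag is already `true`.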
