sejal-gupta-ksolves opened a new pull request, #16768:
URL: https://github.com/apache/iceberg/pull/16768

   ### Closes: #16758 
   
   ### Problem
   When downstream query engines (such as StarRocks, Trino, or Spark) cancel or 
abort a REST table scan early because of a client disconnect, timeout, or query 
limit, they run the cleanup sequence on the outer execution container. 
   
   In Apache Iceberg, `ScanTaskIterable.close()` was implemented as an empty 
no-op. Because this outer `close()` call did not propagate the shutdown 
signal to the underlying data structures:
   - The internal atomic `shutdown` flag remained `false`.
   - Background `PlanTaskWorker` threads continued running indefinitely.
   - Once the internal `taskQueue` reached its capacity limit of `1000` items, 
all active worker threads stayed blocked inside `offerWithTimeout()` 
indefinitely, leading to thread pool exhaustion on the engine coordinator 
side.
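
The failure mode above can be reproduced in isolation. The following is a minimal sketch, not Iceberg's actual code: the class `StuckWorkerSketch`, the method `noOpClose()`, the capacity of 4, and the 20 ms retry interval are all assumptions made for illustration. Only the names `taskQueue`, `shutdown`, and `offerWithTimeout` come from the description.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the pattern described above; not Iceberg's implementation.
class StuckWorkerSketch {
  // The description cites a capacity of 1000; a small value keeps the demo fast.
  static final int CAPACITY = 4;

  final BlockingQueue<Integer> taskQueue = new LinkedBlockingQueue<>(CAPACITY);
  final AtomicBoolean shutdown = new AtomicBoolean(false);

  // Retries the offer until it succeeds or the shutdown flag is observed.
  boolean offerWithTimeout(Integer task) throws InterruptedException {
    while (!shutdown.get()) {
      if (taskQueue.offer(task, 20, TimeUnit.MILLISECONDS)) {
        return true;
      }
    }
    return false;
  }

  // Mirrors the reported bug: close does nothing, so shutdown stays false.
  void noOpClose() {}

  Thread startWorker() {
    Thread worker = new Thread(() -> {
      try {
        int next = 0;
        while (offerWithTimeout(next)) {
          next++;
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    worker.setDaemon(true); // never keep the JVM alive because of the demo
    worker.start();
    return worker;
  }

  // Returns true if the worker is still alive after the no-op close.
  static boolean workerSurvivesNoOpClose() {
    try {
      StuckWorkerSketch sketch = new StuckWorkerSketch();
      Thread worker = sketch.startWorker();
      Thread.sleep(200); // the worker fills the queue; nothing consumes it
      sketch.noOpClose();
      worker.join(200); // give the worker a chance to exit (it will not)
      boolean alive = worker.isAlive();
      sketch.shutdown.set(true); // release the worker so the demo does not leak
      worker.join(1000);
      return alive;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }
}
```

Calling `StuckWorkerSketch.workerSurvivesNoOpClose()` shows the worker still retrying its offer after the consumer has gone away, because nothing ever flips the flag it polls.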
   
   ### Solution
   1. **Implemented State Tracking and Cleanup:** Added a thread-safe guard 
inside `ScanTaskIterable.close()` using 
`shutdown.compareAndSet(false, true)`.
   2. **Queue Eviction:** Updated the close block to explicitly clear the 
`taskQueue`, `planTasks`, and `initialFileScanTasks` collections on termination. 
This allows background threads stuck in an `offer` wait cycle to unblock, 
observe the flipped shutdown state, and exit gracefully.
   3. **Unified Iterator Lifecycle:** Refactored the inner 
`ScanTasksIterator.close()` to remove duplicated cleanup code; it now 
delegates to `ScanTaskIterable.this.close()`. This ensures that threads are 
terminated the same way from every entry point.
   4. **Regression Test Addition:** Added `TestScanTaskIterableLeak` under the 
`org.apache.iceberg.rest` test package. It verifies that the number of active 
planning threads drops back to `0` after premature termination.
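
The steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in this PR: `ClosableProducerSketch`, `TaskIterator`, `startWorkers`, the `activeWorkers` counter, and the timing values are hypothetical names and numbers chosen for the example, and only a single queue is cleared here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of an idempotent close() that releases blocked producers.
class ClosableProducerSketch implements AutoCloseable {
  private final BlockingQueue<Integer> taskQueue = new LinkedBlockingQueue<>(4);
  private final AtomicBoolean shutdown = new AtomicBoolean(false);
  private final AtomicInteger activeWorkers = new AtomicInteger(0);
  private final List<Thread> workers = new ArrayList<>();

  // Inner iterator stand-in: its close() delegates to the outer close().
  class TaskIterator implements AutoCloseable {
    @Override
    public void close() {
      ClosableProducerSketch.this.close();
    }
  }

  void startWorkers(int count) {
    for (int i = 0; i < count; i++) {
      activeWorkers.incrementAndGet();
      Thread worker = new Thread(this::produce);
      worker.setDaemon(true);
      workers.add(worker);
      worker.start();
    }
  }

  // Producer loop: retries a timed offer and re-checks the flag between attempts.
  private void produce() {
    try {
      int next = 0;
      while (!shutdown.get()) {
        if (taskQueue.offer(next, 20, TimeUnit.MILLISECONDS)) {
          next++;
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      activeWorkers.decrementAndGet();
    }
  }

  @Override
  public void close() {
    // Only the first caller wins the compare-and-set; later calls do nothing.
    if (shutdown.compareAndSet(false, true)) {
      taskQueue.clear();
    }
  }

  // Starts workers, closes early through the inner iterator, reports survivors.
  static int activeWorkersAfterClose(int workerCount) {
    try {
      ClosableProducerSketch sketch = new ClosableProducerSketch();
      sketch.startWorkers(workerCount);
      Thread.sleep(100); // let the workers fill the queue and start waiting
      sketch.new TaskIterator().close(); // premature termination via the iterator
      sketch.close(); // a second close is harmless
      for (Thread worker : sketch.workers) {
        worker.join(1000);
      }
      return sketch.activeWorkers.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }
}
```

A check in the spirit of the regression test would assert that `ClosableProducerSketch.activeWorkersAfterClose(3)` returns `0`. In this sketch the workers exit because they poll the flag between timed offers; clearing the queue additionally frees the buffered items.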
   
   ### Verification Testing
   
   ```bash
   # 1. Format the code with Spotless
   ./gradlew spotlessApply
   
   # 2. Compile the main and test sources of the modified module
   ./gradlew :iceberg-core:compileJava :iceberg-core:compileTestJava
   
   # 3. Verify that the core build passes and run the regression test
   ./gradlew :iceberg-core:test --tests "org.apache.iceberg.rest.TestScanTaskIterableLeak" --info
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

