LantaoJin opened a new pull request, #69:
URL: https://github.com/apache/datafusion-java/pull/69

   ## Which issue does this PR close?
   
   - Closes #68.
   
   ## Rationale for this change
   
   A long-running `DataFrame.collect(allocator)` or 
`DataFrame.executeStream(allocator)` call blocks the calling Java thread for 
the full duration of the query. `Thread.interrupt()` is a no-op (the thread is 
parked inside `runtime().block_on(...)`), and there is no way to free the 
in-flight query's native resources early. For any embedder running request 
timeouts, user-cancel actions, or node-shutdown drains, this is a hard 
operational gap.
   
   ## What changes are included in this PR?
   
   Cancellation lives on the session, not on the `DataFrame`. A token 
is allocated by the session, passed to `collect` / `executeStream`, and fired 
from any thread. We deliberately do *not* add `DataFrame.cancel()`: a 
`DataFrame` is a lazy plan that can be executed concurrently from multiple 
threads, so a per-DataFrame cancel verb has ambiguous semantics. The token is 
the primitive; a Spark-style `ctx.addTag(name)` / `ctx.cancelTag(name)` sugar 
layer can land in a follow-up.
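   
   To make the token flow concrete, here is a minimal, self-contained sketch. It does not use the PR's classes: `SketchToken`, `SketchQuery`, and `CancelDemo` are hypothetical stand-ins, and the simulated `collect` polls the token between units of work, whereas the real call blocks in native code. It only illustrates the shape described above: one token per execution, fired from a thread other than the one that is blocked.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Hypothetical stand-in for the PR's CancellationToken (the real one wraps a native handle).
final class SketchToken {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    // Safe to call from any thread, including one that does not own the query.
    void cancel() { cancelled.set(true); }

    boolean isCancelled() { return cancelled.get(); }
}

// Hypothetical stand-in for a blocking query execution.
final class SketchQuery {
    // Runs `steps` units of simulated work (about 1 ms each), checking the token between units.
    static int collect(SketchToken token, int steps) {
        int done = 0;
        for (int i = 0; i < steps; i++) {
            if (token.isCancelled()) {
                throw new CancellationException("cancelled after " + done + " steps");
            }
            LockSupport.parkNanos(1_000_000L); // simulate work without checked exceptions
            done++;
        }
        return done;
    }
}

final class CancelDemo {
    public static void main(String[] args) throws InterruptedException {
        SketchToken token = new SketchToken();
        Thread worker = new Thread(() -> {
            try {
                System.out.println("finished: " + SketchQuery.collect(token, 10_000));
            } catch (CancellationException e) {
                System.out.println("aborted: " + e.getMessage());
            }
        });
        worker.start();
        Thread.sleep(50);   // e.g. a request timeout elapsing
        token.cancel();     // fired from the main thread, not the worker
        worker.join();
    }
}
```

   Because the token belongs to one execution rather than to the plan, two threads running the same plan concurrently would each pass their own token, and cancelling one would leave the other untouched.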
   
   **Native shape:** `tokio_util::sync::CancellationToken` per token, owned by 
a process-global registry keyed by an opaque `u64` ID. JNI handlers look up by 
ID under a `Mutex<HashMap>` and clone the `Arc` out before any blocking work. 
The cloned `Arc` keeps the inner token alive for the borrow's lifetime, so an 
in-flight `collect()` future that already holds a clone keeps working through a 
concurrent close. `close()` removes the registry entry; the underlying token 
drops only when the last `Arc` clone goes away.
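   
   The native registry is Rust, but the lookup discipline can be modelled in Java for illustration. In this sketch a `synchronized` `HashMap` stands in for `Mutex<HashMap>`, an ordinary Java reference stands in for the cloned `Arc`, and an `AtomicBoolean` stands in for the inner token. `TokenRegistrySketch` and its method names are hypothetical and are not taken from the PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

// Java model of the registry discipline; every name here is hypothetical.
final class TokenRegistrySketch {
    private final Map<Long, AtomicBoolean> tokens = new HashMap<>();
    private long nextId = 1L;

    // Allocates a token and returns its opaque ID.
    synchronized long create() {
        long id = nextId++;
        tokens.put(id, new AtomicBoolean(false));
        return id;
    }

    // Looks the token up under the lock and returns a shared reference.
    // The lock is released on return, before the caller does any blocking work.
    synchronized AtomicBoolean lookup(long id) {
        AtomicBoolean token = tokens.get(id);
        if (token == null) {
            throw new IllegalStateException("unknown token id " + id);
        }
        return token;
    }

    // Removes the registry entry; references already handed out stay usable.
    synchronized void close(long id) {
        tokens.remove(id);
    }
}
```

   A caller that obtained a reference before `close(id)` can keep using it, while any later `lookup(id)` fails with `IllegalStateException` instead of touching freed state.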
   
   **Race safety:** `CancellationToken.nativeHandle` is a `final AtomicLong`. 
`cancel()` / `isCancelled()` read via `get()`; `close()` uses `getAndSet(0L)` 
so only the winning thread issues `closeToken`. Combined with the registry on 
the native side, a concurrent close + cancel + query-start race produces at 
worst a clean `IllegalStateException` from a missing-ID lookup -- never a 
use-after-free of a `Box`. The registry pattern is the same scaffolding 
upstream issue #40 calls for across all handle types; cancellation tokens get 
it first because they are designed to be fired from a thread that does not own 
them, so the race window is the widest of any handle.
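   
   The handle discipline can be sketched in isolation. `HandleSketch` is a hypothetical class, not the PR's `CancellationToken`: the JNI `closeToken` call is replaced by a counter so the single-winner property is observable, and throwing `IllegalStateException` from `cancel()` on a zeroed handle is an assumption of this sketch rather than documented behaviour.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the AtomicLong handle pattern described above.
final class HandleSketch implements AutoCloseable {
    private final AtomicLong nativeHandle;

    // Counts simulated native close calls; stands in for the JNI closeToken call.
    final AtomicInteger nativeCloseCalls = new AtomicInteger();

    HandleSketch(long handle) {
        this.nativeHandle = new AtomicLong(handle);
    }

    // Reads the handle with get(); a zero handle means the token was closed.
    void cancel() {
        long handle = nativeHandle.get();
        if (handle == 0L) {
            throw new IllegalStateException("token is closed"); // assumption of this sketch
        }
        // Real code would pass `handle` to a native cancel function here.
    }

    // getAndSet(0L) atomically claims the handle, so exactly one caller sees a non-zero value.
    @Override
    public void close() {
        long handle = nativeHandle.getAndSet(0L);
        if (handle != 0L) {
            nativeCloseCalls.incrementAndGet(); // only the winning thread gets here
        }
    }
}
```

   However many threads call `close()` concurrently, the simulated native close runs once; the losers observe `0L` and return without side effects.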
   
   ## Are these changes tested?
   
   Yes -- 19 new tests across `CancellationTokenTest` and 
`DataFrameCancellationTest`.
   
   ## Are there any user-facing changes?
   
   Yes -- purely additive. New `CancellationToken` class, one new method on 
`SessionContext`, two new overloads on `DataFrame`. No API removals, no 
deprecations, no behaviour change for existing callers. New direct Cargo 
dependency: `tokio-util = { version = "0.7", features = ["rt"] }`. It is already 
a transitive dependency, so no new crate enters the build; only a feature flag 
is added.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

