mdashti opened a new issue, #23246:
URL: https://github.com/apache/datafusion/issues/23246

   # Describe the bug
   
   `datafusion-sql` raises `recursive`'s minimum stack size (the "red zone") 
via `StackGuard` (`datafusion/sql/src/stack.rs`) at two production sites: 
`query_to_plan` (`datafusion/sql/src/query.rs`) and the public `expr_to_sql` 
(`datafusion/sql/src/unparser/expr.rs`). Both call `StackGuard::new(256 * 
1024)`, which sets the minimum on `new()` and restores the previous value on 
`drop()`.
   
   That minimum is a **process-global** (`recursive`'s `MINIMUM_STACK_SIZE`, 
default 128 KiB). The save/restore is therefore not thread-safe. When two 
threads plan or unparse deep expressions at the same time, a guard that 
captured the 128 KiB default and drops while another thread is mid-recursion 
restores 128 KiB globally. The in-flight recursion then observes a red zone 
smaller than its per-level stack cost, `stacker` does not grow in time, and the 
OS stack overflows (the process aborts with SIGABRT).
   
   The interleaving:
   
   1. global = 128 KiB (default, no guard held)
   2. thread B: `StackGuard::new(256K)` captures `previous = 128K`, sets `256K`
   3. thread A: `StackGuard::new(256K)` captures `previous = 256K`, sets 
`256K`, begins a deep recursion that relies on `256K`
   4. thread B drops: restores `previous = 128K`
   5. thread A's next `recursive` checkpoint reads `128K`; a single heavy hop 
(e.g. the unparser's scalar-function-override path) does not fit, and the stack 
overflows
   
   Both guard sites use `256 * 1024`, so any pair of concurrent deep 
plan/unparse calls can hit this.
   
   ## To Reproduce
   
   On `x86_64` in a debug / `ci` profile (larger frames), with 
`recursive_protection` enabled (on by default via the `datafusion` crate). The 
red-zone value is load-bearing, which is easy to confirm deterministically by 
pinning it to the value a stale restore leaves:
   
   ```rust
   // `recursive::set_minimum_stack_size` = the global the `StackGuard` mutates.
   recursive::set_minimum_stack_size(128 * 1024); // what a concurrent 
StackGuard drop restores
   
   std::thread::Builder::new()
       .stack_size(2 * 1024 * 1024)
       .spawn(|| {
           // ~2000-deep array_has(...) chain, unparsed through the recursion 
entry that
           // does not re-install a StackGuard 
(Unparser::expr_to_sql_with_nesting).
           // Overflows at 128 KiB; with `256 * 1024` it returns cleanly.
       })
       .unwrap()
       .join()
       .unwrap();
   ```
   
   In production the 128 KiB state in step 1 is not reached by an explicit 
call: it is what a concurrent `StackGuard` on another thread restores on drop 
(the interleaving above) while this recursion is in flight. So a realistic 
trigger is two threads, one repeatedly unparsing/planning shallow expressions 
(churning the guard) and one unparsing/planning a deep expression, on `x86_64`.
   
   ## Expected behavior
   
   Concurrent deep planning/unparsing should not overflow the stack. A red zone 
that one thread is relying on should not be lowerable by another thread's 
unrelated `StackGuard` drop.
   
   ## Additional context
   
   - Surfaced as a flaky `cargo test hash collisions (amd64)` failure in 
#23198: the deep-unparse regression test overflowed because sibling unparser 
tests in the same process churned the global red zone as their guards dropped. 
That PR isolates the test in its own binary so the suite is stable, but it does 
not fix the underlying race.
   - Related: #23056 (the unparser overflow `StackGuard` was added for) and 
#13310 (the planner `StackGuard`).
   - `recursive`'s `MINIMUM_STACK_SIZE` / `STACK_ALLOC_SIZE` are global 
atomics, so any scoped, restore-on-drop mutation of them is inherently racy 
across threads.
   - A clean fix likely means not scope-mutating the global at all (for example 
wrapping the deep recursion in an explicit `stacker::maybe_grow` with a local 
red zone), since the alternative of raising the global and never restoring it 
leaks the larger red zone into every other `recursive` user in the process.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to