kosiew commented on code in PR #23058:
URL: https://github.com/apache/datafusion/pull/23058#discussion_r3451348221
##########
datafusion/sql/src/unparser/expr.rs:
##########
@@ -3278,6 +3302,61 @@ mod tests {
Ok(())
}
+ /// Regression test for https://github.com/apache/datafusion/issues/23056
+ ///
+ /// Deeply-nested expressions whose unparse path routes through scalar
+ /// function arguments and dialect scalar-function overrides used to
+ /// overflow the OS stack even with `recursive_protection` enabled,
+ /// because the per-level stack cost of those paths exceeds the default
+ /// `recursive` red zone and the unparser installed no [`StackGuard`].
+ ///
+ /// This test only asserts the protected behavior, so it is gated on the
+ /// `recursive_protection` feature. Without that feature the unparser is
+ /// not stack-safe by design and a deep enough expression will overflow.
+ #[cfg(feature = "recursive_protection")]
+ #[test]
+ fn test_deeply_nested_expr_does_not_overflow_stack() {
+ // Far deeper than the ~60 levels that overflow without protection, but
+ // bounded so the trampoline's heap stacks stay reasonable in debug.
+ const DEPTH: usize = 2_000;
+
+ // Run on an explicit, realistically-sized thread stack. The work is
+ // performed on a spawned thread so an overflow (in the unfixed code)
+ // aborts the process and fails the test deterministically rather than
+ // depending on the harness thread's stack size.
+ let handle = std::thread::Builder::new()
+ .stack_size(2 * 1024 * 1024)
+ .spawn(|| {
+ // 1. Linear chain through a dialect scalar-function override:
+ // array_has(array_has(... array_has(col, 'x') ...), 'x').
+ // PostgreSqlDialect unparses array_has via
array_has_to_sql_any,
+ // which recurses back into the unparser for each argument.
+ let mut nested_fn: Expr = col("c");
+ for _ in 0..DEPTH {
+ nested_fn = array_has(nested_fn, lit("x"));
+ }
+ let pg = PostgreSqlDialect {};
+ Unparser::new(&pg)
+ .expr_to_sql(&nested_fn)
+ .expect("deeply nested scalar function should unparse");
+
+ // 2. Linear chain of plain binary operators, exercising the
+ // inner -> inner recursion on the default dialect.
+ let mut nested_binary: Expr = col("c");
+ for _ in 0..DEPTH {
+ nested_binary = nested_binary + lit(1);
+ }
+ Unparser::default()
Review Comment:
Nice regression test. It covers the default unparse path well, but it does
not exercise the pretty-print path that now depends on
`remove_unnecessary_nesting` being recursive.
Would it make sense to also unparse the deeply nested binary expression with
`Unparser::default().with_pretty(true)`? That would help lock down the second
recursion site that this PR fixes and make future regressions a little easier
to catch.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]