One could argue that the generated query should be rewritten; instead of having 
a large OR, it should insert the 5000 elements in a table and use an IN check. 
I don't know if this is possible in your setting.

Mihai

________________________________
From: Viggo Chen (Jira) <[email protected]>
Sent: Wednesday, October 8, 2025 8:34 PM
To: [email protected] <[email protected]>
Subject: [jira] [Commented] (CALCITE-7213) StackOverflowError in 
SqlValidatorImpl with long OR/AND predicate chains


    [ 
https://issues.apache.org/jira/browse/CALCITE-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18028519#comment-18028519
 ]

Viggo Chen commented on CALCITE-7213:
-------------------------------------

[~julianhyde] I completely agree with the core of your argument: converting all 
recursive algorithms in Calcite to be iterative is a massive undertaking and 
not something to be taken lightly. As you rightly pointed out, this isn't an 
isolated problem in one specific method. Your prediction is spot on. When I 
reduced the number of `OR` conditions from 5000 to 2000, the initial 
`StackOverflowError` in `performUnconditionalRewrites` disappeared, but another 
one surfaced in a different part of the validation process, this time within 
`SqlValidatorImpl$Expander` and `SqlShuttle`.
New Stack Trace (with 2000 `OR` conditions):
{code:java}
java.lang.StackOverflowError
at 
org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.<init>(SqlShuttle.java:110)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visitScoped(SqlValidatorImpl.java:7257)
at 
org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:54)
    ... (and so on) {code}
I am not expecting a complete, immediate overhaul of all recursive logic. 
However, this brings us to a critical question from a user's perspective: How 
can Calcite users protect their production systems from crashing today? Do you 
have any suggestion?

> StackOverflowError in SqlValidatorImpl with long OR/AND predicate chains
> ------------------------------------------------------------------------
>
>                 Key: CALCITE-7213
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7213
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Viggo Chen
>            Priority: Major
>
> h1. Problem Description:
> Apache Calcite's SQL validation stage is vulnerable to 
> `java.lang.StackOverflowError` when processing queries that contain a large 
> number of chained logical operators (`AND`/`OR`). This occurs because the 
> parser generates a deeply nested, unbalanced Abstract Syntax Tree (AST) for 
> these predicates. Subsequent recursive traversal during the validation phase 
> exhausts the JVM's call stack, causing the process to fail.
> This issue prevents Calcite from handling complex, often machine-generated, 
> SQL queries, which are prevalent in modern data systems.
> Steps to Reproduce:
> The issue can be reliably reproduced with a test case that generates a long 
> chain of `OR` conditions. The following Java code creates a query with 5000 
> `OR` predicates, which is sufficient to trigger the stack overflow during the 
> validation step.
> {code:java}
> // Test case to generate the problematic SQL
> final String whereClause = java.util.stream.IntStream.range(0, 5000)
>     .mapToObj(i -> "sal = " + i)
>     .collect(java.util.stream.Collectors.joining(" OR "));
> // This will throw StackOverflowError during the validation phase
> sql("SELECT * FROM emp WHERE " + whereClause).ok(); {code}
> Generated SQL:
> {code:java}
> SELECT * FROM emp WHERE sal = 0 OR sal = 1 OR sal = 2 OR ... OR sal = 4999 
> {code}
> h1. Stack Trace Analysis:
> A typical stack trace points to a deep recursion within the 
> `SqlValidatorImpl` component, demonstrating that the failure happens during 
> the validation and semantic analysis of the expression tree.
> {code:java}
> java.lang.StackOverflowError
> at org.apache.calcite.sql.SqlBasicCall.getKind(SqlBasicCall.java:83)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1458)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1473)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1473)
> // ... repeats thousands of times {code}
> h1. Impact on System Stability:
> While a query with thousands of chained predicates may be considered 
> suboptimal, the framework's response should be a graceful error (e.g., a 
> validation exception), not a catastrophic failure. A 
> `java.lang.StackOverflowError` is a critical, non-recoverable error that can 
> terminate the executing thread and potentially bring down the entire service. 
> This transforms a query-level issue into a severe system stability and 
> reliability problem.
>
> h1. Suggested Solutions:
> I propose a two-part approach: an immediate defensive measure to improve 
> stability, and long-term solutions to fundamentally fix the issue.
> h2. Part 1: Immediate Defensive Measure (Mitigation)
> Introduce a configurable limit on the maximum depth of the AST. This acts as 
> a "safety fuse" to prevent stack overflows.
>  - Mechanism: After parsing, or at the very beginning of the validation 
> phase, perform a quick, non-recursive depth check of the `SqlNode` tree.
>  - Configuration: This depth limit should be configurable (e.g., via 
> `SqlParser.Config` or `SqlValidator.Config`). The default value should be set 
> to a safe but reasonable number (e.g., 1000).
> Action: If the AST depth exceeds the configured threshold, Calcite should 
> immediately throw a standard `SqlParseException` or `SqlValidatorException` 
> with a clear error message, such as "Query exceeds maximum expression depth 
> limit of [N]".
> This provides users with an immediate way to protect their systems from 
> problematic queries without waiting for a full architectural fix.
> h2. Part 2: Fundamental Fixes (Long-Term Solutions)
> To properly support these complex queries, the underlying cause of the deep 
> recursion must be addressed.
> 1. Adopt Iterative Traversal (Recommended): The most robust solution is to 
> refactor the recursive algorithms within `SqlValidatorImpl` and other core 
> components to use an explicit, heap-allocated stack (e.g., 
> `java.util.Deque`). This converts the recursive traversal into an iterative 
> one, which is not constrained by the call stack depth and can handle 
> expression trees of any complexity. This holistically solves this class of 
> problems.
> 2. Balanced Tree Construction: An alternative is to modify the parser's 
> behavior. Instead of generating a left-deep tree for chained operators like 
> `a OP b OP c OP d`, it could produce a balanced tree: `OP(OP(a, b), OP(c, 
> d))`. A balanced tree's depth grows logarithmically (`log2(N)`) with the 
> number of operators, which would naturally prevent stack overflows for even a 
> very large number of predicates. This could be a configurable parser feature.
> By implementing the defensive measure from Part 1, we can provide immediate 
> stability. Following up with one of the fundamental fixes from Part 2 will 
> ensure Calcite's long-term resilience and scalability.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to