One could argue that the generated query should be rewritten; instead of having a large OR, it should insert the 5000 elements in a table and use an IN check. I don't know if this is possible in your setting.
Mihai ________________________________ From: Viggo Chen (Jira) <[email protected]> Sent: Wednesday, October 8, 2025 8:34 PM To: [email protected] <[email protected]> Subject: [jira] [Commented] (CALCITE-7213) StackOverflowError in SqlValidatorImpl with long OR/AND predicate chains [ https://issues.apache.org/jira/browse/CALCITE-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18028519#comment-18028519 ] Viggo Chen commented on CALCITE-7213: ------------------------------------- [~julianhyde] I completely agree with the core of your argument: converting all recursive algorithms in Calcite to be iterative is a massive undertaking and not something to be taken lightly. As you rightly pointed out, this isn't an isolated problem in one specific method. Your prediction is spot on. When I reduced the number of `OR` conditions from 5000 to 2000, the initial `StackOverflowError` in `performUnconditionalRewrites` disappeared, but another one surfaced in a different part of the validation process, this time within `SqlValidatorImpl$Expander` and `SqlShuttle`. New Stack Trace (with 2000 `OR` conditions): {code:java} java.lang.StackOverflowError at org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.<init>(SqlShuttle.java:110) at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visitScoped(SqlValidatorImpl.java:7257) at org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:54) ... (and so on) {code} I am not expecting a complete, immediate overhaul of all recursive logic. However, this brings us to a critical question from a user's perspective: How can Calcite users protect their production systems from crashing today? Do you have any suggestion? > StackOverflowError in SqlValidatorImpl with long OR/AND predicate chains > ------------------------------------------------------------------------ > > Key: CALCITE-7213 > URL: https://issues.apache.org/jira/browse/CALCITE-7213 > Project: Calcite > Issue Type: Bug > Reporter: Viggo Chen > Priority: Major > > h1. Problem Description: > Apache Calcite's SQL validation stage is vulnerable to > `java.lang.StackOverflowError` when processing queries that contain a large > number of chained logical operators (`AND`/`OR`). This occurs because the > parser generates a deeply nested, unbalanced Abstract Syntax Tree (AST) for > these predicates. Subsequent recursive traversal during the validation phase > exhausts the JVM's call stack, causing the process to fail. > This issue prevents Calcite from handling complex, often machine-generated, > SQL queries, which are prevalent in modern data systems. > Steps to Reproduce: > The issue can be reliably reproduced with a test case that generates a long > chain of `OR` conditions. The following Java code creates a query with 5000 > `OR` predicates, which is sufficient to trigger the stack overflow during the > validation step. > {code:java} > // Test case to generate the problematic SQL > final String whereClause = java.util.stream.IntStream.range(0, 5000) > .mapToObj(i -> "sal = " + i) > .collect(java.util.stream.Collectors.joining(" OR ")); > // This will throw StackOverflowError during the validation phase > sql("SELECT * FROM emp WHERE " + whereClause).ok(); {code} > Generated SQL: > {code:java} > SELECT * FROM emp WHERE sal = 0 OR sal = 1 OR sal = 2 OR ... OR sal = 4999 > {code} > h1. Stack Trace Analysis: > A typical stack trace points to a deep recursion within the > `SqlValidatorImpl` component, demonstrating that the failure happens during > the validation and semantic analysis of the expression tree. > {code:java} > java.lang.StackOverflowError > at org.apache.calcite.sql.SqlBasicCall.getKind(SqlBasicCall.java:83) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1458) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1473) > at > org.apache.calcite.sql.validate.SqlValidatorImpl.performUnconditionalRewrites(SqlValidatorImpl.java:1473) > // ... repeats thousands of times {code} > h1. Impact on System Stability: > While a query with thousands of chained predicates may be considered > suboptimal, the framework's response should be a graceful error (e.g., a > validation exception), not a catastrophic failure. A > `java.lang.StackOverflowError` is a critical, non-recoverable error that can > terminate the executing thread and potentially bring down the entire service. > This transforms a query-level issue into a severe system stability and > reliability problem. > > h1. Suggested Solutions: > I propose a two-part approach: an immediate defensive measure to improve > stability, and long-term solutions to fundamentally fix the issue. > h2. Part 1: Immediate Defensive Measure (Mitigation) > Introduce a configurable limit on the maximum depth of the AST. This acts as > a "safety fuse" to prevent stack overflows. > - Mechanism: After parsing, or at the very beginning of the validation > phase, perform a quick, non-recursive depth check of the `SqlNode` tree. > - Configuration: This depth limit should be configurable (e.g., via > `SqlParser.Config` or `SqlValidator.Config`). The default value should be set > to a safe but reasonable number (e.g., 1000). > Action: If the AST depth exceeds the configured threshold, Calcite should > immediately throw a standard `SqlParseException` or `SqlValidatorException` > with a clear error message, such as "Query exceeds maximum expression depth > limit of [N]". > This provides users with an immediate way to protect their systems from > problematic queries without waiting for a full architectural fix. > h2. Part 2: Fundamental Fixes (Long-Term Solutions) > To properly support these complex queries, the underlying cause of the deep > recursion must be addressed. > 1. Adopt Iterative Traversal (Recommended): The most robust solution is to > refactor the recursive algorithms within `SqlValidatorImpl` and other core > components to use an explicit, heap-allocated stack (e.g., > `java.util.Deque`). This converts the recursive traversal into an iterative > one, which is not constrained by the call stack depth and can handle > expression trees of any complexity. This holistically solves this class of > problems. > 2. Balanced Tree Construction: An alternative is to modify the parser's > behavior. Instead of generating a left-deep tree for chained operators like > `a OP b OP c OP d`, it could produce a balanced tree: `OP(OP(a, b), OP(c, > d))`. A balanced tree's depth grows logarithmically (`log2(N)`) with the > number of operators, which would naturally prevent stack overflows for even a > very large number of predicates. This could be a configurable parser feature. > By implementing the defensive measure from Part 1, we can provide immediate > stability. Following up with one of the fundamental fixes from Part 2 will > ensure Calcite's long-term resilience and scalability. -- This message was sent by Atlassian Jira (v8.20.10#820010)
