schenksj opened a new pull request, #4531: URL: https://github.com/apache/datafusion-comet/pull/4531
## Which issue does this PR close?

Closes #4526.

## Rationale for this change

A left-deep chain of N associative boolean operands serializes to a proto nested N levels deep. With N greater than protobuf's default recursion limit (100), the message overflows when the serialized plan is re-parsed -- on the JVM via `OperatorOuterClass.Operator.parseFrom` (e.g. `findShuffleScanIndices` / explain) and in the Rust `prost` decoder -- so an otherwise-supported query fails.

Comet evaluates `And`/`Or` vectorially (both sides always evaluated, no row-level short-circuit), so the chains are fully associative and safe to rebalance.

This is a standalone fix; it was surfaced while working on the Delta Lake contrib integration (Delta data-skipping builds deep conjunctions), so prioritizing it helps that effort, but it applies to any wide boolean predicate.

## What changes are included in this PR?

- `QueryPlanSerde.flattenAssociative` flattens an associative `And`/`Or` chain into its leaf operands.
- `QueryPlanSerde.createBalancedBinaryExpr` rebuilds the operands as a balanced `O(log n)`-depth `BinaryExpr` tree.
- `CometAnd` / `CometOr` are routed through these instead of the left-deep `createBinaryExpr`.

The rebalancing is semantically identical -- it only changes the proto's shape.

## How are these changes tested?

New test in `CometExpressionSuite`: projects a 200-deep AND chain and a 200-deep OR chain (distinct literals; `>`/`<` so neither `CombineFilters` nor `OptimizeIn` collapses them) and asserts Comet executes them natively with correct results.

The test fails on `main` with `InvalidProtocolBufferException: Protocol message had too many levels of nesting` and passes with this change. Full `CometExpressionSuite` passes (124/0).
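The flatten-then-rebalance idea described in this PR can be illustrated with a small standalone sketch. It is written in Python rather than Scala, and it is not the PR's code: `Leaf`, `BinaryExpr`, `flatten_associative`, `build_balanced`, `depth`, and `left_deep` are hypothetical names chosen for illustration, and tree depth is used here only as a stand-in for proto nesting depth.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Leaf:
    """A non-And/Or operand, e.g. a comparison (hypothetical model)."""
    name: str


@dataclass(frozen=True)
class BinaryExpr:
    """A binary boolean node; `op` is "and" or "or" (hypothetical model)."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Leaf, BinaryExpr]


def flatten_associative(expr: Expr, op: str) -> List[Expr]:
    """Collect the operands of a chain of `op` in left-to-right order.

    Uses an explicit stack so a very deep chain cannot overflow the
    Python call stack.
    """
    operands: List[Expr] = []
    stack: List[Expr] = [expr]
    while stack:
        node = stack.pop()
        if isinstance(node, BinaryExpr) and node.op == op:
            # Push right first so the left operand is visited first.
            stack.append(node.right)
            stack.append(node.left)
        else:
            operands.append(node)
    return operands


def build_balanced(operands: List[Expr], op: str) -> Expr:
    """Rebuild the operands as a balanced tree of O(log n) depth."""
    if not operands:
        raise ValueError("at least one operand is required")
    if len(operands) == 1:
        return operands[0]
    mid = len(operands) // 2
    return BinaryExpr(
        op,
        build_balanced(operands[:mid], op),
        build_balanced(operands[mid:], op),
    )


def depth(expr: Expr) -> int:
    """Number of nodes on the longest root-to-leaf path (iterative)."""
    deepest = 0
    stack = [(expr, 1)]
    while stack:
        node, level = stack.pop()
        deepest = max(deepest, level)
        if isinstance(node, BinaryExpr):
            stack.append((node.left, level + 1))
            stack.append((node.right, level + 1))
    return deepest


def left_deep(n: int, op: str) -> Expr:
    """Build the left-deep chain ((c0 op c1) op c2) ... op c(n-1)."""
    expr: Expr = Leaf("c0")
    for i in range(1, n):
        expr = BinaryExpr(op, expr, Leaf(f"c{i}"))
    return expr


if __name__ == "__main__":
    chain = left_deep(200, "and")
    balanced = build_balanced(flatten_associative(chain, "and"), "and")
    # The operand order is unchanged; only the tree shape differs.
    print(depth(chain), depth(balanced))  # prints "200 9"
```

In this model a 200-operand chain drops from depth 200 to depth 9 while the left-to-right operand order is preserved. The rewrite is only valid under the condition the PR states: both sides are always evaluated, so regrouping cannot change the result.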
