EpsilonPrime opened a new pull request, #11278: URL: https://github.com/apache/incubator-gluten/pull/11278
Summary

This refactors how output schema conformance (nullability and type casting) is enforced for the ClickHouse backend.

Previously, an output_schema field was added to the RelRoot proto message, and the C++ parser would implicitly add a final projection step if types didn't match. This approach was non-standard and made the type conversion invisible in the plan.

Now, when the expected output schema differs from the child plan's output (e.g., in union operations), an explicit ProjectRel with cast expressions is added to the Substrait plan on the Spark side. This makes the type enforcement visible in the plan and follows standard Substrait conventions.

Changes

- Add createOutputCastProjectRel() in WholeStageTransformer to generate a ProjectRel with casts when needed
- Remove output_schema field from RelRoot proto message
- Remove outputSchema parameter from PlanBuilder and PlanNode
- Remove implicit type conversion logic from ClickHouse's SerializedPlanParser::adjustOutput()
- Remove needOutputSchemaForPlan() from BackendSettingsApi and CHBackend

Benefits

- Explicit over implicit: Type conversions are visible as a ProjectRel in the plan
- Standard Substrait: No longer using a Gluten-specific extension to RelRoot
- Simpler native code: ClickHouse parses the plan without special post-processing
- Better debugging: The plan clearly shows where casts occur

Test Plan

- Verify ClickHouse union tests pass (issue-1874 regression tests)
- Verify nullable column handling in union operations
- Run ClickHouse TPCH test suite
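
The decision of whether an output projection is needed can be sketched roughly as follows. This is a minimal Python model, not the Scala code in WholeStageTransformer: the Field, Cast, and FieldRef classes and the create_output_cast_project function are hypothetical stand-ins for the Substrait nodes, and casting only the mismatched columns (passing the rest through as plain references) is an assumption that the PR description does not confirm.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Field:
    """One output column: name, type name, and nullability flag."""
    name: str
    dtype: str
    nullable: bool


@dataclass(frozen=True)
class FieldRef:
    """Hypothetical stand-in for a plain field reference (no conversion)."""
    input_index: int


@dataclass(frozen=True)
class Cast:
    """Hypothetical stand-in for a cast expression over an input column."""
    input_index: int
    target: Field


def _conforms(child: Field, expected: Field) -> bool:
    # A column conforms when both the type and the nullability agree.
    return child.dtype == expected.dtype and child.nullable == expected.nullable


def create_output_cast_project(
    child: List[Field], expected: List[Field]
) -> Optional[List[Union[FieldRef, Cast]]]:
    """Return projection expressions, or None when no projection is needed."""
    if len(child) != len(expected):
        raise ValueError("child and expected schemas differ in column count")
    if all(_conforms(c, e) for c, e in zip(child, expected)):
        return None  # child output already matches; add nothing to the plan
    # Assumption: only mismatched columns get a cast, others pass through.
    return [
        FieldRef(i) if _conforms(c, e) else Cast(i, e)
        for i, (c, e) in enumerate(zip(child, expected))
    ]


# Usage: a union branch produces a non-nullable i32 where the union expects
# a nullable i64, while the second column already conforms.
child_schema = [Field("a", "i32", False), Field("b", "string", True)]
expected_schema = [Field("a", "i64", True), Field("b", "string", True)]
project_exprs = create_output_cast_project(child_schema, expected_schema)
```

Returning None for the conforming case mirrors the "when needed" wording in the change list: the plan gains a ProjectRel only where the schemas actually differ, so the cast is visible exactly where it occurs.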
