Re: [PR] [SPARK-57484][SQL] Preserve row nullability after ColumnarToRowExec [spark]

via GitHub Wed, 17 Jun 2026 09:33:35 -0700


ZitingShen commented on PR #56536:
URL: https://github.com/apache/spark/pull/56536#issuecomment-4732888021


   Discussed with @sunchao:
   After digging further, we don’t think this is the right change for OSS 
Spark. Making every ColumnarToRowExec output nullable changes a very broad 
execution contract on a hot path: downstream JVM row operators now need to 
treat the physical null bitmap as authoritative even when the optimized plan 
says nullable=false. That adds null-check cost after common scan boundaries, 
changes explain/canonical/state-schema behavior, and still does not provide 
true end-to-end support for nulls under a non-null logical contract because 
Catalyst has already optimized using nullable=false. If Spark wants to harden 
this case, it should be through a narrower invariant check or a fully designed 
end-to-end semantic change with optimizer, execution, tests, and performance 
validation, not this localized row-boundary rewrite.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [SPARK-57484][SQL] Preserve row nullability after ColumnarToRowExec [spark]

Reply via email to