[jira] [Commented] (SPARK-57484) ColumnarToRowExec can crash codegen when columnar data contains unexpected nulls

Ziting Shen (Jira) Wed, 17 Jun 2026 09:34:15 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-57484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089710#comment-18089710
 ]


Ziting Shen commented on SPARK-57484:
-------------------------------------

Discussed with [@sunchao|https://github.com/sunchao]:
After digging further, we don’t think this is the right change for OSS Spark. 
Making every ColumnarToRowExec output nullable changes a very broad execution 
contract on a hot path: downstream JVM row operators now need to treat the 
physical null bitmap as authoritative even when the optimized plan says 
nullable=false. That adds null-check cost after common scan boundaries, changes 
explain/canonical/state-schema behavior, and still does not provide true 
end-to-end support for nulls under a non-null logical contract because Catalyst 
has already optimized using nullable=false. If Spark wants to harden this case, 
it should be through a narrower invariant check or a fully designed end-to-end 
semantic change with optimizer, execution, tests, and performance validation, 
not this localized row-boundary rewrite.

> ColumnarToRowExec can crash codegen when columnar data contains unexpected 
> nulls
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-57484
>                 URL: https://issues.apache.org/jira/browse/SPARK-57484
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 4.0.0, 3.5.8
>            Reporter: Ziting Shen
>            Assignee: Ziting Shen
>            Priority: Major
>              Labels: pull-request-available
>
> ColumnarToRowExec may materialize null values from a columnar batch even when 
> the planned output attribute is non-nullable, so downstream row codegen can 
> skip null checks and fail with errors such as UTF8String.getBaseObject() NPEs.
> One observed failing plan had this shape:
> HashAggregate
>   HashAggregate
>     ...
>       ColumnarToRow
>         Scan parquet ...
> In that case the Parquet reader produced a physical null for a column whose 
> planned output attribute was non-nullable, and downstream generated row code 
> failed with a UTF8String.getBaseObject() NullPointerException.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (SPARK-57484) ColumnarToRowExec can crash codegen when columnar data contains unexpected nulls

Reply via email to