[
https://issues.apache.org/jira/browse/SPARK-57484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089710#comment-18089710
]
Ziting Shen commented on SPARK-57484:
-------------------------------------
Discussed with [@sunchao|https://github.com/sunchao]:
After digging further, we don’t think this is the right change for OSS Spark.
Making every ColumnarToRowExec output nullable changes a very broad execution
contract on a hot path: downstream JVM row operators now need to treat the
physical null bitmap as authoritative even when the optimized plan says
nullable=false. That adds null-check cost after common scan boundaries, changes
explain/canonical/state-schema behavior, and still does not provide true
end-to-end support for nulls under a non-null logical contract because Catalyst
has already optimized using nullable=false. If Spark wants to harden this case,
it should do so through a narrower invariant check or a fully designed
end-to-end semantic change with optimizer, execution, tests, and performance
validation, not through this localized row-boundary rewrite.
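
For illustration only, here is a rough sketch of what a "narrower invariant check" could look like. It is a toy Python model, not Spark code: Field, NullabilityViolation, and check_batch are hypothetical names, and a real check would read the vector's null bitmap rather than scan Python lists. The idea is to fail fast with a clear message when a batch violates nullable=false, while leaving the declared schema untouched.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a planned output attribute."""
    name: str
    nullable: bool


class NullabilityViolation(Exception):
    """Raised when a batch holds a null in a column planned as non-nullable."""


def check_batch(schema: Sequence[Field],
                columns: Sequence[Sequence[Optional[object]]]) -> None:
    """Validate one columnar batch against the planned schema.

    Only columns declared nullable=False are scanned, so nullable columns
    add no cost and the planned nullability is never rewritten.
    """
    if len(schema) != len(columns):
        raise ValueError("schema and batch have different column counts")
    for field, values in zip(schema, columns):
        if field.nullable:
            continue  # nulls are allowed here, nothing to verify
        for row, value in enumerate(values):
            if value is None:
                raise NullabilityViolation(
                    f"column '{field.name}' is planned non-nullable "
                    f"but row {row} is null"
                )
```

In this model a violating batch surfaces as a descriptive error at the scan boundary instead of an NPE deep inside generated code, and downstream operators keep their existing fast path.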
> ColumnarToRowExec can crash codegen when columnar data contains unexpected
> nulls
> --------------------------------------------------------------------------------
>
> Key: SPARK-57484
> URL: https://issues.apache.org/jira/browse/SPARK-57484
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.0.0, 3.5.8
> Reporter: Ziting Shen
> Assignee: Ziting Shen
> Priority: Major
> Labels: pull-request-available
>
> ColumnarToRowExec may materialize null values from a columnar batch even when
> the planned output attribute is non-nullable, so downstream row codegen can
> skip null checks and fail with errors such as UTF8String.getBaseObject() NPEs.
> One observed failing plan had this shape:
> HashAggregate
> HashAggregate
> ...
> ColumnarToRow
> Scan parquet ...
> In that case the Parquet reader produced a physical null for a column whose
> planned output attribute was non-nullable, and downstream generated row code
> failed with a UTF8String.getBaseObject() NullPointerException.
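
The failure mode described above can be mimicked with a small toy model. This is not Spark's codegen: compile_length is a hypothetical helper, and Python's TypeError stands in for the NullPointerException. It only shows why code specialized for nullable=false breaks when a null arrives anyway.

```python
from typing import Callable, Optional


def compile_length(nullable: bool) -> Callable[[Optional[str]], Optional[int]]:
    """Toy 'codegen': return a function specialized on planned nullability."""
    if nullable:
        # Nullable plan: the generated code guards against null input.
        return lambda s: None if s is None else len(s)
    # Non-nullable plan: the null check is skipped as an optimization.
    return lambda s: len(s)


safe = compile_length(nullable=True)
fast = compile_length(nullable=False)

# The null-checking path tolerates a null and propagates it.
assert safe(None) is None

# The specialized path fails once the data contradicts the plan.
try:
    fast(None)
    crashed = False
except TypeError:
    crashed = True
assert crashed
```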