[ https://issues.apache.org/jira/browse/SPARK-56826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18080206#comment-18080206 ]
Shekhar Prasad Rajak commented on SPARK-56826:
----------------------------------------------
I can work on this ticket.
> PushVariantIntoScan throws NPE / NoSuchElementException when invariants from
> upstream rules don't hold
> ------------------------------------------------------------------------------------------------------
>
> Key: SPARK-56826
> URL: https://issues.apache.org/jira/browse/SPARK-56826
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.1.1
> Reporter: Shekhar Prasad Rajak
> Priority: Major
> Fix For: 4.2.0
>
>
> org.apache.spark.sql.execution.datasources.PushVariantIntoScan
> (RequestedVariantField companion) makes two assumptions about its inputs that
> hold under the default optimizer pipeline but are not validated locally:
> 1. VariantGet.path.eval() is non-null (relied on by path.eval().toString)
> 2. VariantGet.timeZoneId and Cast.timeZoneId are Some(_) (relied on by .get)
> Logs:
> [P1] threw java.lang.NullPointerException:
> Cannot invoke "Object.toString()" because the return value of
> "org.apache.spark.sql.catalyst.expressions.Expression.eval(...)" is null
> [P2] threw java.util.NoSuchElementException: None.get
> [P6] threw java.util.NoSuchElementException: None.get
> Expected Behaviour
> RequestedVariantField.apply(VariantGet) and RequestedVariantField.apply(Cast)
> should either:
> • Return a sensible RequestedVariantField by handling missing inputs
> defensively (e.g. fall back to SQLConf.get.sessionLocalTimeZone for a missing
> time zone; throw IllegalStateException with a clear message for a null path), or
> • Be guarded at the call sites in collectRequestedFields / rewriteExpr.
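
The first option above could look roughly like the following sketch. It is deliberately independent of Spark: PathExpr, VariantGetLike, RequestedField and DefensiveRequestedField are hypothetical stand-ins rather than the real Catalyst classes, and the sessionLocalTimeZone constant only stands in for SQLConf.get.sessionLocalTimeZone. It illustrates the shape of the defensive handling, not the actual patch.

```scala
// Hypothetical stand-ins for the Catalyst expressions; these are NOT Spark's classes.
final case class PathExpr(value: Any) {
  // Mimics Expression.eval(), which may return null.
  def eval(): Any = value
}
final case class VariantGetLike(path: PathExpr, timeZoneId: Option[String])
final case class RequestedField(path: String, timeZoneId: String)

object DefensiveRequestedField {
  // Assumption: stands in for SQLConf.get.sessionLocalTimeZone.
  val sessionLocalTimeZone: String = "UTC"

  def fromVariantGet(v: VariantGetLike): RequestedField = {
    val pathValue = v.path.eval()
    // Fail with a descriptive message instead of an NPE from null.toString.
    if (pathValue == null) {
      throw new IllegalStateException(
        "VariantGet path evaluated to null; expected a non-null path")
    }
    // Fall back to the session time zone instead of calling Option.get on None.
    val tz = v.timeZoneId.getOrElse(sessionLocalTimeZone)
    RequestedField(pathValue.toString, tz)
  }
}
```

With this shape, a missing time zone resolves to the session default and a null path surfaces as an IllegalStateException that names the violated assumption, rather than as a bare NullPointerException or None.get.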