mikhailnik-db commented on code in PR #55986:
URL: https://github.com/apache/spark/pull/55986#discussion_r3275466166
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveWithCTE.scala:
##########
@@ -284,10 +284,19 @@ object ResolveWithCTE extends Rule[LogicalPlan] {
// This is a non-recursive reference to a definition.
case ref: CTERelationRef if !ref.resolved =>
cteDefMap.get(ref.cteId).map { cteDef =>
- // cteDef is certainly resolved, otherwise it would not have been in the map.
- CTERelationRef(
- cteDef.id, cteDef.resolved, cteDef.output, cteDef.isStreaming, maxRows = cteDef.maxRows,
+ // Wait for SQLFunctionExpression placeholders inside the CTE body to be inlined by
+ // ResolveSQLFunctions before snapshotting the schema. The placeholder hard-codes
+ // nullable = true, so capturing cteDef.output while one is still present would freeze
+ // incorrect nullability into CTERelationRef.output.
+ if (cteDef.containsPattern(SQL_FUNCTION_EXPRESSION)) {
Review Comment:
The plan is to eagerly resolve all SQL UDFs so this won't be an issue in the
single-pass analyzer. The nullability for attributes will be derived from the
resolved `SQLScalarFunction`.
In other words, with this fix, the behavior will be at parity with the
single-pass analyzer.
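
To illustrate the ordering concern from the diff comment, here is a minimal toy sketch. It does not use Spark's Catalyst classes: `Expr`, `Placeholder`, `Inlined`, `CteDef`, and `SnapshotSketch` are hypothetical stand-ins, and the only behavior assumed from the PR is that the placeholder reports `nullable = true` until it is replaced by the resolved function.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's Catalyst classes.
sealed trait Expr { def nullable: Boolean }

// Stand-in for an unresolved SQL function placeholder: nullability is hard-coded.
case class Placeholder(name: String) extends Expr { val nullable: Boolean = true }

// Stand-in for the resolved function, which carries its real nullability.
case class Inlined(name: String, nullable: Boolean) extends Expr

case class CteDef(body: Seq[Expr]) {
  def containsPlaceholder: Boolean = body.exists(_.isInstanceOf[Placeholder])
  def outputNullability: Seq[Boolean] = body.map(_.nullable)
}

object SnapshotSketch {
  // Snapshot the schema only once no placeholder remains; otherwise defer.
  def snapshot(cte: CteDef): Option[Seq[Boolean]] =
    if (cte.containsPlaceholder) None else Some(cte.outputNullability)

  def main(args: Array[String]): Unit = {
    val early = CteDef(Seq(Placeholder("f")))
    val late = CteDef(Seq(Inlined("f", nullable = false)))
    // An eager snapshot of `early` would freeze nullable = true.
    assert(early.outputNullability == Seq(true))
    // Deferring until the placeholder is gone captures the real nullability.
    assert(snapshot(early).isEmpty)
    assert(snapshot(late).contains(Seq(false)))
  }
}
```

In this model, resolving every function eagerly means `containsPlaceholder` is never true at snapshot time, which is why the guard would be unnecessary under that approach.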
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]