cloud-fan commented on code in PR #56238:
URL: https://github.com/apache/spark/pull/56238#discussion_r3334592405
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala:
##########
@@ -327,7 +327,25 @@ object TableOutputResolver extends SQLConfHelper with
Logging {
tableName, newColPath.quoted
)
}
- Some(applyColumnMetadata(defaultExpr.get, expectedCol))
+ // Apply CHAR/VARCHAR length check to the default expression so that
non-foldable
Review Comment:
This runtime enforcement is only added to the by-name default-fill path. The
by-position fill in `resolveColumnsByPosition` (around lines 461-470) is
structurally identical — `getDefaultValueExprOrNullLit` + `applyColumnMetadata`
with no `stringLengthCheck` — and is reached for a by-position INSERT under V2
schema evolution (RECURSE; see `Analyzer.scala`'s `coerceInsertNestedTypes &&
schemaEvolutionEnabled` branch) when a trailing CHAR/VARCHAR column is omitted.
So an oversized non-foldable default there would still be written without
EXCEED_LIMIT_LENGTH — the same bypass this PR closes for by-name. I traced both
sites and the matched-column analogues (`checkField`/`checkUpdate` both wrap
with `stringLengthCheck`) but didn't run an end-to-end schema-evolution write —
can you confirm the by-position path isn't enforced elsewhere? If not, I'd
extract a small shared helper (e.g. `applyDefaultWithLengthCheck(expectedCol,
conf)`) called from both default-fill sites, and add a t
est covering the by-position/RECURSE path.
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala:
##########
@@ -327,7 +327,25 @@ object TableOutputResolver extends SQLConfHelper with
Logging {
tableName, newColPath.quoted
)
}
- Some(applyColumnMetadata(defaultExpr.get, expectedCol))
+ // Apply CHAR/VARCHAR length check to the default expression so that
non-foldable
+ // defaults (e.g. current_user()) that exceed the column length are
caught at runtime.
+ // Uses getRawType so it works for both V1 and V2 tables.
+ val checkedExpr = defaultExpr.map { expr =>
+ val rawType = CharVarcharUtils.getRawType(expectedCol.metadata)
+ .getOrElse(expectedCol.dataType)
+ if (!conf.charVarcharAsString &&
CharVarcharUtils.hasCharVarchar(rawType)) {
+ expr match {
+ case a: Alias =>
+ a.copy(child = CharVarcharUtils.stringLengthCheck(a.child,
rawType))(
Review Comment:
This branch carefully preserves
`exprId`/`qualifier`/`explicitMetadata`/`nonInheritableMetadataKeys`, but
`applyColumnMetadata` on the next line strips the Alias down to its child and
re-wraps it in a fresh `Alias(stripAlias, name)(explicitMetadata =
Some(requiredMetadata))`, so all of that is discarded. The `case other` branch
is also dead — `getDefaultValueExprOrNullLit` always returns an `Alias`. The
whole block can collapse to wrapping the default's child with
`stringLengthCheck`, which folds naturally into the shared helper suggested
above.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]