cloud-fan commented on code in PR #56238:
URL: https://github.com/apache/spark/pull/56238#discussion_r3334592405


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala:
##########
@@ -327,7 +327,25 @@ object TableOutputResolver extends SQLConfHelper with 
Logging {
             tableName, newColPath.quoted
           )
         }
-        Some(applyColumnMetadata(defaultExpr.get, expectedCol))
+        // Apply CHAR/VARCHAR length check to the default expression so that 
non-foldable

Review Comment:
   This runtime enforcement is only added to the by-name default-fill path. The 
by-position fill in `resolveColumnsByPosition` (around lines 461-470) is 
structurally identical — `getDefaultValueExprOrNullLit` + `applyColumnMetadata` 
with no `stringLengthCheck` — and is reached for a by-position INSERT under V2 
schema evolution (RECURSE; see `Analyzer.scala`'s `coerceInsertNestedTypes && 
schemaEvolutionEnabled` branch) when a trailing CHAR/VARCHAR column is omitted. 
So an oversized non-foldable default there would still be written without 
EXCEED_LIMIT_LENGTH — the same bypass this PR closes for by-name. I traced both 
sites and the matched-column analogues (`checkField`/`checkUpdate` both wrap 
with `stringLengthCheck`) but didn't run an end-to-end schema-evolution write — 
can you confirm the by-position path isn't enforced elsewhere? If not, I'd 
extract a small shared helper (e.g. `applyDefaultWithLengthCheck(expectedCol, 
conf)`) called from both default-fill sites, and add a t
 est covering the by-position/RECURSE path.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala:
##########
@@ -327,7 +327,25 @@ object TableOutputResolver extends SQLConfHelper with 
Logging {
             tableName, newColPath.quoted
           )
         }
-        Some(applyColumnMetadata(defaultExpr.get, expectedCol))
+        // Apply CHAR/VARCHAR length check to the default expression so that 
non-foldable
+        // defaults (e.g. current_user()) that exceed the column length are 
caught at runtime.
+        // Uses getRawType so it works for both V1 and V2 tables.
+        val checkedExpr = defaultExpr.map { expr =>
+          val rawType = CharVarcharUtils.getRawType(expectedCol.metadata)
+            .getOrElse(expectedCol.dataType)
+          if (!conf.charVarcharAsString && 
CharVarcharUtils.hasCharVarchar(rawType)) {
+            expr match {
+              case a: Alias =>
+                a.copy(child = CharVarcharUtils.stringLengthCheck(a.child, 
rawType))(

Review Comment:
   This branch carefully preserves 
`exprId`/`qualifier`/`explicitMetadata`/`nonInheritableMetadataKeys`, but 
`applyColumnMetadata` on the next line strips the Alias down to its child and 
re-wraps it in a fresh `Alias(stripAlias, name)(explicitMetadata = 
Some(requiredMetadata))`, so all of that is discarded. The `case other` branch 
is also dead — `getDefaultValueExprOrNullLit` always returns an `Alias`. The 
whole block can collapse to wrapping the default's child with 
`stringLengthCheck`, which folds naturally into the shared helper suggested 
above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to