szehon-ho commented on code in PR #52347:
URL: https://github.com/apache/spark/pull/52347#discussion_r2471112688
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala:
##########
@@ -38,6 +39,17 @@ import org.apache.spark.sql.types.{ArrayType, DataType, DecimalType, IntegralTyp
object TableOutputResolver extends SQLConfHelper with Logging {
+ object DefaultValueFillMode extends Enumeration {
Review Comment:
Unfortunately, we need to distinguish between filling top-level defaults and
nested defaults within structs.
1. The normal V2Writes path does not expect nested type coercion of the
source. For example, writing into a target from a source dataframe that has a
struct column with fewer fields does not work today.
This goes through Analyzer.ResolveOutputRelation, which calls
resolveOutputColumns() => reorderColumnsByName() =>
resolveStruct/Array/MapType()
2. Row-level operations, in particular MERGE INTO, want recursive support to
coerce source struct columns that have fewer fields than the target column.
This goes through resolveUpdate => resolveStruct/Array/MapType()
Hence, we need a three-way enum here to distinguish the three cases
(none, first-level, recurse); see the sketch below.
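
For concreteness, a minimal, self-contained sketch of the three-way mode and
how the recursive resolution could consult it. The value names and the
`nestedMode` helper are illustrative assumptions, not the exact code in this
PR:

```scala
object DefaultValueFillModeSketch {
  // Three-way mode: NONE (no default filling), TOP_LEVEL (fill missing
  // top-level columns only, as in normal V2 writes), RECURSE (also fill
  // missing nested struct fields, as MERGE INTO needs).
  object DefaultValueFillMode extends Enumeration {
    type DefaultValueFillMode = Value
    val NONE, TOP_LEVEL, RECURSE = Value
  }
  import DefaultValueFillMode._

  // Hypothetical helper: which mode applies once resolution descends into a
  // nested struct/array/map. TOP_LEVEL degrades to NONE after the first
  // level; RECURSE keeps filling defaults all the way down.
  def nestedMode(mode: DefaultValueFillMode): DefaultValueFillMode = mode match {
    case RECURSE => RECURSE
    case _       => NONE
  }
}
```

The key point is that a first-level mode stops applying once resolution
descends into nested types, while the recursive mode survives the descent,
which is exactly the distinction between the V2 write path (case 1) and the
MERGE INTO path (case 2) above.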