Re: [PR] feat: Lance schema evolution (add column, type promotion) [hudi]

via GitHub Tue, 03 Feb 2026 07:17:27 -0800


the-other-tim-brown commented on code in PR #17904:
URL: https://github.com/apache/hudi/pull/17904#discussion_r2759600288



##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/SparkBasicSchemaEvolution.scala:
##########
@@ -20,32 +20,118 @@
 package org.apache.spark.sql.execution.datasources.parquet
 
 import org.apache.hudi.SparkAdapterSupport.sparkAdapter
-
+import org.apache.hudi.common.model.HoodieFileFormat
 import org.apache.spark.sql.HoodieSchemaUtils
 import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
-import org.apache.spark.sql.types.StructType
+import org.apache.spark.sql.execution.datasources.SparkSchemaTransformUtils
+import org.apache.spark.sql.types.{ArrayType, DataType, MapType, StructField, StructType}
 
 
 /**
- * Intended to be used just with HoodieSparkParquetReader to avoid any java/scala issues
+ * Generic schema evolution handler for different file formats.
+ * Supports Parquet (default), and Lance currently.

Review Comment:
   I'll repeat my earlier response here; let me know where it lacks clarity.
   My preference is to keep these separate and composable, so that future file
   formats can reuse the same components as needed. If that is not possible
   with Spark, just let me know.
   
   Is it possible to chain projections, or is that simply not supported?
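
To make the "separate and composable" shape concrete, here is a minimal sketch in plain Scala. It is an illustration only: it is not code from this PR, it does not use Spark's `UnsafeProjection`, and every name in it (`ComposableRowTransforms`, `RowTransform`, `padMissingColumns`, `promoteIntToLong`, `chain`) is hypothetical. Rows are modeled as `Vector[Any]` so the example runs without Spark on the classpath; whether Spark's own projections can be chained the same way is exactly the open question above.

```scala
// Hypothetical sketch: none of these names come from the PR or from Spark.
object ComposableRowTransforms {
  // A row is modeled as a plain vector of values to keep the sketch Spark-free.
  type Row = Vector[Any]
  type RowTransform = Row => Row

  // Shared step: pad columns that are missing from the file with nulls.
  def padMissingColumns(targetWidth: Int): RowTransform =
    row => row ++ Vector.fill(math.max(0, targetWidth - row.length))(null)

  // Shared step: promote Int values to Long at the given column positions.
  def promoteIntToLong(positions: Set[Int]): RowTransform =
    row => row.zipWithIndex.map {
      case (v: Int, i) if positions.contains(i) => v.toLong
      case (v, _)                               => v
    }

  // Chain any number of steps into one transform, applied left to right.
  // With no steps, the result is the identity transform.
  def chain(steps: RowTransform*): RowTransform =
    steps.foldLeft((r: Row) => r)(_ andThen _)

  def main(args: Array[String]): Unit = {
    // A format-specific reader would pick only the steps it needs.
    val evolve = chain(promoteIntToLong(Set(0)), padMissingColumns(3))
    println(evolve(Vector(1, "a")))
  }
}
```

Under this shape, each file format would assemble its own chain from the shared steps rather than branching on the format inside a single handler.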



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
