rahil-c commented on code in PR #17904:
URL: https://github.com/apache/hudi/pull/17904#discussion_r2761067889


##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/SparkBasicSchemaEvolution.scala:
##########
@@ -20,32 +20,118 @@
 package org.apache.spark.sql.execution.datasources.parquet
 
 import org.apache.hudi.SparkAdapterSupport.sparkAdapter
-
+import org.apache.hudi.common.model.HoodieFileFormat
 import org.apache.spark.sql.HoodieSchemaUtils
 import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
-import org.apache.spark.sql.types.StructType
+import org.apache.spark.sql.execution.datasources.SparkSchemaTransformUtils
+import org.apache.spark.sql.types.{ArrayType, DataType, MapType, StructField, StructType}
 
 
 /**
- * Intended to be used just with HoodieSparkParquetReader to avoid any java/scala issues
+ * Generic schema evolution handler for different file formats.
+ * Supports Parquet (default), and Lance currently.

Review Comment:
   I have pushed a recent change that implements the chaining idea for
   separating out the null projection:
   https://github.com/apache/hudi/pull/17904/changes/5196402c7466be0f1dd7de66b6aa653cf44f2e09
   
   I synced with Tim, and the main feedback is to see whether we can improve
   this further by avoiding any case switches on the file format in
   `SparkBasicSchemaEvolution` and the other schema-related classes. Ideally,
   we should move any file-format-specific logic into the respective callers
   (in this case, our file format reader classes).
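   
   As a rough illustration of that direction, here is a minimal sketch. All
   names in it (`SimpleSchema`, `SchemaAdjuster`, `BasicSchemaEvolution`,
   `ParquetAdjuster`, `LanceAdjuster`) are hypothetical and not taken from this
   PR, and the per-format behaviour shown is an assumption made purely for the
   example. The point is only that the evolution class receives a hook from its
   caller instead of matching on the file format:

```scala
// Hypothetical sketch only: none of these names exist in the PR.

// Stand-in for a schema (column name -> type name); real code would use StructType.
final case class SimpleSchema(fields: Map[String, String])

// Format-specific behaviour is supplied by the caller (the reader),
// not selected through a switch on a file format enum.
trait SchemaAdjuster {
  def adjust(requested: SimpleSchema, file: SimpleSchema): SimpleSchema
}

// Format-agnostic: there is no `match` on Parquet/Lance anywhere in this class.
class BasicSchemaEvolution(adjuster: SchemaAdjuster) {
  def readSchema(requested: SimpleSchema, file: SimpleSchema): SimpleSchema =
    adjuster.adjust(requested, file)
}

// Each reader owns its adjuster. The behaviours below are assumptions
// chosen for illustration, not a description of the real readers.
object ParquetAdjuster extends SchemaAdjuster {
  // Assumed: pass the requested schema through unchanged.
  def adjust(requested: SimpleSchema, file: SimpleSchema): SimpleSchema = requested
}

object LanceAdjuster extends SchemaAdjuster {
  // Assumed: keep only the requested columns that are present in the file.
  def adjust(requested: SimpleSchema, file: SimpleSchema): SimpleSchema =
    SimpleSchema(requested.fields.filter { case (name, _) => file.fields.contains(name) })
}
```

   With this shape, a reader would construct something like
   `new BasicSchemaEvolution(LanceAdjuster)` itself, so supporting another
   format should not require touching the shared schema classes.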



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.