the-other-tim-brown commented on code in PR #17904:
URL: https://github.com/apache/hudi/pull/17904#discussion_r2759600288
##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/SparkBasicSchemaEvolution.scala:
##########
@@ -20,32 +20,118 @@
package org.apache.spark.sql.execution.datasources.parquet
import org.apache.hudi.SparkAdapterSupport.sparkAdapter
-
+import org.apache.hudi.common.model.HoodieFileFormat
import org.apache.spark.sql.HoodieSchemaUtils
import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
-import org.apache.spark.sql.types.StructType
+import org.apache.spark.sql.execution.datasources.SparkSchemaTransformUtils
+import org.apache.spark.sql.types.{ArrayType, DataType, MapType, StructField, StructType}
/**
- * Intended to be used just with HoodieSparkParquetReader to avoid any java/scala issues
+ * Generic schema evolution handler for different file formats.
+ * Supports Parquet (default), and Lance currently.
Review Comment:
I will repeat my earlier response here; let me know where it lacks
clarity. My preference is to keep these separate and make them composable, so
that future file formats can reuse the same components as needed. If that
is not possible with Spark, just let me know.
Is it possible to chain projections, or is that simply not possible?
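
To make "composable" concrete, here is a minimal sketch in plain Scala of what chaining could look like. It is an assumption-laden illustration, not Hudi or Spark code: `Row`, `Projection`, `padTo`, `widenIntToLong`, and `chain` are hypothetical names, and a `Vector[Any]` stands in for Spark's `InternalRow`. Whether Spark's `UnsafeProjection` can be composed the same way is exactly the open question above.

```scala
object ComposableProjections {
  // Hypothetical stand-in for a row; this is NOT Spark's InternalRow.
  type Row = Vector[Any]
  // A projection is just a function from one row shape to another.
  type Projection = Row => Row

  // Step 1: pad a row read with an older schema to the target width with nulls.
  def padTo(width: Int): Projection = row => row.padTo(width, null)

  // Step 2: widen Int values to Long at the given column positions.
  def widenIntToLong(columns: Set[Int]): Projection = row =>
    row.zipWithIndex.map {
      case (v: Int, i) if columns.contains(i) => v.toLong
      case (v, _)                             => v
    }

  // Compose any number of steps; they are applied left to right.
  def chain(steps: Projection*): Projection =
    steps.foldLeft((r: Row) => r)(_ andThen _)

  def main(args: Array[String]): Unit = {
    // Each file format could pick only the steps it needs.
    val evolve = chain(padTo(3), widenIntToLong(Set(0)))
    println(evolve(Vector(1, "a")))
  }
}
```

With this shape, a format-specific reader would assemble its own list of steps rather than one handler branching on the file format.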