This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
     new 03d50606617 [HUDI-7068] Disable vectorized reader for hoodie filegroup reader when schema isn't supported (#10043)
03d50606617 is described below

commit 03d50606617bc6819eab77b4f968e7d0dbee6159
Author: Jon Vexler <[email protected]>
AuthorDate: Fri Nov 10 01:47:19 2023 -0500

    [HUDI-7068] Disable vectorized reader for hoodie filegroup reader when schema isn't supported (#10043)
    
    Co-authored-by: Jonathan Vexler <=>
---
 .../parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala            | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala
index 01e08d7cb0f..cee66336ec9 100644
--- a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala
+++ b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala
@@ -89,6 +89,7 @@ class HoodieFileGroupReaderBasedParquetFileFormat(tableState: HoodieTableState,
                                               options: Map[String, String],
                                               hadoopConf: Configuration): PartitionedFile => Iterator[InternalRow] = {
     val outputSchema = StructType(requiredSchema.fields ++ partitionSchema.fields)
+    spark.conf.set("spark.sql.parquet.enableVectorizedReader", supportBatchResult)
     val requiredSchemaWithMandatory = generateRequiredSchemaWithMandatory(requiredSchema, dataSchema, partitionSchema)
     val requiredSchemaSplits = requiredSchemaWithMandatory.fields.partition(f => HoodieRecord.HOODIE_META_COLUMNS_WITH_OPERATION.contains(f.name))
     val requiredMeta = StructType(requiredSchemaSplits._1)

Reply via email to