TheR1sing3un commented on code in PR #14161:
URL: https://github.com/apache/hudi/pull/14161#discussion_r2587242503


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedFileFormat.scala:
##########
@@ -202,6 +214,8 @@ class HoodieFileGroupReaderBasedFileFormat(tablePath: String,
     val requestedAvroSchema = AvroSchemaUtils.pruneDataSchema(avroTableSchema, AvroConversionUtils.convertStructTypeToAvroSchema(requestedSchema, sanitizedTableName), exclusionFields)
     val dataAvroSchema = AvroSchemaUtils.pruneDataSchema(avroTableSchema, AvroConversionUtils.convertStructTypeToAvroSchema(dataSchema, sanitizedTableName), exclusionFields)
 
+    spark.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", supportVectorizedRead.toString)

Review Comment:
   A friendly reminder:
   If the Hudi logic modifies this configuration in the Spark `sessionState`
conf, it may disrupt the read logic of other datasources, because that conf is
shared by the whole session.
   For example, suppose this configuration is initially set to true and a Spark
SQL query reads both a Hudi table and a table from another datasource, such as
a Hive table. The behavior we want is that Hudi decides for itself whether to
perform vectorized reads, while the Hive read stays vectorized.
   However, if we change this configuration here, the Hive read may no longer
be vectorized.
   A similar issue was encountered previously in `BaseRelation.java`.
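   To illustrate the concern, here is a minimal sketch using a toy model. `ToySessionConf` and every method in `VectorizedConfLeak` are hypothetical stand-ins, not Spark or Hudi classes; only the config key comes from the diff. The first method mimics overwriting the shared flag, and the second keeps the decision local to the Hudi read path:

```scala
import scala.collection.mutable

// Toy stand-in for a session-wide conf. This is NOT Spark's SQLConf.
final class ToySessionConf {
  private val settings = mutable.Map[String, String]()
  def set(key: String, value: String): Unit = settings(key) = value
  def get(key: String, default: String): String = settings.getOrElse(key, default)
}

object VectorizedConfLeak {
  val Key = "spark.sql.parquet.enableVectorizedReader"

  // Hypothetical non-Hudi reader that trusts the session-wide flag.
  def otherSourceUsesVectorized(conf: ToySessionConf): Boolean =
    conf.get(Key, "true").toBoolean

  // Mimics the pattern in the diff: overwrite the flag shared by the session.
  def hudiReadMutatingSession(conf: ToySessionConf, supportVectorizedRead: Boolean): Unit =
    conf.set(Key, supportVectorizedRead.toString)

  // Alternative: combine the user's setting with Hudi's own decision locally
  // and leave the shared conf untouched.
  def hudiReadLocalDecision(conf: ToySessionConf, supportVectorizedRead: Boolean): Boolean =
    conf.get(Key, "true").toBoolean && supportVectorizedRead

  def main(args: Array[String]): Unit = {
    val leaky = new ToySessionConf
    leaky.set(Key, "true")
    hudiReadMutatingSession(leaky, supportVectorizedRead = false)
    println(s"other source vectorized after mutation: ${otherSourceUsesVectorized(leaky)}")

    val isolated = new ToySessionConf
    isolated.set(Key, "true")
    val hudiVectorized = hudiReadLocalDecision(isolated, supportVectorizedRead = false)
    println(s"hudi vectorized: $hudiVectorized, " +
      s"other source vectorized: ${otherSourceUsesVectorized(isolated)}")
  }
}
```

   In this toy model, the mutating variant flips the flag for the other reader as well, while the local variant leaves it as the user set it. How the local decision would be passed to the underlying Parquet reader in the real code path is left open here.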
   cc @jonvex @yihua 


