TheR1sing3un commented on code in PR #13127:
URL: https://github.com/apache/hudi/pull/13127#discussion_r2103715541
##########
hudi-spark-datasource/hudi-spark3.3.x/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/Spark33ParquetReader.scala:
##########
@@ -252,6 +252,17 @@ object Spark33ParquetReader extends SparkParquetReaderBuilder {
sqlConf.getConfString("spark.sql.legacy.parquet.nanosAsLong", "false").toBoolean
)
+ // Should always be set by FileSourceScanExec creating this.
+ // Check conf before checking option, to allow working around an issue by changing conf.
+ val returningBatch = sqlConf.parquetVectorizedReaderEnabled &&
Review Comment:
> should we fix the other version of parquet readers?
Fixing only Spark 3.3 is fine: the relevant Spark changes were integrated in versions after 3.3, so only 3.3 needs this compatibility fix. See
https://github.com/apache/spark/pull/38397
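
The precedence the diff comment describes (conf first, then the per-scan option) can be illustrated with a minimal, self-contained sketch. This is not the actual Hudi or Spark code; `vectorizedEnabled` stands in for `sqlConf.parquetVectorizedReaderEnabled`, and `batchOption` stands in for the option that `FileSourceScanExec` is expected to set on the reader:

```scala
object ReturningBatchSketch {
  // Hypothetical stand-ins: `vectorizedEnabled` mirrors
  // sqlConf.parquetVectorizedReaderEnabled; `batchOption` mirrors the
  // reader option set by FileSourceScanExec.
  def returningBatch(vectorizedEnabled: Boolean, batchOption: Option[String]): Boolean = {
    // Conf is consulted first, so disabling the vectorized reader in the
    // session conf forces row-based output even if the option asks for
    // batches -- this is the workaround path the comment mentions.
    vectorizedEnabled && batchOption.exists(_.toBoolean)
  }
}
```

For example, `returningBatch(false, Some("true"))` yields `false`: the conf override wins over the option.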