alexeykudinkin commented on a change in pull request #5168:
URL: https://github.com/apache/hudi/pull/5168#discussion_r837758932
##########
File path:
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala
##########
@@ -290,11 +290,8 @@ abstract class HoodieBaseRelation(val sqlContext: SQLContext,
     )
   }
 
-  private def imbueConfigs(sqlContext: SQLContext): Unit = {
-    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.filterPushdown", "true")
-    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.recordLevelFilter.enabled", "true")
-    // TODO(HUDI-3639) vectorized reader has to be disabled to make sure MORIncrementalRelation is working properly
-    sqlContext.sparkSession.sessionState.conf.setConfString("spark.sql.parquet.enableVectorizedReader", "false")
+  protected def imbueConfigs(sqlContext: SQLContext): Unit = {
+    // Nothing to do
Review comment:
Why are we removing record-level filtering and filter push-down?
I know they are not going to work with the vectorized reader, but the
vectorized reader is only applicable to primitive types, while filters can be
pushed down for any payload.
Review comment:
Also, I would rather keep all the settings here, so that they form an
explicit contract of what each relation requires.
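
To illustrate the "explicit contract" idea, here is a minimal sketch in which each relation lists the settings it relies on and a shared method applies them. All class and method names (`SketchRelation`, `requiredConfigs`, `ParquetConfKeys`, and so on) are hypothetical and not part of Hudi; only the three config keys come from the diff. The per-relation values are assumptions for illustration, except that disabling the vectorized reader for the incremental case mirrors the removed code and its HUDI-3639 TODO.

```scala
// Hypothetical sketch: these class and method names are illustrative, not Hudi's.
object ParquetConfKeys {
  // Keys copied from the lines removed in the diff.
  val FilterPushdown = "spark.sql.parquet.filterPushdown"
  val RecordLevelFilter = "spark.sql.parquet.recordLevelFilter.enabled"
  val VectorizedReader = "spark.sql.parquet.enableVectorizedReader"
}

abstract class SketchRelation {
  // Every setting the relation relies on is listed here, so the contract is
  // visible in one place.
  protected def requiredConfigs: Map[String, String]

  // `setConf` stands in for the session config setter; with Spark it could be
  // (k, v) => sqlContext.sparkSession.sessionState.conf.setConfString(k, v)
  def imbueConfigs(setConf: (String, String) => Unit): Unit =
    requiredConfigs.foreach { case (key, value) => setConf(key, value) }
}

// Assumed values: filters stay on; the vectorized reader is left untouched.
class SketchSnapshotRelation extends SketchRelation {
  override protected def requiredConfigs: Map[String, String] = Map(
    ParquetConfKeys.FilterPushdown -> "true",
    ParquetConfKeys.RecordLevelFilter -> "true"
  )
}

// Mirrors the removed code: vectorized reader off, per the HUDI-3639 TODO.
class SketchIncrementalRelation extends SketchRelation {
  override protected def requiredConfigs: Map[String, String] = Map(
    ParquetConfKeys.FilterPushdown -> "true",
    ParquetConfKeys.RecordLevelFilter -> "true",
    ParquetConfKeys.VectorizedReader -> "false"
  )
}
```

Passing the setter in as a function keeps the sketch independent of Spark; in the real relation the method would presumably keep taking `sqlContext`, as it does today.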