yihua commented on code in PR #10304:
URL: https://github.com/apache/hudi/pull/10304#discussion_r1424802397
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala:
##########
@@ -235,15 +236,11 @@ object DefaultSource {
Option(schema)
}
- val useNewParquetFileFormat =
- parameters.getOrElse(
- USE_NEW_HUDI_PARQUET_FILE_FORMAT.key,
- USE_NEW_HUDI_PARQUET_FILE_FORMAT.defaultValue).toBoolean &&
- !metaClient.isMetadataTable &&
- (globPaths == null || globPaths.isEmpty) &&
- parameters.getOrElse(REALTIME_MERGE.key(), REALTIME_MERGE.defaultValue()).equalsIgnoreCase(REALTIME_PAYLOAD_COMBINE_OPT_VAL)
-
+ val useNewParquetFileFormat =
+ parameters.getOrDefault(HoodieReaderConfig.FILE_GROUP_READER_ENABLED.key(),
Review Comment:
Looks like `FILE_GROUP_READER_ENABLED` controls whether the
`HadoopFsRelation` with the new file format is used, which leaves it in a
kind of weird state. Should we keep
`hoodie.datasource.read.use.new.parquet.file.format` and, when it's turned
on, always use the new file group reader for Spark?
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala:
##########
@@ -86,15 +86,6 @@ object DataSourceReadOptions {
s"payload implementation to merge (${REALTIME_PAYLOAD_COMBINE_OPT_VAL})
or skip merging altogether" +
s"${REALTIME_SKIP_MERGE_OPT_VAL}")
- val USE_NEW_HUDI_PARQUET_FILE_FORMAT: ConfigProperty[String] = ConfigProperty
Review Comment:
Do we still have the config to fall back to existing relations for reading
Hudi tables in Spark?