wecharyu commented on code in PR #9889:
URL: https://github.com/apache/hudi/pull/9889#discussion_r1371097371
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BaseFileOnlyRelation.scala:
##########
@@ -149,27 +152,10 @@ case class BaseFileOnlyRelation(override val sqlContext: SQLContext,
     val enableFileIndex = HoodieSparkConfUtils.getConfigValue(optParams, sparkSession.sessionState.conf,
       ENABLE_HOODIE_FILE_INDEX.key, ENABLE_HOODIE_FILE_INDEX.defaultValue.toString).toBoolean
     if (enableFileIndex && globPaths.isEmpty) {
-      // NOTE: There are currently 2 ways partition values could be fetched:
-      //    - Source columns (producing the values used for physical partitioning) will be read
-      //      from the data file
-      //    - Values parsed from the actual partition path would be appended to the final dataset
-      //
-      // In the former case, we don't need to provide the partition-schema to the relation,
-      //    therefore we simply stub it w/ empty schema and use full table-schema as the one being
-      //    read from the data file.
Review Comment:
Got your point. The change here is because `baseRelation` is converted to a
`HadoopFsRelation` only when `baseRelation.hasSchemaOnRead` is **false**:
https://github.com/apache/hudi/blob/65dd645b487a61fbca7e4e4b849d1f2f1ec143f9/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala#L328-L332
In that case `shouldExtractPartitionValuesFromPartitionPath` is true, so this
is just a code simplification.
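
To make the reasoning concrete, here is a minimal, hypothetical Scala sketch of the guard described above. `SimpleRelation` and `convertsToHadoopFsRelation` are stand-ins invented for illustration, not Hudi's real types or methods; the only point mirrored from the linked `DefaultSource` code is that the conversion is gated on `hasSchemaOnRead` being false:

```scala
// Hypothetical, simplified model of the conversion guard discussed above.
// Not Hudi's actual API; the names below are stand-ins for illustration only.
object ConversionSketch {
  // Carries only the two flags relevant to this discussion.
  final case class SimpleRelation(
      hasSchemaOnRead: Boolean,
      shouldExtractPartitionValuesFromPartitionPath: Boolean)

  // Mirrors the shape of the guard in DefaultSource: only relations without a
  // user-supplied read schema are eligible for the HadoopFsRelation conversion.
  def convertsToHadoopFsRelation(rel: SimpleRelation): Boolean =
    !rel.hasSchemaOnRead

  def main(args: Array[String]): Unit = {
    // The case the comment discusses: partition values are parsed from the
    // partition path, and there is no schema-on-read, so conversion applies.
    val rel = SimpleRelation(
      hasSchemaOnRead = false,
      shouldExtractPartitionValuesFromPartitionPath = true)
    assert(convertsToHadoopFsRelation(rel))
    println("conversion applies: " + convertsToHadoopFsRelation(rel))
  }
}
```

Under this model, stubbing the partition schema is safe because the converted relation always takes the extract-from-partition-path branch.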
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]