yihua commented on code in PR #8885:
URL: https://github.com/apache/hudi/pull/8885#discussion_r1224611690
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BaseFileOnlyRelation.scala:
##########
@@ -66,17 +66,21 @@ case class BaseFileOnlyRelation(override val sqlContext: SQLContext,
// NOTE: This override has to mirror semantic of whenever this Relation is converted into [[HadoopFsRelation]],
// which is currently done for all cases, except when Schema Evolution is enabled
override protected val shouldExtractPartitionValuesFromPartitionPath: Boolean =
- internalSchemaOpt.isEmpty
+ internalSchemaOpt.isEmpty
override lazy val mandatoryFields: Seq[String] = Seq.empty
+ // Before Spark 3.4.0: PartitioningAwareFileIndex.BASE_PATH_PARAM
+ // Since Spark 3.4.0: FileIndexOptions.BASE_PATH_PARAM
+ val BASE_PATH_PARAM = "basePath"
+
override def updatePrunedDataSchema(prunedSchema: StructType): Relation =
this.copy(prunedDataSchema = Some(prunedSchema))
override def imbueConfigs(sqlContext: SQLContext): Unit = {
super.imbueConfigs(sqlContext)
// TODO Issue with setting this to true in spark 332
- if (!HoodieSparkUtils.gteqSpark3_3_2) {
+ if (HoodieSparkUtils.gteqSpark3_4 || !HoodieSparkUtils.gteqSpark3_3_2) {
Review Comment:
```
Merge Hudi to Hudi *** FAILED ***
2023-06-06T23:38:24.7660935Z org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3194.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3194.0 (TID 3768) (fv-az1128-658 executor driver): java.lang.ClassCastException: org.apache.spark.sql.vectorized.ColumnarBatchRow cannot be cast to org.apache.spark.sql.vectorized.ColumnarBatch
2023-06-06T23:38:24.7662056Z at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.next(DataSourceScanExec.scala:560)
2023-06-06T23:38:24.7662628Z at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.next(DataSourceScanExec.scala:549)
2023-06-06T23:38:24.7663391Z at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
```