wombatu-kun commented on code in PR #12772:
URL: https://github.com/apache/hudi/pull/12772#discussion_r2084204716
##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/hudi/SparkAdapter.scala:
##########
@@ -250,4 +260,52 @@ trait SparkAdapter extends Serializable {
* @return
*/
def stopSparkContext(jssc: JavaSparkContext, exitCode: Int): Unit
+
+ def getSchema(conn: Connection,
+ resultSet: ResultSet,
+ dialect: JdbcDialect,
+ alwaysNullable: Boolean = false,
+ isTimestampNTZ: Boolean = false): StructType
+
+  /**
+   * [SPARK-46832] Using UTF8String compareTo directly would throw UnsupportedOperationException since Spark 4.0
+   */
+  def compareUTF8String(a: UTF8String, b: UTF8String): Int = a.compareTo(b)
+
+  /**
+   * [SPARK-46832] Using UTF8String compareTo directly would throw UnsupportedOperationException since Spark 4.0
+   * FlatLists is a static class and we cannot override any methods within to change the logic for comparison,
+   * so we have to create [[Spark4FlatLists]] for Spark 4.0+
+   */
+  def createComparableList(t: Array[AnyRef]): FlatLists.ComparableList[Comparable[HoodieRecord[_]]] = FlatLists.ofComparableArray(t)
+
+ def createInternalRow(commitTime: UTF8String,
+ commitSeqNumber: UTF8String,
+ recordKey: UTF8String,
+ partitionPath: UTF8String,
+ fileName: UTF8String,
+ sourceRow: InternalRow,
+ sourceContainsMetaFields: Boolean): HoodieInternalRow
+
+ def createInternalRow(metaFields: Array[UTF8String],
+ sourceRow: InternalRow,
+ sourceContainsMetaFields: Boolean): HoodieInternalRow
+
+  def createHoodiePartitionCDCFileGroupMapping(partitionValues: InternalRow,
+                                               fileSplits: List[HoodieCDCFileSplit]): HoodiePartitionCDCFileGroupMapping
+
+  def createHoodiePartitionFileSliceMapping(values: InternalRow,
+                                            slices: Map[String, FileSlice]): HoodiePartitionFileSliceMapping
+
+ def newParseException(command: Option[String],
+ exception: AnalysisException,
+ start: Origin,
+ stop: Origin): ParseException
+
+ def compareValues[T <% Comparable[T]](a: T, b: T): Int = a.compareTo(b)
+
+ def splitFiles(sparkSession: SparkSession,
+ partitionDirectory: PartitionDirectory,
+ isSplitable: Boolean,
+ maxSplitSize: Long): Seq[PartitionedFile]
Review Comment:
And what should we do with those tests?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]