voonhous commented on PR #9625: URL: https://github.com/apache/hudi/pull/9625#issuecomment-1720387188
@yihua I looked through the CI failures. They appear to be errors raised when invoking the RowWriter implementation during clustering.

```log
[ERROR] Errors:
[ERROR] TestHoodieBackedMetadata.testClusterOperationOnMainTable()(TestHoodieBackedMetadata)
[ERROR]   Run 1: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
[ERROR]   Run 2: java.util.concurrent.CancellationException
[ERROR]   Run 3: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
[ERROR]   Run 4: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
[INFO]
[ERROR] TestHoodieBackedMetadata.testMDTCompactionWithFailedCommits()(TestHoodieBackedMetadata)
[ERROR]   Run 1: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
[ERROR]   Run 2: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
[ERROR]   Run 3: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
[ERROR]   Run 4: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
```

Prior to this change, the existing tests used the RDD implementation. Because of the mismatch in configs, the RowWriter implementation was not actually exercised by any of the tests that invoke clustering.

Since this is a "[MINOR]" PR fix, I will add configs to the affected tests to ensure that they use the RDD implementation. We can create another PR to increase the coverage of the clustering writers after this.
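
A minimal sketch of what such a test-side override could look like, not the actual change in this PR. It assumes the row writer is toggled through the `hoodie.datasource.write.row.writer.enable` key (verify this against the Hudi version in use). The class and helper names are hypothetical, and plain `java.util.Properties` stands in for the write config that the real tests build.

```java
import java.util.Properties;

public class ClusteringWriterConfigSketch {
    // Assumed config key for toggling the row writer; verify against the Hudi version in use.
    static final String ROW_WRITER_ENABLE_KEY = "hoodie.datasource.write.row.writer.enable";

    /** Hypothetical helper: copies the given properties and switches the row writer off. */
    static Properties withRddClustering(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        // An explicit "false" makes the test independent of whatever the default is.
        props.setProperty(ROW_WRITER_ENABLE_KEY, "false");
        return props;
    }

    /** Hypothetical helper: reads the flag, falling back to the supplied default when unset. */
    static boolean usesRowWriter(Properties props, boolean defaultValue) {
        String raw = props.getProperty(ROW_WRITER_ENABLE_KEY, String.valueOf(defaultValue));
        return Boolean.parseBoolean(raw);
    }

    public static void main(String[] args) {
        Properties props = withRddClustering(new Properties());
        System.out.println(ROW_WRITER_ENABLE_KEY + "=" + props.getProperty(ROW_WRITER_ENABLE_KEY));
    }
}
```

Setting the flag explicitly in each affected test, rather than relying on a default, keeps those tests on the RDD path even if the default changes later.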
