[GitHub] [hudi] voonhous commented on pull request #9625: [MINOR] Fix default config values if not specified in MultipleSparkJobExecutionStrategy

via GitHub Thu, 14 Sep 2023 19:09:55 -0700


voonhous commented on PR #9625:
URL: https://github.com/apache/hudi/pull/9625#issuecomment-1720387188


   @yihua I looked through the CI failures. They appear to be errors raised when invoking the RowWriter implementation during clustering.
   
   ```log
   [ERROR] Errors: 
   [ERROR] TestHoodieBackedMetadata.testClusterOperationOnMainTable()(TestHoodieBackedMetadata)
   [ERROR]   Run 1: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
   [ERROR]   Run 2: java.util.concurrent.CancellationException
   [ERROR]   Run 3: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
   [ERROR]   Run 4: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
   [INFO] 
   [ERROR] TestHoodieBackedMetadata.testMDTCompactionWithFailedCommits()(TestHoodieBackedMetadata)
   [ERROR]   Run 1: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
   [ERROR]   Run 2: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
   [ERROR]   Run 3: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
   [ERROR]   Run 4: java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_2Adapter
   ```
   
   Prior to this change, the existing tests used the RDD implementation. Because of the mismatch in configs, the RowWriter implementation was never actually exercised by the tests that invoke clustering.
   
   Since this is a "[MINOR]" PR fix, I will add configs to the affected tests to ensure that they use the RDD implementation.
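
A minimal, Hudi-free sketch of how such a mismatch can arise and how a test can pin the RDD path. The config key `hoodie.datasource.write.row.writer.enable` and its declared default of `true` are assumptions that should be checked against the Hudi version in use; the class and helper names are hypothetical and are not part of Hudi.

```java
import java.util.Properties;

public class ClusteringWriterConfigSketch {
    // Assumed key for toggling the RowWriter path; verify against the Hudi version in use.
    static final String ROW_WRITER_KEY = "hoodie.datasource.write.row.writer.enable";
    // Assumed declared default for the key above.
    static final boolean DECLARED_DEFAULT = true;

    // Lookup that ignores the declared default: an unset key parses as false,
    // so the RDD path is silently chosen.
    static boolean rowWriterEnabledIgnoringDefault(Properties props) {
        return Boolean.parseBoolean(props.getProperty(ROW_WRITER_KEY));
    }

    // Lookup that falls back to the declared default when the key is unset.
    static boolean rowWriterEnabledWithDefault(Properties props) {
        return Boolean.parseBoolean(
            props.getProperty(ROW_WRITER_KEY, String.valueOf(DECLARED_DEFAULT)));
    }

    // What an affected test could do: set the key explicitly so the RDD path
    // is selected regardless of how the default is resolved.
    static Properties pinToRddWriter(Properties props) {
        Properties copy = new Properties();
        copy.putAll(props);
        copy.setProperty(ROW_WRITER_KEY, "false");
        return copy;
    }

    public static void main(String[] args) {
        Properties unset = new Properties();
        System.out.println("ignoring default: " + rowWriterEnabledIgnoringDefault(unset));
        System.out.println("with default:     " + rowWriterEnabledWithDefault(unset));
        System.out.println("pinned to RDD:    " + rowWriterEnabledWithDefault(pinToRddWriter(unset)));
    }
}
```

With the key set explicitly, both lookups agree, so a test pinned this way behaves the same before and after the default is fixed.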
   
   We can create another PR to increase the coverage of the clustering writers 
after this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

