pengxianzi opened a new issue, #12589:
URL: https://github.com/apache/hudi/issues/12589

   Problem Description
   We encountered the following issues while using Apache Hudi for data 
migration and real-time writing:
   
   Scenario 1:
   
   Migrating data from Kudu to a Hudi MOR bucketed table, then writing data 
from MySQL via Kafka to the Hudi MOR bucketed table works fine.
   
   Scenario 2:
   
   Migrating data from Kudu to a Hudi COW bucketed table, then writing data 
from MySQL via Kafka to the Hudi COW bucketed table fails to generate commits, 
and the checkpoint fails.
   
   Error Log
   Here is the error log when the task fails:
   
   org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable [] - bucket_write: 
default_database.loan_withdraw_order_new_cow - asynchronous part of 
checkpoint 1 could not be completed.
   
   java.util.concurrent.CancellationException: null
   
   org.apache.flink.runtime.checkpoint.CheckpointException: Could not complete 
snapshot 2 for operator bucket_write: 
default_database.loan_withdraw_order_new_cow (3/4). Failure reason: Checkpoint 
was declined.
   
   Caused by: org.apache.hudi.exception.HoodieException: Timeout(1201000ms) 
while waiting for instant initialize
   
   switched from RUNNING to FAILED with failure cause: java.io.IOException: 
Could not perform checkpoint 2 for operator bucket_write: 
default_database.loan_withdraw_order_new_cow (3/4)#0.
   
   
   Configuration Parameters
   Here is our Hudi table configuration:
   
   options.put("hoodie.upsert.shuffle.parallelism", "20");
   options.put("hoodie.insert.shuffle.parallelism", "20");
   options.put("write.operation", "upsert");
   options.put(FlinkOptions.TABLE_TYPE.key(), name);
   options.put(FlinkOptions.PRECOMBINE_FIELD.key(), precombing);
   options.put(FlinkOptions.PRE_COMBINE.key(), "true");
   options.put("hoodie.clean.automatic", "true");
   options.put("hoodie.cleaner.policy", "KEEP_LATEST_COMMITS");
   options.put("hoodie.cleaner.commits.retained", "5");
   options.put("hoodie.clean.async", "true");
   options.put("hoodie.archive.min.commits", "20");
   options.put("hoodie.archive.max.commits", "30");
   options.put("hoodie.clean.parallelism", "20");
   options.put("hoodie.archive.parallelism", "20");
   options.put("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
   options.put("write.tasks", "20");
   options.put("index.type", "BUCKET");
   options.put("hoodie.bucket.index.num.buckets", "80");
   options.put("hoodie.index.bucket.engine", "SIMPLE");
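
   For context, here is a minimal sketch of how an options map like the one above
is typically handed to a Flink streaming job through Hudi's HoodiePipeline builder.
The table name, columns, and primary key below are hypothetical placeholders, not
taken from our actual job:

   ```java
   import java.util.Map;

   import org.apache.flink.streaming.api.datastream.DataStream;
   import org.apache.flink.table.data.RowData;
   import org.apache.hudi.util.HoodiePipeline;

   public class HudiSinkSketch {
       // Wires a CDC stream (e.g. MySQL rows arriving via Kafka) into a Hudi sink.
       // Table name, columns, and primary key are illustrative placeholders.
       static void buildHudiSink(DataStream<RowData> cdcStream, Map<String, String> options) {
           HoodiePipeline.Builder builder = HoodiePipeline.builder("loan_withdraw_order_new_cow")
               .column("id BIGINT")
               .column("amount DECIMAL(10, 2)")
               .column("update_time TIMESTAMP(3)")
               .pk("id")
               .options(options);
           // 'false' means an unbounded (streaming) write rather than a bounded batch
           builder.sink(cdcStream, false);
       }
   }
   ```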
   
   
   
   Checkpoint Configuration
   We tested various checkpoint timeout and interval configurations, but the 
issue persists:
   
   env.getCheckpointConfig().setCheckpointTimeout(5 * 60 * 1000L);
   env.getCheckpointConfig().setMinPauseBetweenCheckpoints(60 * 1000L);
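
   For completeness, a typical checkpoint setup around the two calls above looks like
the following. The interval and failure-tolerance values are illustrative assumptions,
not recommendations from the Hudi documentation:

   ```java
   import org.apache.flink.streaming.api.CheckpointingMode;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

   StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
   env.enableCheckpointing(60 * 1000L);                       // trigger a checkpoint every 60s
   env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
   env.getCheckpointConfig().setCheckpointTimeout(5 * 60 * 1000L);
   env.getCheckpointConfig().setMinPauseBetweenCheckpoints(60 * 1000L);
   env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);  // one in-flight checkpoint at a time
   env.getCheckpointConfig().setTolerableCheckpointFailureNumber(3);
   ```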
   
   
   Steps to Reproduce
   Migrate data from Kudu to a Hudi COW bucketed table.
   
   Write data from MySQL via Kafka to the Hudi COW bucketed table.
   
   The task fails with the error "Timeout while waiting for instant initialize".
   
   Expected Behavior
   The task should generate commits normally, and the checkpoint should succeed.
   
   Actual Behavior
   The task fails, no commits are generated, and the checkpoint fails with an 
error.
   
   
   
   Hudi version : 0.14.0
   
   Spark version : 2.4.7
   
   Hive version : 3.1.3
   
   Hadoop version : 3.1.1
   Further Questions
   Checkpoint Timeout Issue:
   The error log mentions Timeout while waiting for instant initialize. Is this 
related to the initialization mechanism of the Hudi COW table? Are there ways 
to optimize the initialization time?
   
   COW Table Write Performance:
   Is the write performance of COW tables slower than that of MOR tables? Are there 
optimization suggestions for COW tables?
   
   Impact of Bucketed Table:
   Does the bucketed table have a specific impact on the write performance of 
COW tables? Are there optimization configurations for bucketed tables?
   
   Checkpoint Configuration:
   We tried various checkpoint timeout and interval configurations, but the 
issue persists. Are there recommended checkpoint configurations?
   
   Summary
   We would like to know:
   
   Why does the Hudi COW bucketed table encounter checkpoint timeout issues 
during writes?
   
   Are there optimization suggestions for COW table write performance?
   
   Does the bucketed table have a specific impact on COW table write 
performance?
   
   Are there recommended checkpoint configurations?
   
   Thank you for your help!
   
   
![image](https://github.com/user-attachments/assets/d499c06d-3245-41b7-b2e2-512433eb9208)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
