pengxianzi opened a new issue, #12589:
URL: https://github.com/apache/hudi/issues/12589
Problem Description
We encountered the following issues while using Apache Hudi for data
migration and real-time writing:
Scenario 1:
Migrating data from Kudu to a Hudi MOR bucketed table, then writing data
from MySQL via Kafka to the Hudi MOR bucketed table works fine.
Scenario 2:
Migrating data from Kudu to a Hudi COW bucketed table, then writing data
from MySQL via Kafka to the Hudi COW bucketed table fails to generate commits,
and the checkpoint fails.
Error Log
Here is the error log when the task fails:
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable [].
bucket_write default_database.loan_withdraw_order_new_cow ardemmewlcos part of
checkpoint 1 could not be completed.
java.util.concurrent.CancellationException: null
org.apache.flink.runtime.checkpoint.CheckpointException: Could not complete
snapshot 2 for operator bucket_write:
default_database.loan_withdraw_order_new_cow (3/4). Failure reason: Checkpoint
was declined.
Caused by: org.apache.hudi.exception.HoodieException: Timeout(1201000ms)
while waiting for instant initialize
switched from RUNNING to FAILED with failure cause: java.io.IOException:
Could not perform checkpoint 2 for operator bucket_write:
default_database.loan_withdraw_order_new_cow (3/4)#0.
Configuration Parameters
Here is our Hudi table configuration:
options.put("hoodie.upsert.shuffle.parallelism", "20");
options.put("hoodie.insert.shuffle.parallelism", "20");
options.put("write.operation", "upsert");
options.put(FlinkOptions.TABLE_TYPE.key(), name);
options.put(FlinkOptions.PRECOMBINE_FIELD.key(), precombing);
options.put(FlinkOptions.PRE_COMBINE.key(), "true");
options.put("hoodie.clean.automatic", "true");
options.put("hoodie.cleaner.policy", "KEEP_LATEST_COMMITS");
options.put("hoodie.cleaner.commits.retained", "5");
options.put("hoodie.clean.async", "true");
options.put("hoodie.archive.min.commits", "20");
options.put("hoodie.archive.max.commits", "30");
options.put("hoodie.clean.parallelism", "20");
options.put("hoodie.archive.parallelism", "20");
options.put("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
options.put("write.tasks", "20");
options.put("index.type", "BUCKET");
options.put("hoodie.bucket.index.num.buckets", "80");
options.put("hoodie.index.bucket.engine", "SIMPLE");
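To make the comparison between the two scenarios explicit, here is a minimal sketch of how the two option maps can be built and diffed. It uses plain string keys instead of the `FlinkOptions` constants so that it runs without Hudi on the classpath; the keys `table.type`, `precombine.field` and `write.precombine` are our assumption of what those constants resolve to, and the class name `HudiBucketOptions` and the precombine column `update_time` are hypothetical. Only a subset of the options above is included.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

public class HudiBucketOptions {

    /** Builds a subset of the writer options; tableType is "MERGE_ON_READ" or "COPY_ON_WRITE". */
    public static Map<String, String> build(String tableType, String precombineField) {
        Map<String, String> options = new LinkedHashMap<>();
        // Assumed string keys for FlinkOptions.TABLE_TYPE, PRECOMBINE_FIELD and PRE_COMBINE.
        options.put("table.type", tableType);
        options.put("precombine.field", precombineField);
        options.put("write.precombine", "true");
        options.put("write.operation", "upsert");
        options.put("write.tasks", "20");
        options.put("index.type", "BUCKET");
        options.put("hoodie.bucket.index.num.buckets", "80");
        options.put("hoodie.index.bucket.engine", "SIMPLE");
        return options;
    }

    /** Returns the keys whose values differ between the two option maps. */
    public static Set<String> diffKeys(Map<String, String> a, Map<String, String> b) {
        Set<String> keys = new TreeSet<>(a.keySet());
        keys.addAll(b.keySet());
        Set<String> diff = new TreeSet<>();
        for (String key : keys) {
            if (!Objects.equals(a.get(key), b.get(key))) {
                diff.add(key);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // "update_time" is a hypothetical precombine column used only for illustration.
        Map<String, String> mor = build("MERGE_ON_READ", "update_time");
        Map<String, String> cow = build("COPY_ON_WRITE", "update_time");
        System.out.println("Differing keys: " + diffKeys(mor, cow));
    }
}
```

In our setup the working (MOR) and failing (COW) jobs are intended to differ only in the table type, so a diff like this is a quick way to confirm that no other option changed between the two runs.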
Checkpoint Configuration
We tested various checkpoint timeout and interval configurations, but the
issue persists:
// Checkpoint timeout: 5 minutes
env.getCheckpointConfig().setCheckpointTimeout(5 * 60 * 1000L);
// Minimum pause between checkpoints: 1 minute
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(60 * 1000L);
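Note that the timeout printed in the error log (1201000 ms) does not equal the 5-minute checkpoint timeout shown here, so the log most likely comes from one of the other settings we tried. We are not certain how Hudi derives that wait timeout; the sketch below only does the unit conversion, and the class name `TimeoutCheck` is illustrative.

```java
import java.time.Duration;

public class TimeoutCheck {

    /** Formats a millisecond value as whole minutes plus remaining seconds. */
    public static String describe(long millis) {
        Duration d = Duration.ofMillis(millis);
        long minutes = d.toMinutes();
        long seconds = d.getSeconds() % 60;
        return minutes + " min " + seconds + " s";
    }

    public static void main(String[] args) {
        long configuredCheckpointTimeout = 5 * 60 * 1000L; // value from the snippet above
        long reportedInstantTimeout = 1201000L;            // value from the error log

        System.out.println("configured: " + describe(configuredCheckpointTimeout)); // 5 min 0 s
        System.out.println("reported:   " + describe(reportedInstantTimeout));      // 20 min 1 s
    }
}
```

The reported value works out to 20 minutes and 1 second, which is four times the configured checkpoint timeout plus one second.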
Steps to Reproduce
1. Migrate data from Kudu to a Hudi COW bucketed table.
2. Write data from MySQL via Kafka to the Hudi COW bucketed table.
3. The task fails with the error "Timeout while waiting for instant initialize".
Expected Behavior
The task should generate commits normally, and the checkpoint should succeed.
Actual Behavior
The task fails, no commits are generated, and the checkpoint fails with an
error.
Hudi version : 0.14.0
Spark version : 2.4.7
Hive version : 3.1.3
Hadoop version : 3.1.1
Further Questions
Checkpoint Timeout Issue:
The error log mentions "Timeout while waiting for instant initialize". Is this
related to the initialization mechanism of the Hudi COW table? Are there ways
to reduce the initialization time?
COW Table Write Performance:
Is the write performance of COW tables slower than MOR tables? Are there
optimization suggestions for COW tables?
Impact of Bucketed Table:
Does the bucketed table have a specific impact on the write performance of
COW tables? Are there optimization configurations for bucketed tables?
Checkpoint Configuration:
We tried various checkpoint timeout and interval configurations, but the
issue persists. Are there recommended checkpoint configurations?
Summary
We would like to know:
Why does the Hudi COW bucketed table encounter checkpoint timeout issues
during writes?
Are there optimization suggestions for COW table write performance?
Does the bucketed table have a specific impact on COW table write
performance?
Are there recommended checkpoint configurations?
Thank you for your help!
