xiaofan2022 opened a new issue, #11946:
URL: https://github.com/apache/hudi/issues/11946
I want to write to a MOR table with both Flink and Spark, using the BUCKET
index with the CONSISTENT_HASHING engine. Spark writes the full load very
quickly, but Flink writes the incremental data very slowly (about 100 records/s).
spark sql:
```
CREATE TABLE test.tableA ()
USING hudi
TBLPROPERTIES (
  'connector' = 'hudi',
  'index.type' = 'BUCKET',
  'hoodie.index.type' = 'BUCKET',
  'hoodie.index.bucket.engine' = 'CONSISTENT_HASHING',
  'hoodie.datasource.write.recordkey.field' = '',
  'path' = '',
  'preCombineField' = 'create_time',
  'precombine.field' = 'create_time',
  'primaryKey' = '',
  'table.type' = 'MERGE_ON_READ',
  'write.rate.limit' = '10000',  -- Flink option
  'write.tasks' = '2',           -- Flink option
  'write.utc-timezone' = 'false',
  'type' = 'mor');
```
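For context, a sketch of how the Flink-side writer options might be tuned separately from the Spark DDL. This is only an illustration, not a confirmed fix: the option keys `write.tasks`, `write.bucket_assign.tasks`, and `write.rate.limit` are documented Hudi Flink options, but the values below are placeholders, and the table path and column list are left elided as in the original.

```sql
-- Hypothetical Flink SQL DDL for the same table; values are illustrative.
CREATE TABLE tableA_flink ()
WITH (
  'connector' = 'hudi',
  'path' = '',
  'table.type' = 'MERGE_ON_READ',
  'hoodie.index.type' = 'BUCKET',
  'hoodie.index.bucket.engine' = 'CONSISTENT_HASHING',
  -- Increase writer parallelism beyond 2 if the job has spare slots.
  'write.tasks' = '4',
  -- Bucket-assign parallelism can also be set explicitly.
  'write.bucket_assign.tasks' = '4',
  -- Note: 'write.rate.limit' caps records per second per task;
  -- a low cap (or backpressure upstream of it) limits throughput.
  'write.rate.limit' = '10000'
);
```

Checking whether the job is actually bottlenecked on the rate limiter, checkpointing, or the bucket-assign operator (via the Flink UI backpressure view) would narrow this down.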
<img width="582" alt="flink_slow"
src="https://github.com/user-attachments/assets/1ff17ad2-1192-44d7-9d56-7c846d52603b">
How can I optimize the Flink write throughput?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]