[GitHub] [hudi] yihua commented on issue #7443: [SUPPORT] How to reduce the bucket number for each partition

GitBox Thu, 22 Dec 2022 12:50:49 -0800


yihua commented on issue #7443:
URL: https://github.com/apache/hudi/issues/7443#issuecomment-1363329105


   Hi @JoshuaZhuCN, thanks for raising this question.  Could you clarify 
which Hudi write operation you are concerned with, and how Spark bucketing 
comes into the picture?  In general, you can [tune the 
parallelism](https://hudi.apache.org/docs/faq#how-to-tune-shuffle-parallelism-of-hudi-jobs-) 
in Hudi (see the linked FAQ entry) to control the number of Spark tasks 
and partitions, and thus the number of files and the file size.
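
   For illustration, here is a minimal sketch of how the shuffle parallelism 
could be set per write operation. The helper `hudi_write_options`, the table 
name, the output path, and the value `20` are all hypothetical; only the 
`hoodie.*` keys are actual Hudi configs. A real write also needs settings such 
as the record key and partition path fields, which are omitted here.

```python
def hudi_write_options(table_name, operation="upsert", parallelism=200):
    """Build Hudi writer options with an explicit shuffle parallelism.

    Hypothetical helper for illustration; not part of Hudi itself.
    """
    # Each write operation has its own shuffle-parallelism config key.
    parallelism_keys = {
        "insert": "hoodie.insert.shuffle.parallelism",
        "upsert": "hoodie.upsert.shuffle.parallelism",
        "bulk_insert": "hoodie.bulkinsert.shuffle.parallelism",
        "delete": "hoodie.delete.shuffle.parallelism",
    }
    if operation not in parallelism_keys:
        raise ValueError(f"unsupported write operation: {operation}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": operation,
        # Spark writer options are passed as strings.
        parallelism_keys[operation]: str(parallelism),
    }


# Assumed values: a lower parallelism generally means fewer, larger files.
options = hudi_write_options("example_table", operation="bulk_insert", parallelism=20)

# With an existing Spark DataFrame `df` (not created here), the options
# would be applied roughly like this:
# df.write.format("hudi").options(**options).mode("append").save("/tmp/example_table")
```

   Which key matters depends on the write operation in use, which is why 
knowing the operation is the first step.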


