KnightChess opened a new pull request, #11470: URL: https://github.com/apache/hudi/pull/11470
### Change Logs

As https://github.com/apache/hudi/issues/11274 and https://github.com/apache/hudi/pull/11463 describe, there are two problem cases:

- If the RDD is an input RDD without a shuffle, the partition number can be too large or too small.
- Users cannot easily control the parallelism.
  - In some cases, users can change it by setting `spark.default.parallelism`.
  - In some cases, users cannot change it because the value is hard-coded.

In Spark, the better approach is to let `spark.default.parallelism` or `spark.sql.shuffle.partitions` control the parallelism; the other parameters are advanced settings in Hudi.

### Impact

Operations such as dedup that use the new deduction logic let users control the parallelism through `spark.sql.shuffle.partitions` or `spark.default.parallelism`. For special scenarios, the advanced parameters can also be used.

### Risk level (write none, low medium or high below)

low

### Documentation Update

None

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
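
For reviewers, the deduction order described under Change Logs and Impact can be sketched roughly as below. This is only an illustration, not the code in this PR: the function name `deduce_parallelism`, the `advanced_parallelism` argument, and the exact precedence (advanced Hudi parameter first, then `spark.sql.shuffle.partitions`, then `spark.default.parallelism`, then the input RDD's partition count) are assumptions made for the example.

```python
from typing import Mapping, Optional


def deduce_parallelism(
    spark_conf: Mapping[str, str],
    input_partitions: int,
    advanced_parallelism: Optional[int] = None,
) -> int:
    """Hypothetical helper: pick a parallelism for an operation such as dedup.

    Assumed precedence (not confirmed by the PR):
      1. an explicitly set advanced Hudi parameter,
      2. spark.sql.shuffle.partitions,
      3. spark.default.parallelism,
      4. the partition count of the input RDD.
    """
    # 1. An advanced parameter, when set to a positive value, wins.
    if advanced_parallelism is not None and advanced_parallelism > 0:
        return advanced_parallelism

    # 2 and 3. Otherwise fall back to the standard Spark settings, if present.
    for key in ("spark.sql.shuffle.partitions", "spark.default.parallelism"):
        value = spark_conf.get(key)
        if value is not None and int(value) > 0:
            return int(value)

    # 4. Nothing configured: keep the input RDD's partition count.
    return input_partitions
```

With a sketch like this, a user who sets only `spark.default.parallelism` gets that value, and the input partition count is used only when nothing is configured.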
