sivabalan narayanan created HUDI-5012:
-----------------------------------------
Summary: Fix clean planning for very large partitions
Key: HUDI-5012
URL: https://issues.apache.org/jira/browse/HUDI-5012
Project: Apache Hudi
Issue Type: Improvement
Components: cleaning
Reporter: sivabalan narayanan
Within the clean planning phase, we do a map() over the partitions and then
trigger planning for each partition inside it.
For a very large number of partitions, if the cleaner shuffle parallelism is
small, each task ends up planning many partitions one after another, so most of
the planning runs sequentially. We can improve this by switching to a
mapPartitions call and optimizing the per-task work.
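
To illustrate the general difference between the two call styles, here is a
minimal plain-Java sketch. It does not use Spark or any Hudi classes, and it is
not the actual change proposed here. CleanPlanSketch, PlannerContext,
planPerElement and planPerSlice are hypothetical names, and the assumption that
planning needs some expensive per-call setup is mine, chosen only to show one
thing a mapPartitions-style function can optimize that a per-element map() cannot.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: compares per-element and per-slice planning.
class CleanPlanSketch {
    // Counts how often the (assumed) expensive setup runs.
    static final AtomicInteger setupCount = new AtomicInteger();

    // Hypothetical stand-in for state that is costly to build,
    // for example a view over the table's files.
    static final class PlannerContext {
        PlannerContext() {
            setupCount.incrementAndGet();
        }

        // Placeholder logic: a real planner would list the files to delete.
        List<String> planPartition(String partitionPath) {
            return List.of(partitionPath + "/stale-file");
        }
    }

    // Distributes partition paths round-robin over `parallelism` slices,
    // mimicking how a small parallelism puts many partitions in one task.
    static List<List<String>> slice(List<String> partitions, int parallelism) {
        List<List<String>> slices = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            slices.add(new ArrayList<>());
        }
        for (int i = 0; i < partitions.size(); i++) {
            slices.get(i % parallelism).add(partitions.get(i));
        }
        return slices;
    }

    // map()-style: the function sees one partition at a time,
    // so the setup is repeated for every partition.
    static Map<String, List<String>> planPerElement(List<String> partitions, int parallelism) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (List<String> taskSlice : slice(partitions, parallelism)) {
            for (String partition : taskSlice) {
                plan.put(partition, new PlannerContext().planPartition(partition));
            }
        }
        return plan;
    }

    // mapPartitions()-style: the function sees the whole slice,
    // so the setup runs once per non-empty slice and is shared.
    static Map<String, List<String>> planPerSlice(List<String> partitions, int parallelism) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (List<String> taskSlice : slice(partitions, parallelism)) {
            if (taskSlice.isEmpty()) {
                continue;
            }
            PlannerContext shared = new PlannerContext();
            for (String partition : taskSlice) {
                plan.put(partition, shared.planPartition(partition));
            }
        }
        return plan;
    }
}
```

Both methods return the same plan. In this sketch the only difference is how
often the setup runs: once per partition in the first case, once per slice in
the second.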