[ https://issues.apache.org/jira/browse/HUDI-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Kudinkin reassigned HUDI-5261:
-------------------------------------
Assignee: Jonathan Vexler
> Use proper parallelism for engine context APIs
> ----------------------------------------------
>
> Key: HUDI-5261
> URL: https://issues.apache.org/jira/browse/HUDI-5261
> Project: Apache Hudi
> Issue Type: Improvement
> Components: performance
> Reporter: Raymond Xu
> Assignee: Jonathan Vexler
> Priority: Critical
> Fix For: 0.12.2
>
>
> Do a global search for usages of these APIs:
> - org.apache.hudi.common.engine.HoodieEngineContext#flatMap
> - org.apache.hudi.common.engine.HoodieEngineContext#map
> and similar ones that take a parallelism argument.
> Many call sites pass the number of items as the parallelism, which hurts
> performance. Parallelism should instead be based on the number of cores
> available in the cluster and be set by the user via parallelism configs.
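
A minimal sketch of the idea in the description, not taken from the Hudi codebase: rather than passing the item count straight through as the parallelism, a call site could bound it by a user-configured value. The class ParallelismSketch, the method boundedParallelism, and the value 200 are all hypothetical, and capping at the item count is an assumption of this sketch rather than something the issue states.

```java
// Hypothetical helper, not part of Hudi: bounds parallelism by a configured value.
final class ParallelismSketch {

  // Returns the parallelism to request: at most the user-configured value,
  // at most the number of items, and never less than 1.
  static int boundedParallelism(int numItems, int configuredParallelism) {
    if (configuredParallelism < 1) {
      throw new IllegalArgumentException("configuredParallelism must be >= 1");
    }
    return Math.max(1, Math.min(numItems, configuredParallelism));
  }

  public static void main(String[] args) {
    // Assumed to come from a user-facing parallelism config (hypothetical value).
    int configured = 200;

    System.out.println(boundedParallelism(100_000, configured)); // 200
    System.out.println(boundedParallelism(8, configured));       // 8
    System.out.println(boundedParallelism(0, configured));       // 1
  }
}
```

With a helper along these lines, a call that currently passes the list size as the parallelism argument would pass the bounded value instead, so a very large input list no longer requests one task per item.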
--
This message was sent by Atlassian Jira
(v8.20.10#820010)