Ma Jian created HUDI-7069:
-----------------------------
Summary: Optimize metaclient construction and include table config
in write config for multi-table services.
Key: HUDI-7069
URL: https://issues.apache.org/jira/browse/HUDI-7069
Project: Apache Hudi
Issue Type: Bug
Reporter: Ma Jian
When running multi-table services in the current implementation, the clustering
task and the compaction task each build their own metaclient for every table,
which adds unnecessary overhead. To reduce it, we extract the metaclient
construction so that it happens only once per table, and pass the resulting
metaclient as a parameter to each task.
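A minimal sketch of the idea, not the actual Hudi code: `TableMetaClient`,
`runCompaction`, `runClustering` and `runServices` are hypothetical stand-ins
used only to show the metaclient being built once per table and shared by both
tasks.

```java
import java.util.ArrayList;
import java.util.List;

public class MetaClientReuseSketch {

    // Hypothetical stand-in for a table metaclient. It counts how many
    // instances are constructed so the saving is visible.
    static final class TableMetaClient {
        static int buildCount = 0;
        final String basePath;

        TableMetaClient(String basePath) {
            this.basePath = basePath;
            buildCount++;
        }
    }

    // Hypothetical tasks: both accept an already-built metaclient instead of
    // constructing their own from the base path.
    static String runCompaction(TableMetaClient metaClient) {
        return "compaction:" + metaClient.basePath;
    }

    static String runClustering(TableMetaClient metaClient) {
        return "clustering:" + metaClient.basePath;
    }

    // Builds one metaclient per table and passes it to every task for that table.
    static List<String> runServices(List<String> basePaths) {
        List<String> log = new ArrayList<>();
        for (String basePath : basePaths) {
            TableMetaClient metaClient = new TableMetaClient(basePath);
            log.add(runCompaction(metaClient));
            log.add(runClustering(metaClient));
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> log = runServices(List.of("/tmp/t1", "/tmp/t2"));
        System.out.println(log);
        System.out.println("metaclients built: " + TableMetaClient.buildCount);
    }
}
```

With two tables and two tasks per table, this sketch constructs two metaclients
rather than four.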
In addition, when running multi-table services, the write config lacks some
information from the table config, such as the table name. As a result,
retrieving the table name returns an empty string in certain situations. For
example, when no metrics prefix is configured, the table name is used as the
prefix. Without the table config, that prefix is empty, so the metrics of
different tables cannot be told apart. By adding the table config to the write
config beforehand, all of the configuration information is available in the
subsequent steps that use the write config.
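A rough illustration of the second change, using plain `java.util.Properties`
rather than Hudi's config classes. The helper names `withTableConfig` and
`metricsPrefix`, the key strings, and the choice to let explicit write settings
override table settings are all assumptions made for this sketch.

```java
import java.util.Properties;

public class TableConfigMergeSketch {

    // Key names are assumptions for illustration only.
    static final String TABLE_NAME_KEY = "hoodie.table.name";
    static final String METRICS_PREFIX_KEY = "hoodie.metrics.reporter.metricsname.prefix";

    // Copies the table properties first, then the write properties, so that
    // explicitly set write options take precedence (an assumed policy).
    static Properties withTableConfig(Properties writeProps, Properties tableProps) {
        Properties merged = new Properties();
        merged.putAll(tableProps);
        merged.putAll(writeProps);
        return merged;
    }

    // Uses the configured prefix if present, otherwise falls back to the table name.
    static String metricsPrefix(Properties props) {
        String explicit = props.getProperty(METRICS_PREFIX_KEY, "");
        return explicit.isEmpty() ? props.getProperty(TABLE_NAME_KEY, "") : explicit;
    }

    public static void main(String[] args) {
        Properties writeProps = new Properties(); // no table name, no prefix
        Properties tableProps = new Properties();
        tableProps.setProperty(TABLE_NAME_KEY, "orders");

        // Before the merge the fallback finds nothing and the prefix is empty.
        System.out.println("[" + metricsPrefix(writeProps) + "]");
        // After the merge the table name is available as the prefix.
        System.out.println("[" + metricsPrefix(withTableConfig(writeProps, tableProps)) + "]");
    }
}
```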
Additionally, as a small cleanup, we removed the redundant construction of the
metaclient in the clusteringJob's constructor.
!https://intranetproxy.alipay.com/skylark/lark/0/2023/png/62256341/1699595853317-03416d85-8a25-4fbd-96af-f351f0ac6ec7.png?x-oss-process=image%2Fresize%2Cw_1322%2Climit_0|width=470,height=355!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)