Ma Jian created HUDI-7069:
-----------------------------

             Summary: Optimize metaclient construction and include table config 
in write config for multi-table services.
                 Key: HUDI-7069
                 URL: https://issues.apache.org/jira/browse/HUDI-7069
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Ma Jian


In the current implementation of the multi-table services runner, the 
clustering task and the compaction task each build their own metaclient for 
every table, so the same metaclient is constructed repeatedly and adds 
overhead. To reduce this overhead, we extract the metaclient construction so 
that it is built only once per table and passed as a parameter to each task.
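
The change described above can be sketched roughly as follows. This is only an 
illustration of the pattern, not the actual patch: TableMetaClient, 
runCompaction, runClustering and runServices are hypothetical stand-ins and 
are not Hudi classes or methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for illustration only; none of these are Hudi classes.
class MetaClientReuseSketch {
    // Counts how many times the (assumed expensive) meta client is constructed.
    static final AtomicInteger BUILD_COUNT = new AtomicInteger();

    /** Stand-in for a per-table meta client that is costly to build. */
    static final class TableMetaClient {
        final String basePath;

        TableMetaClient(String basePath) {
            this.basePath = basePath;
            BUILD_COUNT.incrementAndGet();
        }
    }

    // Each task receives the meta client instead of building its own.
    static String runCompaction(TableMetaClient metaClient) {
        return "compaction:" + metaClient.basePath;
    }

    static String runClustering(TableMetaClient metaClient) {
        return "clustering:" + metaClient.basePath;
    }

    static List<String> runServices(List<String> basePaths) {
        List<String> results = new ArrayList<>();
        for (String basePath : basePaths) {
            // Built once per table, then shared by both tasks.
            TableMetaClient metaClient = new TableMetaClient(basePath);
            results.add(runCompaction(metaClient));
            results.add(runClustering(metaClient));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> results = runServices(List.of("/tmp/table_a", "/tmp/table_b"));
        System.out.println(results);
        System.out.println("meta clients built: " + BUILD_COUNT.get());
    }
}
```

With two tables and two tasks per table, the sketch constructs two meta 
clients rather than four.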

At the same time, when running multi-table services, the write config lacks 
some information from the table config, such as the table name. As a result, 
retrieving the table name returns an empty string in certain situations. For 
example, if no metrics prefix is configured, the table name is used as the 
prefix. Without the table config, that prefix is empty, so the metrics of 
different tables cannot be told apart. By adding the table config to the 
write config beforehand, all of the configuration information is available 
in the subsequent write config step.
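
A minimal sketch of the idea, using plain java.util.Properties rather than 
the real Hudi config classes. The helper names, the "example.metrics.prefix" 
key, and the choice to let explicit write settings override table settings 
are all assumptions made for illustration; "hoodie.table.name" is assumed to 
be the key that holds the table name.

```java
import java.util.Properties;

// Illustration only: plain Properties stand in for the write and table configs.
class WriteConfigMergeSketch {
    // Assumed key for the table name.
    static final String TABLE_NAME_KEY = "hoodie.table.name";
    // Hypothetical key, used here only to model an optional metrics prefix.
    static final String METRICS_PREFIX_KEY = "example.metrics.prefix";

    /** Returns a copy of the write properties that also carries the table properties. */
    static Properties withTableConfig(Properties writeProps, Properties tableProps) {
        Properties merged = new Properties();
        merged.putAll(tableProps);
        // Assumption: explicitly set write properties take precedence.
        merged.putAll(writeProps);
        return merged;
    }

    /** Falls back to the table name when no prefix is configured. */
    static String metricsPrefix(Properties props) {
        String explicit = props.getProperty(METRICS_PREFIX_KEY);
        if (explicit != null && !explicit.isEmpty()) {
            return explicit;
        }
        return props.getProperty(TABLE_NAME_KEY, "");
    }

    public static void main(String[] args) {
        Properties writeProps = new Properties(); // no table name, no prefix
        Properties tableProps = new Properties();
        tableProps.setProperty(TABLE_NAME_KEY, "orders");

        System.out.println("before: '" + metricsPrefix(writeProps) + "'");
        System.out.println("after:  '" + metricsPrefix(withTableConfig(writeProps, tableProps)) + "'");
    }
}
```

Before the merge the prefix is empty; after it, the prefix falls back to the 
table name, so metrics from different tables can be distinguished.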

Additionally, we made a small modification by removing the redundant 
construction of metaclient in the clusteringJob's constructor.

!https://intranetproxy.alipay.com/skylark/lark/0/2023/png/62256341/1699595853317-03416d85-8a25-4fbd-96af-f351f0ac6ec7.png?x-oss-process=image%2Fresize%2Cw_1322%2Climit_0|width=470,height=355!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
