[
https://issues.apache.org/jira/browse/HUDI-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523613#comment-17523613
]
Chuang Lee commented on HUDI-3898:
----------------------------------
I'm not sure I understood your comment correctly.
If the user configures synchronization of only one of the two tables (_ro or
_rt) and also skips that table's suffix, it means the user needs only that one
table, registered without a suffix. For example, the following configuration
means that the user only wants to synchronize the _ro table, without the
suffix: 'hive_sync.table_type_select'='ro', 'hive_sync.skip_ro_suffix'='true'.
Once the user sets these optional items, the user knows whether the table
without the suffix is the _ro or the _rt table.
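To make the intended behavior concrete, here is a minimal sketch of how the synced Hive table names could be derived from the two options above. The class and method names are hypothetical and this is not Hudi's actual code; it also assumes that leaving 'hive_sync.table_type_select' unset means both tables are synchronized, as happens today.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hudi.
final class MorSyncTableNames {
    private MorSyncTableNames() {}

    /**
     * @param baseName        Hive table name without any suffix
     * @param tableTypeSelect "ro", "rt", or null/empty to sync both (assumed semantics)
     * @param skipRoSuffix    value of 'hive_sync.skip_ro_suffix'
     * @return the Hive table names that would be registered
     */
    static List<String> resolve(String baseName, String tableTypeSelect, boolean skipRoSuffix) {
        boolean syncBoth = tableTypeSelect == null || tableTypeSelect.isEmpty();
        boolean syncRo = syncBoth || "ro".equals(tableTypeSelect);
        boolean syncRt = syncBoth || "rt".equals(tableTypeSelect);
        if (!syncRo && !syncRt) {
            throw new IllegalArgumentException("Unknown table type: " + tableTypeSelect);
        }

        List<String> names = new ArrayList<>();
        if (syncRo) {
            // The _ro table drops its suffix only when the skip option is set.
            names.add(skipRoSuffix ? baseName : baseName + "_ro");
        }
        if (syncRt) {
            // This sketch models no suffix-skip option for the _rt table.
            names.add(baseName + "_rt");
        }
        return names;
    }
}
```

With 'ro' selected and the suffix skipped, `MorSyncTableNames.resolve("orders", "ro", true)` yields the single name `orders`, so the user knows that the unsuffixed table is the _ro table.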
Of course, there is one edge case that needs to be handled: if the user
synchronizes both the _ro and _rt tables but skips the suffix for both, the two
tables would end up with the same name. This should be treated as an invalid
configuration, and I think it can be handled in one of the following ways:
1. Throw an exception at startup indicating that the suffixes of the _ro and
_rt tables cannot both be skipped.
2. When Hive metadata is synchronized, log an error indicating that the
suffixes of the _ro and _rt tables cannot both be skipped, but do not interrupt
the task; neither suffix-skip option takes effect.
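The two options can be sketched side by side as follows. This is only an illustration under assumptions: the class, enum, and field names are hypothetical, and the _rt suffix-skip flag is the option proposed in this issue rather than an existing Hudi setting.

```java
import java.util.logging.Logger;

// Hypothetical validator for illustration only; not part of Hudi.
final class SuffixSkipValidator {
    private static final Logger LOG = Logger.getLogger(SuffixSkipValidator.class.getName());

    /** How to react when both suffixes are skipped while both tables are synced. */
    enum OnConflict { FAIL_FAST, LOG_AND_IGNORE }

    /** The suffix-skip flags that are actually applied after validation. */
    static final class Resolved {
        final boolean skipRoSuffix;
        final boolean skipRtSuffix;

        Resolved(boolean skipRoSuffix, boolean skipRtSuffix) {
            this.skipRoSuffix = skipRoSuffix;
            this.skipRtSuffix = skipRtSuffix;
        }
    }

    private SuffixSkipValidator() {}

    static Resolved validate(boolean syncRo, boolean syncRt,
                             boolean skipRoSuffix, boolean skipRtSuffix,
                             OnConflict mode) {
        // Conflict: both tables are synced and both would lose their suffix,
        // so they would be registered under the same name.
        boolean conflict = syncRo && syncRt && skipRoSuffix && skipRtSuffix;
        if (!conflict) {
            return new Resolved(skipRoSuffix, skipRtSuffix);
        }
        String msg = "The suffixes of the _ro and _rt tables cannot be skipped at the same time";
        if (mode == OnConflict.FAIL_FAST) {
            // Option 1: reject the configuration at startup.
            throw new IllegalArgumentException(msg);
        }
        // Option 2: log the error, keep the task running, ignore both skip options.
        LOG.severe(msg + "; ignoring both suffix-skip options");
        return new Resolved(false, false);
    }
}
```

Option 1 surfaces the mistake immediately, while option 2 keeps existing jobs running at the cost of silently falling back to suffixed table names.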
Do you have any suggestions for handling this edge case?
Thank you for your reply. [~danny0405]
> Mor table hive synchronization supports _ro or _rt table type selection and
> _rt table suffix skip configuration items
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: HUDI-3898
> URL: https://issues.apache.org/jira/browse/HUDI-3898
> Project: Apache Hudi
> Issue Type: Improvement
> Components: hive, meta-sync
> Reporter: Chuang Lee
> Priority: Major
> Fix For: 0.11.0
>
>
> Related links: https://github.com/apache/hudi/issues/5327
> # The current MOR table Hive synchronization only supports skipping the
> suffix of the _ro table. Can it also support a configuration item for
> skipping the suffix of the _rt table? In some business scenarios, especially
> with non-partitioned tables, table names must stay consistent with those of
> existing businesses: some businesses only need the _ro table to skip the
> suffix, and others only need the _rt table to skip the suffix. Would you
> consider adding a configuration item for skipping the _rt table suffix?
> # The current MOR table Hive synchronization always synchronizes both the _ro
> table and the _rt table, and this cannot be changed. For some non-partitioned
> business scenarios, synchronizing only the _ro table or only the _rt table is
> enough. Synchronizing both brings extra cost, so consider adding a
> configuration item that lets the user choose which table to synchronize.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)