[
https://issues.apache.org/jira/browse/HIVE-21764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
mahesh kumar behera updated HIVE-21764:
---------------------------------------
Description:
REPL DUMP fetches events from the NOTIFICATION_LOG table based on a regular
expression plus an inclusion/exclusion list. So, for a rename table event, the
event is ignored if the old table name doesn't match the pattern, even though
the new table should be bootstrapped. REPL DUMP should have a mechanism to
detect such tables and automatically bootstrap them along with incremental
replication. Also, if the renamed table is excluded from the replication
policy, then the old table needs to be dropped at the target as well.
There are 4 scenarios that need to be handled.
# Both the new name and the old name satisfy the table name pattern filter.
* Nothing needs to be done. The incremental event for the rename should take
care of the replication.
# Neither name satisfies the table name pattern filter.
* Neither name is in the scope of the policy, and thus nothing needs to be
done.
# The new name satisfies the pattern but the old name does not.
* The table will not be present at the target.
* The rename event handler for dump should detect this case and add the new
table name to the list of tables to be bootstrapped.
* All the events related to the table (new name) should be ignored.
* If there is a drop event for the table (with the new name), then remove the
table from the list of tables to be bootstrapped.
* In case of another rename (double rename):
* If the new name satisfies the table pattern, then add the new name to the
list of tables to be bootstrapped and remove the old name from that list.
* If the new name does not satisfy the pattern, then just remove the table
name from the list of tables to be bootstrapped.
# The new name does not satisfy the pattern but the old name does.
* Change the rename event to a drop event.
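The four scenarios and the bookkeeping for the bootstrap list could be sketched roughly as below. This is only an illustration under stated assumptions: the class RenameBootstrapTracker, the Action enum, and all method names are hypothetical and are not taken from the Hive code base, and the replication policy is reduced to a single table name regex with no include/exclude lists.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch: class, enum, and method names are illustrative only
// and do not come from the Hive code base.
class RenameBootstrapTracker {
    enum Action { REPLICATE_RENAME, IGNORE, BOOTSTRAP_NEW_NAME, CONVERT_TO_DROP }

    // Assumption: the policy is a single regex over table names.
    private final Pattern tablePattern;
    // Tables to be bootstrapped along with the incremental dump.
    private final Set<String> pending = new LinkedHashSet<>();

    RenameBootstrapTracker(String tableNameRegex) {
        this.tablePattern = Pattern.compile(tableNameRegex);
    }

    private boolean inPolicy(String tableName) {
        return tablePattern.matcher(tableName).matches();
    }

    Action onRename(String oldName, String newName) {
        boolean newIn = inPolicy(newName);
        // Double rename: the old name is already waiting for bootstrap.
        if (pending.remove(oldName)) {
            if (newIn) {
                pending.add(newName);
                return Action.BOOTSTRAP_NEW_NAME;
            }
            return Action.IGNORE; // table never reached the target
        }
        boolean oldIn = inPolicy(oldName);
        if (oldIn && newIn) {
            return Action.REPLICATE_RENAME;   // scenario 1
        }
        if (!oldIn && !newIn) {
            return Action.IGNORE;             // scenario 2
        }
        if (newIn) {
            pending.add(newName);             // scenario 3
            return Action.BOOTSTRAP_NEW_NAME;
        }
        return Action.CONVERT_TO_DROP;        // scenario 4
    }

    // A drop of a table that is waiting for bootstrap cancels the bootstrap.
    void onDrop(String tableName) {
        pending.remove(tableName);
    }

    // Events on a table that will be bootstrapped anyway are skipped.
    boolean shouldSkipEvent(String tableName) {
        return pending.contains(tableName);
    }

    Set<String> tablesToBootstrap() {
        return Collections.unmodifiableSet(pending);
    }
}
```

With the pattern "in_.*", for example, renaming x to in_c would put in_c on the bootstrap list, and a later rename of in_c to in_d would replace it with in_d.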
was:
* REPL DUMP takes 2 inputs along with existing FROM and WITH clause.
{code:java}
- REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] FROM
<last_repl_id> WITH <key_values_list>;
- current_repl_policy and previous_repl_policy can be any format mentioned in
Point-4.
- REPLACE clause to be supported to take previous repl policy as input. If
REPLACE clause is not there, then the policy remains unchanged.
- Rest of the format remains same.{code}
* Now, REPL DUMP on this DB will replicate the tables based on
current_repl_policy.
* Currently, single table replication of the format <db_name>.t1 is not
supported for table level replication. So it will not be supported in the
REPLACE clause either.
* Any table that is added dynamically, either due to a change in the regular
expression or by being added to the include list, should be bootstrapped using
an independent table level replication policy.
{code:java}
- Hive will automatically figure out the list of tables newly included in the
list by comparing the current_repl_policy & previous_repl_policy inputs and
combine bootstrap dump for added tables as part of incremental dump.
"_bootstrap" directory can be created in dump dir to accommodate all tables to
be bootstrapped.
- If any table is renamed, then it may get dynamically added/removed for
replication based on the defined replication policy + include/exclude list.
So, Hive will perform bootstrap for a table which is newly included after a
rename.
- Tables added after the previous policy run and before the replace policy will
be replicated using bootstrap if the table name satisfies inclusion in both
policies. The events generated for those tables will be ignored while dumping
the events.{code}
* REPL LOAD should check for changes in the REPL policy and drop the
tables/views excluded in the new policy compared to the previous policy. This
should be done before performing incremental and bootstrap load from the
current dump. Both policies will be stored in the _bootstrap directory and will
be used during REPL LOAD to drop the redundant tables.
* REPL LOAD on an incremental dump should load the event directories first and
then check for the "_bootstrap" directory and perform bootstrap load on it.
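The comparison between the previous and current policies described above could be sketched as follows. This is a minimal illustration, assuming each policy is reduced to a single table name regex; the class ReplPolicyDiff and its methods are hypothetical names, not Hive APIs.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: not part of Hive. Each policy is assumed to be a
// single regex over table names, without include/exclude lists.
class ReplPolicyDiff {
    // Tables matched by the current policy but not by the previous one;
    // these would be bootstrapped as part of the incremental dump.
    static List<String> newlyIncluded(Collection<String> tables,
                                      Pattern previous, Pattern current) {
        List<String> result = new ArrayList<>();
        for (String table : tables) {
            if (current.matcher(table).matches()
                    && !previous.matcher(table).matches()) {
                result.add(table);
            }
        }
        return result;
    }

    // Tables matched by the previous policy but not by the current one;
    // these would be dropped at the target during REPL LOAD.
    static List<String> newlyExcluded(Collection<String> tables,
                                      Pattern previous, Pattern current) {
        // Same comparison with the roles of the two policies swapped.
        return newlyIncluded(tables, current, previous);
    }
}
```

For tables t1, t2, and s1 with a previous pattern of "t.*" and a current pattern of "(t1|s.*)", s1 would be bootstrapped and t2 would be dropped at the target, while t1 continues with incremental replication.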
Rename table is not in scope of this Jira.
> REPL DUMP should detect and bootstrap any rename table events where old table
> was excluded but renamed table is included.
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-21764
> URL: https://issues.apache.org/jira/browse/HIVE-21764
> Project: Hive
> Issue Type: Sub-task
> Components: repl
> Reporter: Sankar Hariappan
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: DR, Replication
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)