[ 
https://issues.apache.org/jira/browse/HIVE-21763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21763:
------------------------------------
    Status: Patch Available  (was: Open)

> Incremental replication to allow changing include/exclude tables list in 
> replication policy.
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21763
>                 URL: https://issues.apache.org/jira/browse/HIVE-21763
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: DR, Replication, pull-request-available
>         Attachments: HIVE-21763.01.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> - REPL DUMP takes two policy inputs along with the existing FROM and WITH 
> clauses.
> {code}
> - REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] FROM 
> <last_repl_id> WITH <key_values_list>;
> - current_repl_policy and previous_repl_policy can be any format mentioned in 
> Point-4.
> - The REPLACE clause takes the previous repl policy as input. If the 
> REPLACE clause is absent, the policy is treated as unchanged.
> - The rest of the format remains the same.
> {code}
> - Now, REPL DUMP on this DB will replicate the tables based on 
> current_repl_policy.
> - Single table replication of format <db_name>.t1 doesn’t allow changing the 
> policy dynamically. So the REPLACE clause is not allowed if 
> previous_repl_policy is of this format.
> - Any table that is added dynamically, either due to a change in the regular 
> expression or by being added to the include list, should be bootstrapped 
> using an independent table-level replication policy.
> {code}
> - Hive will automatically figure out the list of tables newly included in the 
> list by comparing the current_repl_policy & previous_repl_policy inputs and 
> combine bootstrap dump for added tables as part of incremental dump. 
> "_bootstrap" directory can be created in dump dir to accommodate all tables 
> to be bootstrapped.
> - If any table is renamed, then it may get dynamically added/removed for 
> replication based on the defined replication policy + include/exclude list. 
> So, Hive will perform bootstrap for a table that is newly included after a 
> rename.
> {code}
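
The comparison of current_repl_policy and previous_repl_policy described above can be sketched roughly as follows. This is only an illustration, not Hive's implementation: the `ReplPolicy` class, its single include/exclude regex pair, and both helper functions are hypothetical names introduced here, and the real policy formats are the ones referred to as Point-4, which this message does not reproduce.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ReplPolicy:
    """Hypothetical policy model: one include regex and an optional exclude regex."""
    include: str
    exclude: Optional[str] = None

    def matches(self, table: str) -> bool:
        # A table is replicated if it matches the include pattern
        # and does not match the exclude pattern (if any).
        if re.fullmatch(self.include, table) is None:
            return False
        return self.exclude is None or re.fullmatch(self.exclude, table) is None


def tables_to_bootstrap(tables: Iterable[str],
                        current: ReplPolicy,
                        previous: ReplPolicy) -> List[str]:
    """Tables covered by the current policy but not by the previous one."""
    return sorted(t for t in tables
                  if current.matches(t) and not previous.matches(t))


def needs_bootstrap_after_rename(old_name: str, new_name: str,
                                 policy: ReplPolicy) -> bool:
    """A renamed table needs a bootstrap if only its new name is covered."""
    return policy.matches(new_name) and not policy.matches(old_name)


if __name__ == "__main__":
    previous = ReplPolicy(include="orders.*")
    current = ReplPolicy(include="orders.*|customers", exclude=".*_tmp")
    tables = ["orders", "orders_tmp", "customers", "audit_log"]
    print(tables_to_bootstrap(tables, current, previous))
    print(needs_bootstrap_after_rename("staging_orders", "orders_2019", current))
```

In this sketch, the tables returned by `tables_to_bootstrap` are the ones that would go under the "_bootstrap" directory of the incremental dump.
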
> - REPL LOAD should check for changes in the repl policy and drop the 
> tables/views that are excluded in the new policy compared to the previous 
> policy. This should be done before performing the incremental and bootstrap 
> load from the current dump.
> - REPL LOAD on an incremental dump should load the event directories first 
> and then check for the "_bootstrap" directory and perform bootstrap load on 
> its contents.
> Rename table is not in scope of this JIRA.
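
The ordering that REPL LOAD is expected to follow (drop excluded tables/views, then load events, then bootstrap) could be sketched like this. It is a minimal sketch under stated assumptions: `plan_repl_load` and the step names are hypothetical, and it assumes every dump entry other than "_bootstrap" is an event directory named by a numeric event id, which the description above does not specify.

```python
from typing import Iterable, List, Tuple

BOOTSTRAP_DIR = "_bootstrap"  # directory name taken from the description above


def plan_repl_load(dump_entries: Iterable[str],
                   excluded_tables: Iterable[str]) -> List[Tuple[str, str]]:
    """Return the load steps in the order the description asks for."""
    entries = list(dump_entries)

    # 1. Drop tables/views excluded by the new policy before loading anything.
    plan = [("drop", table) for table in sorted(excluded_tables)]

    # 2. Load event directories (assumed here to be numeric event ids).
    events = sorted((e for e in entries if e != BOOTSTRAP_DIR), key=int)
    plan += [("load_event", event) for event in events]

    # 3. Only then bootstrap the newly included tables, if any were dumped.
    if BOOTSTRAP_DIR in entries:
        plan.append(("bootstrap_load", BOOTSTRAP_DIR))
    return plan


if __name__ == "__main__":
    for step in plan_repl_load(["12", "_bootstrap", "9"], {"t_old"}):
        print(step)
```
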



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
