[
https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
anishek reassigned HIVE-20911:
------------------------------
Assignee: anishek
> External Table Replication for Hive
> -----------------------------------
>
> Key: HIVE-20911
> URL: https://issues.apache.org/jira/browse/HIVE-20911
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 4.0.0
> Reporter: anishek
> Assignee: anishek
> Priority: Critical
> Fix For: 4.0.0
>
>
> External tables are currently not replicated as part of Hive replication. As
> part of this JIRA we want to enable that.
> Approach:
> * The target cluster will have a top-level base directory config under which
> all data relevant to external tables is copied. This will be provided via
> the *with* clause in the *repl load* command. This base path will be prefixed
> to the path of the same external table on the source cluster.
> * Changes to an external table's directories can happen without Hive knowing
> about them, so we can't capture the relevant events whenever data is
> added or removed. We will therefore have to copy the data from the source path
> to the target path for external tables every time we run incremental replication.
> ** This will require incremental *repl dump* to create an additional
> file *\_external\_tables\_info* with data in the following form:
> {code}
> OperationType,tableName,base64Encoded(tableDataLocation)
> {code}
> where OperationType is one of (ADD, REMOVE).
> ** *repl load* will look up all the external tables on the target and remove
> the tables listed with the REMOVE type in the above file.
> ** For the remaining tables it will create tasks to copy the corresponding paths
> from source to target, along with the existing tasks for incremental load.
> * New external tables will be created, with their data copied, as part of the
> regular tasks during incremental load, applying the base directory prefix.
> * Bootstrap will also create / copy these external tables as part of its
> regular workflow, applying the base directory prefix.
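
The base directory prefixing described above can be sketched as follows. This is only an illustration in Python, not Hive's Java implementation. The function name `target_location` and the example paths are hypothetical, and dropping the source scheme and authority before prefixing is an assumption, since the description only says the base path is prefixed to the source path.

```python
import posixpath
from urllib.parse import urlparse


def target_location(base_dir: str, source_location: str) -> str:
    """Prefix base_dir to the path component of an external table location.

    Assumption: only the path part of the source location is kept, so the
    source cluster's scheme and namenode address do not leak into the target.
    """
    source_path = urlparse(source_location).path
    return posixpath.join(base_dir.rstrip("/"), source_path.lstrip("/"))


# Hypothetical base directory and source table location.
print(target_location("/replica/external", "hdfs://nn1:8020/data/sales/orders"))
```

Keeping the full source path under the base directory means two external tables with different source locations cannot collide on the target.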
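
The *\_external\_tables\_info* line format and the way *repl load* might consume it can be sketched like this. Again this is a hedged Python illustration rather than the actual implementation. The helper names (`encode_line`, `decode_line`, `plan_load`) are hypothetical, and letting a later entry for a table override an earlier one is an assumption not stated in the description.

```python
import base64
from typing import Dict, Iterable, List, Tuple

VALID_OPERATIONS = ("ADD", "REMOVE")


def encode_line(operation: str, table_name: str, location: str) -> str:
    """Build one line: OperationType,tableName,base64Encoded(tableDataLocation)."""
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"unknown operation type: {operation}")
    encoded = base64.b64encode(location.encode("utf-8")).decode("ascii")
    return f"{operation},{table_name},{encoded}"


def decode_line(line: str) -> Tuple[str, str, str]:
    """Parse one line back into (operation, table name, data location)."""
    # The standard base64 alphabet has no comma, so splitting on it is safe.
    operation, table_name, encoded = line.strip().split(",", 2)
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"unknown operation type: {operation}")
    return operation, table_name, base64.b64decode(encoded).decode("utf-8")


def plan_load(lines: Iterable[str]) -> Tuple[List[str], Dict[str, str]]:
    """Return (tables to remove, {table: source location to copy})."""
    to_remove: List[str] = []
    to_copy: Dict[str, str] = {}
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        operation, table, location = decode_line(line)
        if operation == "REMOVE":
            to_remove.append(table)
            to_copy.pop(table, None)  # assumption: later entries win
        else:
            to_copy[table] = location
    return to_remove, to_copy
```

Base64-encoding the location keeps commas or other special characters in a path from breaking the comma-separated layout.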