[ 
https://issues.apache.org/jira/browse/HIVE-20968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-20968:
---------------------------------------
    Attachment: HIVE-20968.01.patch

> Support conversion of managed to external where location set was not owned by 
> hive
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-20968
>                 URL: https://issues.apache.org/jira/browse/HIVE-20968
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: DR, pull-request-available
>         Attachments: HIVE-20968.01.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> As per the migration rule, if a table location is outside the default
> managed table directory and is not owned by the "hive" user, the table
> should be converted to an external table after upgrade.
>  The same rule applies to Hive replication when the data of a source managed
> table resides outside the default warehouse directory and is not owned by
> the "hive" user.
>  During this conversion, the path should be preserved on the target as well,
> so that failover works seamlessly.
>  # If the table location is outside the Hive warehouse and is not owned by
> hive, the table at the target will be converted to an external table. The
> location cannot be retained as is; it will be retained relative to the Hive
> external warehouse directory.
>  # As the table is not an external table at the source, only data added
> through events will be replicated.
>  # The ownership of the location will be stored in the create table event
> and compared with strict.managed.tables.migration.owner to decide whether
> the flag in the replication scope can be set. This flag is used to convert
> the managed table to an external table at the target.
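
The conversion rule and the path handling described above can be sketched roughly as follows. This is only an illustration in Python, not the patch itself: TableInfo, should_convert_to_external and target_location are hypothetical names, the warehouse paths are made up, and re-rooting the full source path under the external warehouse directory is one assumed reading of "retained relative to".

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical view of what a create table event would carry."""
    location: str  # absolute path of the table data at the source
    owner: str     # owner of that location at the source


def should_convert_to_external(table, managed_warehouse, migration_owner="hive"):
    """Return True when the location is outside the managed warehouse
    directory and is not owned by the configured migration owner."""
    loc = PurePosixPath(table.location)
    inside = PurePosixPath(managed_warehouse) in (loc, *loc.parents)
    return (not inside) and table.owner != migration_owner


def target_location(table, external_warehouse):
    """Assumed interpretation: keep the source path, re-rooted under the
    external warehouse directory of the target."""
    relative = PurePosixPath(table.location).relative_to("/")
    return str(PurePosixPath(external_warehouse) / relative)
```

Here migration_owner stands in for the value of strict.managed.tables.migration.owner. A table at /data/sales/orders owned by a non-hive user would be flagged for conversion and, under the assumption above, placed below the external warehouse root on the target.
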
> Some scenarios need to be blocked if the database is set up for replication
> from a cluster without the strict managed table setting to a cluster with
> strict managed tables.
> 1. Block alter table / partition set location for managed tables in a
> database that is set as a source of replication.
> 2. If a user manually changes the ownership of the location, Hive
> replication may go into a non-recoverable state.
> 3. Block add partition for managed tables if the ownership of the partition
> location differs from that of the table location.
> 4. The user needs to set strict.managed.tables.migration.owner along with
> the dump command (defaults to the hive user). This value is used during dump
> to decide the ownership, which in turn is used during load to decide the
> table type. The location owner information can be stored in the events
> during create table. The flag can be stored in the replication spec. Check
> other such configs used in the upgrade tool.
> 5. The replication flow also sets the additional parameter
> "external.table.purge"="true", only for migration to an external table.
> 6. Block conversion from managed to external and vice versa. Pass a flag in
> the upgrade flow to allow this conversion during upgrade.
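
The blocking rules in items 1, 3 and 6 could be modelled as a single check along these lines. Again this is a hedged Python sketch, not Hive code: Op, is_blocked and all parameter names are hypothetical, and it assumes every rule applies only to databases marked as a source of replication.

```python
from enum import Enum, auto


class Op(Enum):
    SET_LOCATION = auto()        # alter table / partition set location
    ADD_PARTITION = auto()
    CONVERT_TABLE_TYPE = auto()  # managed to external or the reverse


def is_blocked(op, *, repl_source_db, managed_table,
               table_loc_owner=None, partition_loc_owner=None,
               upgrade_flow=False):
    """Return True if the operation should be rejected."""
    if not repl_source_db:
        # Assumption: the rules only concern replication source databases.
        return False
    if op is Op.SET_LOCATION:
        # Item 1: no location changes on managed tables.
        return managed_table
    if op is Op.ADD_PARTITION:
        # Item 3: partition location owner must match the table location owner.
        return managed_table and partition_loc_owner != table_loc_owner
    if op is Op.CONVERT_TABLE_TYPE:
        # Item 6: only the upgrade flow may change the table type.
        return not upgrade_flow
    return False
```

Item 2 is not covered, since a manual ownership change happens on the file system and cannot be intercepted by a check of this kind.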



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
