Pravin Sinha created HIVE-24187:
-----------------------------------

             Summary: Handle _files creation for HA config with same 
nameservice on source and destination
                 Key: HIVE-24187
                 URL: https://issues.apache.org/jira/browse/HIVE-24187
             Project: Hive
          Issue Type: Improvement
            Reporter: Pravin Sinha
            Assignee: Pravin Sinha


Currently, HA is supported only when the Source and Destination use different 
nameservices. We need to add support for the same nameservice on Source and 
Destination.
The local nameservice will be passed to the repl command as is.
The remote nameservice will be referred to by an arbitrary name, with the 
corresponding configs defined for that name.

Example:
Both clusters are originally configured with the same nameservice for HDFS:
src: ns1
target : ns1

We can refer to the remote nameservice by an arbitrary name, for example nsRemote. 
This is how each command will see the nameservices of source and target:

Repl Dump : src: ns1, target: nsRemote
Repl Load: src: nsRemote, target: ns1

Entries in _files (for managed table data locations) will be written with 
nsRemote instead of ns1 (for src).
Example: 
hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot

In the same way, the list of external table data locations will be modified to 
use nsRemote instead of ns1 (for src).
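
The rewrite described above can be illustrated with a minimal Python sketch. It is not Hive's implementation: the function name replace_nameservice and the regex-based approach are assumptions made for illustration, and the sample entry is the one from this description with ns1 as the original nameservice.

```python
import re


def replace_nameservice(text, local_ns, remote_ns):
    """Rewrite every hdfs://<local_ns> authority in text to hdfs://<remote_ns>.

    Hypothetical helper, not part of Hive. The lookahead keeps a nameservice
    such as "ns10" from being matched when local_ns is "ns1".
    """
    pattern = re.compile(r"hdfs://" + re.escape(local_ns) + r"(?=/|$)")
    return pattern.sub("hdfs://" + remote_ns, text)


# A _files entry as it would look with the source's own nameservice.
entry = "hdfs://ns1/whLoc/dbName.db/table1:checksum:subDir:hdfs://ns1/cmroot"

print(replace_nameservice(entry, "ns1", "nsRemote"))
# hdfs://nsRemote/whLoc/dbName.db/table1:checksum:subDir:hdfs://nsRemote/cmroot
```

The same helper applies unchanged to a single external table data location, since it only touches the hdfs://<nameservice> prefix.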

New configs can control the behavior:
*hive.repl.ha.datapath.replace.remote.nameservice = <boolean>*
*hive.repl.ha.datapath.replace.remote.nameservice.name = <string>*

Nameservice replacement is performed based on these configs.
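
As a rough sketch of how these two configs could gate the replacement, the Python snippet below models the configuration as a plain dict. The config keys are the ones proposed above; the function name, the default of "false", and the error raised when the name is missing are assumptions for illustration, not Hive behavior.

```python
from urllib.parse import urlsplit, urlunsplit

REPLACE_KEY = "hive.repl.ha.datapath.replace.remote.nameservice"
NAME_KEY = "hive.repl.ha.datapath.replace.remote.nameservice.name"


def maybe_replace_nameservice(path, conf):
    """Return path with its nameservice swapped if the configs ask for it.

    Hypothetical helper: conf is a dict of string config values.
    """
    # Assumed default: no replacement unless the boolean config is "true".
    if conf.get(REPLACE_KEY, "false").lower() != "true":
        return path
    remote_ns = conf.get(NAME_KEY)
    if not remote_ns:
        raise ValueError(NAME_KEY + " must be set when " + REPLACE_KEY + " is true")
    parts = urlsplit(path)
    if parts.scheme != "hdfs":
        return path  # leave non-HDFS locations untouched
    return urlunsplit(parts._replace(netloc=remote_ns))


conf = {REPLACE_KEY: "true", NAME_KEY: "nsRemote"}
print(maybe_replace_nameservice("hdfs://ns1/whLoc/dbName.db/table1", conf))
# hdfs://nsRemote/whLoc/dbName.db/table1
```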

This will also require that 'hive.repl.rootdir' is set accordingly during 
dump and load:
||Repl Operation||Repl Command||
|*Staging on source cluster*|
|Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
|Repl Load|repl load dbName into dbName with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
|*Staging on target cluster*|
|Repl Dump|repl dump dbName with('hive.repl.rootdir'='hdfs://nsRemote/stagingLoc')|
|Repl Load|repl load dbName into dbName with('hive.repl.rootdir'='hdfs://ns1/stagingLoc')|
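
The pattern in the table can be summarized as: a command uses the local nameservice (ns1) when the staging directory is on the cluster where that command runs, and the remote name (nsRemote) otherwise. The Python sketch below encodes that reading. The function name and parameters are hypothetical, and it assumes that repl dump runs on the source cluster and repl load on the target cluster.

```python
def repl_rootdir(operation, staging_cluster,
                 local_ns="ns1", remote_ns="nsRemote",
                 staging_path="/stagingLoc"):
    """Return the 'hive.repl.rootdir' value for a repl command.

    operation: "dump" or "load"; staging_cluster: "source" or "target".
    Hypothetical helper illustrating the table, not Hive code.
    """
    # Assumption: dump is issued on the source, load on the target.
    running_on = {"dump": "source", "load": "target"}[operation]
    ns = local_ns if staging_cluster == running_on else remote_ns
    return "hdfs://" + ns + staging_path


for staging in ("source", "target"):
    for op in ("dump", "load"):
        print(staging, op, repl_rootdir(op, staging))
```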



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
