[ 
https://issues.apache.org/jira/browse/HIVE-16859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-16859:
---------------------------
    Description: 
Currently, for Hive replication, the CM root URI is configured via 
"hive.repl.cmrootdir". This configuration needs to have the same value on both 
the primary and the replica Hive warehouse. 

The CM URI should be encoded so that the CM root of the source is part of 
the URI itself. The cmfs URIs should then have the following form:

{code}
cmfs:hdfs://[authority]/[actual_location]#[checksum_of_file]_[encoded_cm_root_on_primary]
{code}

This lets any target replica warehouse detect the location of the source CM 
root. Since the filesystem configurations can differ between the primary and 
the replica warehouse, additional configuration might be required to create 
{{FileSystem}} objects that talk to the respective filesystems. If we want to 
support that, we can add a configuration on the replica warehouse stating the 
primary CM root location, along with the other filesystem-related 
configurations, and in that case this bug might be irrelevant.
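
As a rough illustration of the proposed format, the sketch below builds and 
parses a cmfs URI of the shape shown above. It is not taken from the Hive code 
base: the class name {{CmUriSketch}}, the methods {{encode}} and {{decode}}, 
the use of URL-safe Base64 for [encoded_cm_root_on_primary], and the assumption 
that the checksum is a hex string without underscores are all hypothetical 
choices made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Hive: illustrates one possible reading of
// cmfs:<file uri>#<checksum>_<encoded cm root on primary>.
final class CmUriSketch {
    private static final String PREFIX = "cmfs:";

    private CmUriSketch() {
    }

    // Assumption: the CM root is encoded with URL-safe Base64 so that it
    // cannot contain '#' or '/' and can sit safely in the fragment.
    static String encode(String fileUri, String checksum, String cmRoot) {
        String encodedRoot = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(cmRoot.getBytes(StandardCharsets.UTF_8));
        return PREFIX + fileUri + "#" + checksum + "_" + encodedRoot;
    }

    // Returns {file uri, checksum, cm root on primary}.
    static String[] decode(String cmUri) {
        if (!cmUri.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a cmfs uri: " + cmUri);
        }
        // The fragment starts at the last '#'; the encoded root never contains one.
        int hash = cmUri.lastIndexOf('#');
        if (hash < 0) {
            throw new IllegalArgumentException("missing fragment: " + cmUri);
        }
        String fileUri = cmUri.substring(PREFIX.length(), hash);
        String fragment = cmUri.substring(hash + 1);
        // Assumption: the checksum has no '_', so the first '_' is the separator.
        int sep = fragment.indexOf('_');
        if (sep < 0) {
            throw new IllegalArgumentException("missing cm root: " + cmUri);
        }
        String checksum = fragment.substring(0, sep);
        String cmRoot = new String(
                Base64.getUrlDecoder().decode(fragment.substring(sep + 1)),
                StandardCharsets.UTF_8);
        return new String[] {fileUri, checksum, cmRoot};
    }
}
```

With this layout a replica could call {{CmUriSketch.decode}} on an incoming 
URI and learn the primary CM root without reading "hive.repl.cmrootdir" from 
the primary. It would still need its own filesystem configuration to reach 
that location.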


  was:
Currently for hive replication, the cm root uri is configured via 
"hive.repl.cmrootdir". This configuration needs to have the same value on both 
the primary and replica hive warehouse. 

CM uri should be encoded such that the cm root of the source should be part of 
the URI itself. so the cmfs uri's should be following

{code
}cmfs:hdfs://[authority]/[actual_location]#[checksum_of_file]_[encoded_cm_root_on_primary]
{code}

so that we can detect what is the root location of the source cm root at any 
target replica warehouse. Since the filesystem configurations can be different 
for the  primary and replica warehouse there might be additional configurations 
will be required to create {{FileSystem}} objects to talk to respective 
filesystems. if we want to support that we can add an additional configuration 
stating the primary cm root location on the replica warehouse along with other 
fs related configurations and in that case this bug might be irrelevant.



> CM uri encoding
> ---------------
>
>                 Key: HIVE-16859
>                 URL: https://issues.apache.org/jira/browse/HIVE-16859
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: anishek
>            Assignee: anishek
>
> Currently, for Hive replication, the CM root URI is configured via 
> "hive.repl.cmrootdir". This configuration needs to have the same value on 
> both the primary and the replica Hive warehouse. 
> The CM URI should be encoded so that the CM root of the source is part 
> of the URI itself. The cmfs URIs should then have the following form:
> {code}
> cmfs:hdfs://[authority]/[actual_location]#[checksum_of_file]_[encoded_cm_root_on_primary]
> {code}
> This lets any target replica warehouse detect the location of the source CM 
> root. Since the filesystem configurations can differ between the primary and 
> the replica warehouse, additional configuration might be required to create 
> {{FileSystem}} objects that talk to the respective filesystems. If we want 
> to support that, we can add a configuration on the replica warehouse stating 
> the primary CM root location, along with the other filesystem-related 
> configurations, and in that case this bug might be irrelevant.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
