[jira] [Commented] (HIVE-24852) Add support for Snapshots during external table replication

2021-11-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436773#comment-17436773
 ] 

Ayush Saxena commented on HIVE-24852:
-

Hey [~ste...@apache.org], quite surprised to see you here. :)
{quote}1. Does this downgrade properly when the destination FS is not hdfs?
{quote}
This feature isn't enabled by default; you need to enable it via a config. So 
if the target doesn't support snapshots and you enable the use of snapshots 
for external tables, the replication will fail with a non-recoverable error. 
The admin then has to disable the config and restart the replication, and 
everything works normally after that.

 

If I understand your question correctly, you mean that the replication flow 
was earlier Cluster-A->Cluster-B, both being HDFS (on-prem to on-prem 
replication), and Cluster-B later migrates its FileSystem from HDFS to some 
other FS which doesn't support snapshots. Does this work or not?

--> In that case you can simply turn off the use of snapshots for replication, 
and it will start using the normal mode of replication. We will clean up the 
snapshots and start doing a normal distcp. AFAIK there isn't any limitation on 
the DistCp side that prevents a directory that was copied using -diff from 
being copied again using a normal distcp -update -delete.
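
For illustration, the switch between the two copy modes described above could be sketched roughly as below. This is only a sketch under assumptions: the class {{DistCpArgs}}, its {{build}} method and the boolean flag are hypothetical names and not part of Hive or Hadoop. Only the DistCp options {{-update}}, {{-diff}} and {{-delete}} come from the discussion.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Hive or Hadoop. */
final class DistCpArgs {
    private DistCpArgs() {}

    /**
     * Builds the DistCp option list for one copy of an external table directory.
     * With useSnapshots set, only the changes between two snapshots are copied;
     * otherwise a plain incremental copy that also removes stale files is used.
     */
    static List<String> build(boolean useSnapshots, String fromSnapshot,
                              String toSnapshot, String source, String target) {
        List<String> args = new ArrayList<>();
        args.add("-update");
        if (useSnapshots) {
            // Snapshot-diff mode: -diff is used together with -update.
            args.add("-diff");
            args.add(fromSnapshot);
            args.add(toSnapshot);
        } else {
            // Normal mode after snapshots are turned off.
            args.add("-delete");
        }
        args.add(source);
        args.add(target);
        return args;
    }
}
```

Turning the (assumed) flag off only changes which options are passed; the source and target paths stay the same, which matches the point that a directory copied with -diff can later be copied with -update -delete.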

 
{quote}2. has anyone discussed with the HDFS team the possibility of providing 
an interface in hadoop-common for this?
{quote}
At least I haven't done that. Maybe [~aasha] or [~anishek] can throw some 
light on that.

But just out of curiosity, how could we manage this through {{Hadoop-Common}}? 
Something like adding a copy method in {{FileUtils}}? We would still have to 
do the creation & deletion of snapshots ourselves, right? During dump we create 
snapshots on the source cluster, and then post copy we delete & recreate the 
snapshots on the target cluster, and so on. Operations on the source cluster 
are done as part of the DUMP policy running on the {{Source}} cluster, and 
operations on the {{Target}} cluster are done as part of the LOAD policy 
running on the target cluster, both running independently on different 
clusters at different times (_in a synchronised manner_). So what can we 
extract here to {{Hadoop-Common}}? Can you share some pointers? I can give it 
a try.
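
To make the question concrete, the snapshot steps listed above can be written against a small abstraction, as in the sketch below. Everything here is an assumption for illustration: {{SnapshotOps}}, {{RecordingSnapshotOps}} and {{ReplicationSteps}} are hypothetical names, nothing like them is confirmed to exist in hadoop-common, and the real flow is likely more involved than a single snapshot name per directory.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical abstraction over snapshot calls; not an existing Hadoop interface. */
interface SnapshotOps {
    void createSnapshot(String dir, String name);
    void deleteSnapshot(String dir, String name);
}

/** Records calls instead of touching a filesystem, for illustration only. */
final class RecordingSnapshotOps implements SnapshotOps {
    final List<String> calls = new ArrayList<>();

    @Override
    public void createSnapshot(String dir, String name) {
        calls.add("create " + dir + " " + name);
    }

    @Override
    public void deleteSnapshot(String dir, String name) {
        calls.add("delete " + dir + " " + name);
    }
}

final class ReplicationSteps {
    private ReplicationSteps() {}

    /** DUMP side, on the source cluster: snapshot the directory before the copy. */
    static void dump(SnapshotOps source, String dir, String snapshotName) {
        source.createSnapshot(dir, snapshotName);
    }

    /** LOAD side, on the target cluster: after the copy, delete and recreate the snapshot. */
    static void load(SnapshotOps target, String dir, String snapshotName) {
        target.deleteSnapshot(dir, snapshotName);
        target.createSnapshot(dir, snapshotName);
    }
}
```

In this sketch the two sides never call each other, which mirrors DUMP and LOAD running independently on different clusters; only the narrow create/delete surface would be a candidate for sharing.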

 

> Add support for Snapshots during external table replication
> ---
>
> Key: HIVE-24852
> URL: https://issues.apache.org/jira/browse/HIVE-24852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Design Doc HDFS Snapshots for External Table 
> Replication-01.pdf, Design Doc HDFS Snapshots for External Table 
> Replication-02.pdf
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Add support for use of snapshot diff for external table replication.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24852) Add support for Snapshots during external table replication

2021-11-01 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436757#comment-17436757
 ] 

Steve Loughran commented on HIVE-24852:
---

# Does this downgrade properly when the destination FS is not hdfs?
# has anyone discussed with the HDFS team the possibility of providing an 
interface in hadoop-common for this?








[jira] [Commented] (HIVE-24852) Add support for Snapshots during external table replication

2021-05-18 Thread Aasha Medhi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347034#comment-17347034
 ] 

Aasha Medhi commented on HIVE-24852:


+1 Committed to master. Thank you for the patch [~ayushtkn]

> Add support for Snapshots during external table replication
> ---
>
> Key: HIVE-24852
> URL: https://issues.apache.org/jira/browse/HIVE-24852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Design Doc HDFS Snapshots for External Table 
> Replication-01.pdf
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Add support for use of snapshot diff for external table replication.


