[ 
https://issues.apache.org/jira/browse/HAWQ-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085217#comment-15085217
 ] 

Lei Chang commented on HAWQ-311:
--------------------------------

Good thought.

I think there are a lot of cases that need to be considered. For example, the 
two clusters may share one HDFS instance, or each cluster may use its own 
HDFS cluster. Different cases might need different design options, which 
might have significant performance differences.

For reference, gpdb gptransfer 
(http://gpdb.docs.pivotal.io/4360/install_guide/refs/gptransfer.html) is a 
similar tool that transfers data between gpdb clusters. But I think the 
implementation in HAWQ can be very different, due to differences in the 
storage layer, external tables/PXF, and the HAWQ feature for directly 
dropping and loading HDFS data files into native HAWQ tables (HAWQ-306).

A high-level design doc with more details would be very helpful for further 
discussion, covering, for example, the user interface and the possible design 
options.

> Data Transfer tool
> ------------------
>
>                 Key: HAWQ-311
>                 URL: https://issues.apache.org/jira/browse/HAWQ-311
>             Project: Apache HAWQ
>          Issue Type: New Feature
>          Components: Storage
>            Reporter: Lei Chang
>            Assignee: NILESH MANKAR
>             Fix For: 2.1.0
>
>
> Some users asked for a tool to transfer data between HAWQ clusters. It is quite 
> useful for data migration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
