[
https://issues.apache.org/jira/browse/HAWQ-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15890537#comment-15890537
]
Kyle R Dunn commented on HAWQ-760:
----------------------------------
[~lilima] - How does hawq register handle registering files from different
versions where the catalog could have changed? I.e., what would happen if I
try to register tables from HAWQ 1.x into HAWQ 2.x?
> Hawq register
> -------------
>
> Key: HAWQ-760
> URL: https://issues.apache.org/jira/browse/HAWQ-760
> Project: Apache HAWQ
> Issue Type: New Feature
> Components: Command Line Tools
> Reporter: Yangcheng Luo
> Assignee: Lili Ma
> Fix For: backlog
>
>
> Scenarios:
> 1. Register a Parquet file generated by another system, such as Hive or
> Spark.
> 2. Cluster disaster recovery. Two clusters coexist, and data is periodically
> imported from Cluster A to Cluster B. The imported data needs to be
> registered to Cluster B.
> 3. Table rollback. Checkpoints are taken at some point, and the table needs
> to be rolled back to a previous checkpoint.
> Usage 1
> Description
> Register a file or folder to an existing table. When registering a file, the
> eof of the file can be specified; if eof is not specified, the actual file
> size is used. When registering a folder, the actual file size is always used.
> hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-f filepath] [-e eof] <tablename>
> Usage 2
> Description
> Register according to a .yml configuration file.
> hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-c config] [--force] [--repair] <tablename>
> Behavior:
> 1. If the table doesn't exist, hawq register automatically creates the table
> and registers the files listed in the .yml configuration file. The file sizes
> specified in the .yml file are used to update the catalog table.
> 2. If the table already exists and neither --force nor --repair is specified,
> no table is created and the files listed in the .yml file are registered
> directly to the table. Note that if a file is under the table directory in
> HDFS, an error is thrown stating that to-be-registered files should not be
> under the table path.
> 3. If the table already exists and --force is specified, all catalog contents
> in pg_aoseg.pg_paqseg_$relid are cleared while the files on HDFS are kept,
> and then all the files are re-registered to the table. This is for scenario 2.
> 4. If the table already exists and --repair is specified, both the file
> folder and the catalog table pg_aoseg.pg_paqseg_$relid are changed to the
> state that the .yml file describes. Note that files generated since the
> checkpoint may be deleted here. Also note that all the files in the .yml file
> must be under the table folder on HDFS. Limitation: hash table
> redistribution, table truncate, and table drop are not supported. This is for
> scenario 3.
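
The four Usage 2 behaviors above can be read as a small decision table. What follows is a minimal sketch of that reading, not the actual hawq register implementation. The function name `plan_registration` and the `Action` labels are hypothetical. Two points are assumptions that the description does not state: that --force and --repair are ignored when the table does not exist, and that passing both flags together is rejected.

```python
from enum import Enum


class Action(Enum):
    # Labels are illustrative only; they paraphrase behaviors 1-4 above.
    CREATE_AND_REGISTER = 1  # behavior 1: create the table, register .yml files
    REGISTER_ONLY = 2        # behavior 2: register .yml files to existing table
    FORCE = 3                # behavior 3: clear catalog, keep HDFS files, re-register
    REPAIR = 4               # behavior 4: restore folder and catalog to .yml state


def plan_registration(table_exists: bool, force: bool = False,
                      repair: bool = False) -> Action:
    """Pick the Usage 2 behavior for a given table state and flag combination."""
    if force and repair:
        # Assumption: the description does not say what happens with both flags.
        raise ValueError("--force and --repair are treated as mutually exclusive")
    if not table_exists:
        # Assumption: flags are ignored when the table does not exist yet.
        return Action.CREATE_AND_REGISTER
    if force:
        return Action.FORCE
    if repair:
        return Action.REPAIR
    return Action.REGISTER_ONLY
```

For example, `plan_registration(True, force=True)` selects the disaster-recovery path (scenario 2), and `plan_registration(True, repair=True)` selects the rollback path (scenario 3).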
> Requirements for both cases:
> 1. The path of the file to be registered must be in the same HDFS cluster as
> HAWQ.
> 2. If the target is a hash table, the number of registered files must be
> equal to or a multiple of the hash table bucket number.
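
Requirement 2 amounts to a divisibility check on the file count. The helper below is a sketch under that interpretation only. The name `file_count_matches_buckets` is hypothetical, and rejecting an empty file list is an assumption, since the description does not cover that case.

```python
def file_count_matches_buckets(num_files: int, bucket_num: int) -> bool:
    """Return True if num_files is the bucket number or a multiple of it."""
    if bucket_num <= 0:
        raise ValueError("bucket_num must be positive")
    # Assumption: zero files is rejected rather than counted as 0 x bucket_num.
    return num_files > 0 and num_files % bucket_num == 0
```

With a bucket number of 6, registering 6 or 12 files would pass this check, while 7 files would not.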
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)