I think this is a very useful feature for backup/restore, disaster recovery, and some other scenarios.
From the usage side, "hawq register" follows the typical "hawq command" design pattern: that is, "hawq action object". But for "hawq register", there is no "object" here.

---------------------------
hawq extract -o t1.yml t1;
hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml;
---------------------------

Cheers
Lei

On Mon, Aug 15, 2016 at 3:25 PM, Hong Wu <[email protected]> wrote:

> Hi HAWQ developers,
>
> This thread is meant to confirm the option usage of hawq register.
>
> So far there are two scenarios in which users would use the hawq
> register tool.
>
> - I. Register external parquet data into HAWQ. For example, users want to
> migrate parquet tables from Hive to HAWQ as quickly as possible. In this
> case, only the parquet format is supported and the original parquet files
> in Hive are moved.
>
> - II. Users should be able to use hawq register to register table files
> into a new HAWQ cluster. From the users' perspective, this is a way of
> protecting against corruption. Users use the last-known-good metadata to
> update the portion of the catalog managing HDFS blocks. The table files or
> directory should be backed up (for example, using distcp) into the same
> path in the new HDFS setting. In this case, both AO and Parquet formats
> are supported.
>
> Considering the above cases, the designed options for hawq register look
> as follows:
>
> hawq register [-h hostname] [-p port] [-U username] [-d database] [-t tablename] [-f filepath] [-c config]
>
> Note that the -h, -p, and -U options are optional. The -c option and the
> -t, -f options are mutually exclusive; they correspond to the two
> different cases above.
> Consequently, the expected usage of hawq register should be as follows:
>
> - Case I
> hadoop fs -put -f hdfs://localhost:8020/hive/original_data.paq hdfs://localhost:8020/test_data.paq;
>
> create table t1(i int) with (appendonly = true, orientation=parquet);
>
> hawq register -h localhost -p 5432 -u me -d postgres -t t1 -f hdfs://localhost:8020/test_data.paq;
>
> - Case II
> hawq extract -o t1.yml t1;
>
> hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml;
>
> Incorrect usage (in each of these cases, hawq register will print an
> error and then exit):
> hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml -t t1;
> hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml -f hdfs://localhost:8020/test_data.paq;
> hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml -t t1 -f hdfs://localhost:8020/test_data.paq;
>
> Does this design make sense? Any comments? Thanks.
>
> Best
> Hong
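
To make the mutual-exclusion rule between -c and -t/-f concrete, here is a minimal sketch in Python of how the option check might look. It is only an illustration under assumptions, not the actual hawq register code: the function names build_parser and select_mode, the default host and port, and the requirement that -t and -f appear together are all hypothetical. The synopsis spells the user option -U while the examples use -u; the sketch follows the synopsis.

```python
import argparse


def build_parser():
    # Option letters follow the synopsis in the quoted message.
    # add_help=False frees "-h" so it can mean "hostname" rather than "help".
    # The default host and port are assumptions made for this sketch.
    parser = argparse.ArgumentParser(prog="hawq register", add_help=False)
    parser.add_argument("-h", dest="hostname", default="localhost")
    parser.add_argument("-p", dest="port", type=int, default=5432)
    parser.add_argument("-U", dest="username")
    parser.add_argument("-d", dest="database")
    parser.add_argument("-t", dest="tablename")
    parser.add_argument("-f", dest="filepath")
    parser.add_argument("-c", dest="config")
    return parser


def select_mode(argv):
    """Return "config" (Case II) or "table_file" (Case I).

    Raises ValueError for the combinations the proposal lists as incorrect.
    """
    args = build_parser().parse_args(argv)
    if args.config is not None:
        # Case II: -c must not be mixed with -t or -f.
        if args.tablename is not None or args.filepath is not None:
            raise ValueError("-c cannot be combined with -t or -f")
        return "config"
    # Case I: assumed here to need both the table name and the file path.
    if args.tablename is not None and args.filepath is not None:
        return "table_file"
    raise ValueError("specify either -c, or both -t and -f")
```

With this sketch, select_mode(["-d", "postgres", "-c", "t1.yml"]) selects Case II, while adding "-t", "t1" to the same argument list raises ValueError, which matches the "print an error and then exit" behavior described in the proposal.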
