@Lei, since the current hawq register supports two cases (specifying a table name and file path, or specifying a .yml file), we proposed this usage.
For the first case, we can follow the pattern of "hawq command" and change the tablename to an object, like this:

*hawq register [-h hostname] [-p port] [-U username] [-d database] [-f filepath] [-c config] <tablename>*

But for the second case, because the table name is inside the .yml file, if we change tablename to an object and mark it as a required field, it duplicates the name inside the configuration file. And what should we do about name conflicts? Could you suggest a better way of defining the usage? Thanks :)

Thanks,
Lili

On Tue, Aug 16, 2016 at 8:23 AM, Lei Chang <[email protected]> wrote:

> I think this is a very useful feature for backup/restore, disaster
> recovery, and some other scenarios.
>
> From the usage side, "hawq register" follows the typical "hawq command"
> design pattern, that is, "hawq action object". But for "hawq register",
> there is no "object" here.
>
> ---------------------------
> hawq extract -o t1.yml t1;
> hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml;
> ---------------------------
>
> Cheers
> Lei
>
>
> On Mon, Aug 15, 2016 at 3:25 PM, Hong Wu <[email protected]> wrote:
>
> > Hi HAWQ developers,
> >
> > This thread is meant to confirm the option usage of hawq register.
> >
> > So far there are two scenarios in which users would use the hawq
> > register tool:
> >
> > - I. Registering external Parquet data into HAWQ. For example, users
> > want to migrate Parquet tables from Hive to HAWQ as quickly as possible.
> > In this case, only the Parquet format is supported, and the original
> > Parquet files in Hive are moved.
> >
> > - II. Registering table files into a new HAWQ cluster. From the user's
> > perspective, this is a way of protecting against corruption: users use
> > the last-known-good metadata to update the portion of the catalog that
> > manages HDFS blocks. The table files or directory should be backed up
> > (for example, using distcp) to the same path in the new HDFS setting.
> > In this case, both AO and Parquet formats are supported.
> >
> > Considering the above cases, the proposed options for hawq register
> > look like this:
> >
> > hawq register [-h hostname] [-p port] [-U username] [-d database] [-t
> > tablename] [-f filepath] [-c config]
> >
> > Note that the -h, -p, and -U options are optional, and that the -c
> > option and the -t/-f options are mutually exclusive, corresponding to
> > the two different cases above. Consequently, the expected usage of
> > hawq register looks like this:
> >
> > - Case I
> > hadoop fs -put -f hdfs://localhost:8020/hive/original_data.paq
> > hdfs://localhost:8020/test_data.paq;
> >
> > create table t1(i int) with (appendonly = true, orientation = parquet);
> >
> > hawq register -h localhost -p 5432 -U me -d postgres -t t1 -f
> > hdfs://localhost:8020/test_data.paq;
> >
> > - Case II
> > hawq extract -o t1.yml t1;
> >
> > hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml;
> >
> > Incorrect usage (in each of these cases, hawq register will print an
> > error and then exit):
> >
> > hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml -t t1;
> > hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml -f
> > hdfs://localhost:8020/test_data.paq;
> > hawq register -h localhost -p 5432 -U me -d postgres -c t1.yml -t t1 -f
> > hdfs://localhost:8020/test_data.paq;
> >
> > Does this design make sense? Any comments? Thanks.
> >
> > Best,
> > Hong
> >
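
P.S. To make the mutual-exclusion rule concrete, here is a minimal Python sketch of the option validation. This is purely illustrative and not the actual hawqregister code; the dest names, defaults, and the assumption that case I requires both -t and -f are mine, based only on the usage string quoted above.

---------------------------
import argparse
import sys

# Illustrative sketch only -- not the real hawqregister implementation.
# Option names follow the usage string in this thread; everything else
# (dest names, defaults, "case I needs both -t and -f") is an assumption.
def parse_register_args(argv):
    # add_help=False because -h is taken by the hostname option here
    parser = argparse.ArgumentParser(prog='hawq register', add_help=False)
    parser.add_argument('-h', dest='hostname', default='localhost')
    parser.add_argument('-p', dest='port', type=int, default=5432)
    parser.add_argument('-U', dest='username')
    parser.add_argument('-d', dest='database')
    parser.add_argument('-t', dest='tablename')
    parser.add_argument('-f', dest='filepath')
    parser.add_argument('-c', dest='config')
    args = parser.parse_args(argv)

    # -c (case II) is mutually exclusive with -t/-f (case I)
    if args.config and (args.tablename or args.filepath):
        parser.error('-c cannot be combined with -t or -f')
    # one of the two cases must be selected
    if not args.config and not (args.tablename and args.filepath):
        parser.error('specify either -c, or both -t and -f')
    return args

if __name__ == '__main__':
    print(parse_register_args(sys.argv[1:]))
---------------------------

With this rule, each of the three incorrect invocations listed above prints an error and exits with a non-zero status, while both valid cases parse cleanly.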
