@Lei, I noticed that as well; thanks for pointing it out. The updated interface should look like this:
- Case I
hadoop fs -put -f hdfs://localhost:8020/hive/original_data.paq hdfs://localhost:8020/test_data.paq;

create table t1(i int) with (appendonly = true, orientation=parquet);

hawq register -h localhost -p 5432 -u me -d postgres -f hdfs://localhost:8020/test_data.paq t1;

- Case II
hawq extract -o t1.yml t1;

hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml t1;

Cheers
Hong

2016-08-16 11:34 GMT+08:00 Lili Ma <[email protected]>:

> @Lei, since the current hawq register supports two cases, specifying
> tablename & filepath or specifying a .yml file, we proposed this usage.
>
> For the first case, we can follow the pattern of "hawq command" and change
> the tablename to the object, such as *hawq register [-h hostname] [-p port]
> [-U username] [-d database] [-f filepath] [-c config] <tablename>*
>
> But for the second case, because the table name is inside the .yml file, if
> we change tablename to the object and mark it as a required field, it
> duplicates the name inside the configuration file. And what shall we do
> about name conflicts?
>
> Could you suggest a better way of defining the usage? Thanks :)
>
> Thanks
> Lili
>
> On Tue, Aug 16, 2016 at 8:23 AM, Lei Chang <[email protected]> wrote:
>
> > I think this is a very useful feature for backup/restore, disaster
> > recovery and some other scenarios.
> >
> > From the usage side, "hawq register" follows the typical "hawq command"
> > design pattern, that is, "hawq action object". But for "hawq register",
> > there is no "object" here.
> >
> > ---------------------------
> > hawq extract -o t1.yml t1;
> > hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml;
> > ---------------------------
> >
> > Cheers
> > Lei
> >
> >
> > On Mon, Aug 15, 2016 at 3:25 PM, Hong Wu <[email protected]> wrote:
> >
> > > Hi HAWQ developers,
> > >
> > > This thread is meant to confirm the option usage of hawq register.
> > >
> > > There are two scenarios for users to use the hawq register tool so far.
> > >
> > > - I. Register external parquet data into HAWQ. For example, users want
> > > to migrate parquet tables from Hive to HAWQ as quickly as possible. In
> > > this case, only the parquet format is supported, and the original
> > > parquet files in Hive are moved.
> > >
> > > - II. Users should be able to use hawq register to register table files
> > > into a new HAWQ cluster. It serves as a kind of protection against
> > > corruption from the users' perspective. Users use the last-known-good
> > > metadata to update the portion of the catalog managing HDFS blocks.
> > > The table files or directories should be backed up (such as by using
> > > distcp) into the same path in the new HDFS setting. In this case, both
> > > AO and Parquet formats are supported.
> > >
> > > Considering the above cases, the designed options for hawq register
> > > look like this:
> > >
> > > hawq register [-h hostname] [-p port] [-U username] [-d database]
> > > [-t tablename] [-f filepath] [-c config]
> > >
> > > Note that the -h, -p, -U options are optional, and that the -c option
> > > and the -t, -f options are mutually exclusive, corresponding to the two
> > > different cases above. Consequently, the expected usage of hawq
> > > register should look like this:
> > >
> > > - Case I
> > > hadoop fs -put -f hdfs://localhost:8020/hive/original_data.paq
> > > hdfs://localhost:8020/test_data.paq;
> > >
> > > create table t1(i int) with (appendonly = true, orientation=parquet);
> > >
> > > hawq register -h localhost -p 5432 -u me -d postgres -t t1 -f
> > > hdfs://localhost:8020/test_data.paq;
> > >
> > > - Case II
> > > hawq extract -o t1.yml t1;
> > >
> > > hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml;
> > >
> > > Incorrect usage (in any of these cases, hawq register will print an
> > > error and then exit):
> > > hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml -t t1;
> > > hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml -f
> > > hdfs://localhost:8020/test_data.paq;
> > > hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml -t t1 -f
> > > hdfs://localhost:8020/test_data.paq;
> > >
> > > Does this design make sense? Any comments? Thanks.
> > >
> > > Best
> > > Hong
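For what it's worth, the mutual-exclusion rule discussed in this thread (-c vs. -t/-f, with Case I using both -t and -f together) could be enforced by a small validation check along these lines. This is only a minimal Python sketch to make the rule concrete; the function name and return convention are hypothetical, not the actual hawq register implementation:

```python
def validate_register_options(config=None, tablename=None, filepath=None):
    """Return an error message if the option combination is invalid, else None.

    Encodes the rule from this thread: -c (config) is mutually exclusive
    with -t (tablename) and -f (filepath); assuming Case I requires both
    -t and -f, as in the examples.
    """
    if config and (tablename or filepath):
        return "-c cannot be combined with -t or -f"
    if not config and not (tablename and filepath):
        return "either -c <config> or both -t <tablename> and -f <filepath> are required"
    return None

# The incorrect usages from the proposal fail the check:
assert validate_register_options(config="t1.yml", tablename="t1") is not None
# ...while the two supported cases pass:
assert validate_register_options(config="t1.yml") is None
assert validate_register_options(
    tablename="t1", filepath="hdfs://localhost:8020/test_data.paq") is None
```

If the tool used Python's argparse, much of this could instead be expressed with a mutually exclusive option group, with the Case I both-or-neither constraint still checked by hand.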
