Github user ictmalili commented on a diff in the pull request:
https://github.com/apache/incubator-hawq/pull/883#discussion_r80218098
--- Diff: tools/doc/hawqregister_help ---
@@ -37,10 +35,19 @@ The file(s) to be registered and the table in HAWQ must
be in the
same HDFS cluster.
Use Case2:
-User should be able to use hawq register to register table files into a
new HAWQ cluster.
-It is some kind of protecting against corruption from users' perspective.
-Users use the last-known-good metadata to update the portion of catalog
managing HDFS blocks.
-The table files or dictionary should be backuped(such as using distcp)
into the same path in the new HDFS setting.
+Hawq register can register both AO and parquet format table, and the files
to be registered are listed in the .yml configuration file.
+This configuration file can be generated by hawq extract. Register through
.yml configuration doesn't require the table already exist,
+since .yml file contains table schema already.
+HAWQ register behaviors differently with different options:
+ * If the table does not exist, hawq register will create table and do
register.
+ * If table already exist, hawq register will append the files to the
existing table.
+ * If --force option specified, hawq register will erase existing catalog
+ table pg_aoseg.pg_aoseg_$relid/pg_aoseg.pg_paqseg_$relid data for the
table and
+ re-register according to .yml configuration file definition. Note. If
there are
+ files under table directory which are not specified in .yml
configuration file, it will throw error out.
+Note. Without --force specified, if some file specified in .yml
configuration file lie under the table directory, hawq register will throw
error out.
+Note. With --force option specified, if there are files under table
directory which are not specified in .yml configuration file, hawq register
will throw error out.
+Note. For both the use cases of hawq register, if the table is hash
distributed, hawq register just check the file number to be registered has to
be integral multiple multiple times of this table's bucket number, and check
whether the distribution key specified in .yml configuration file is same as
that of table. It does not check whether files are actually distributed by the
key.
--- End diff --
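The rules described in the help text above can be sketched roughly as follows. This is only an illustration in Python, not the hawq register implementation: the function names `plan_register` and `check_hash_distribution`, their parameters, and the returned action strings are all hypothetical. The help text does not say what --force does when the table does not exist, so the sketch assumes the same file check applies either way.

```python
from typing import List, Set


def plan_register(table_exists: bool, force: bool,
                  yml_files: Set[str], dir_files: Set[str]) -> str:
    """Return the action a register run would take, or raise ValueError.

    yml_files: files listed in the .yml configuration file.
    dir_files: files currently under the table directory.
    (Hypothetical helper; names and return values are illustrative.)
    """
    if force:
        # With --force, files under the table directory that are not
        # listed in the .yml file are an error.
        unlisted = dir_files - yml_files
        if unlisted:
            raise ValueError(f"files not in .yml: {sorted(unlisted)}")
        # Assumption: table_exists is not consulted here, because the help
        # text does not cover --force on a missing table.
        return "erase-and-reregister"

    # Without --force, a file listed in the .yml file that already lies
    # under the table directory is an error.
    overlap = yml_files & dir_files
    if overlap:
        raise ValueError(f"files already under table dir: {sorted(overlap)}")
    return "append" if table_exists else "create-and-register"


def check_hash_distribution(num_files: int, bucket_num: int,
                            yml_dist_key: List[str],
                            table_dist_key: List[str]) -> None:
    """Checks described for hash-distributed tables; raise on failure.

    Only the file count and the distribution key are compared. Whether
    the files are actually distributed by that key is not checked.
    """
    if bucket_num <= 0:
        raise ValueError("bucket number must be positive")
    if num_files % bucket_num != 0:
        raise ValueError("file count is not a multiple of the bucket number")
    if yml_dist_key != table_dist_key:
        raise ValueError("distribution key in .yml differs from the table's")
```

For example, under these assumptions `plan_register(True, False, {"f1"}, {"f0"})` would choose to append, while `check_hash_distribution(7, 6, ["id"], ["id"])` would raise because 7 is not a multiple of 6.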
We