Repository: incubator-hawq-docs
Updated Branches:
  refs/heads/develop 5673447e0 -> 01f3f8e9d
Updates for register --repair, partitioning


Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/commit/baaf05f1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/tree/baaf05f1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/diff/baaf05f1

Branch: refs/heads/develop
Commit: baaf05f1455342f19c09c9de411b0be19b864b65
Parents: e169704
Author: Jane Beckman <[email protected]>
Authored: Wed Oct 5 15:44:07 2016 -0700
Committer: Jane Beckman <[email protected]>
Committed: Wed Oct 5 15:44:07 2016 -0700

----------------------------------------------------------------------
 datamgmt/load/g-register_files.html.md.erb | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/baaf05f1/datamgmt/load/g-register_files.html.md.erb
----------------------------------------------------------------------
diff --git a/datamgmt/load/g-register_files.html.md.erb b/datamgmt/load/g-register_files.html.md.erb
index 93625f1..f9c407d 100644
--- a/datamgmt/load/g-register_files.html.md.erb
+++ b/datamgmt/load/g-register_files.html.md.erb
@@ -22,7 +22,7 @@ Requirements for running `hawq register` on the server are:
 
 Files or folders in HDFS can be registered into an existing table, allowing them to be managed as a HAWQ internal table. When registering files, you can optionally specify the maximum amount of data to be loaded, in bytes, using the `--eof` option. If registering a folder, the actual file sizes are used.
 
-Only HAWQ or Hive-generated Parquet tables are supported. Partitioned tables are not supported. Attempting to register these tables will result in an error.
+Only HAWQ or Hive-generated Parquet tables are supported. Only single-level partitioned tables are supported; registering partitioned tables with more than one level will result in an error.
 
 Metadata for the Parquet file(s) and the destination table must be consistent. Different data types are used by HAWQ tables and Parquet files, so data must be mapped. You must verify that the structure of the Parquet files and the HAWQ table are compatible before running `hawq register`.
@@ -66,7 +66,7 @@ select relname from pg_class where oid = segrelid
 
 ## <a id="topic1__section3"></a>Registering Data Using Information from a YAML Configuration File
 
-The `hawq register` command can register HDFS files by using metadata loaded from a YAML configuration file by using the `--config <yaml_config\>` option. Both AO and Parquet tables can be registered. Tables need not exist in HAWQ before being registered. This function can be useful in disaster recovery, allowing information created by the `hawq extract` command to re-create HAWQ tables.
+The `hawq register` command can register HDFS files using metadata loaded from a YAML configuration file, specified with the `--config <yaml_config\>` option. Both AO and Parquet tables can be registered. Tables need not exist in HAWQ before being registered. In disaster recovery, information in a YAML-format file created by the `hawq extract` command can re-create HAWQ tables by using metadata from a backup checkpoint.
 
 You can also use a YAML configuration file to append HDFS files to an existing HAWQ table, or to create a table and register it into HAWQ.
@@ -77,6 +77,7 @@ Data is registered according to the following conditions:
 
 - Existing tables have files appended to the existing HAWQ table.
 - If a table does not exist, it is created and registered into HAWQ. The catalog table will be updated with the file size specified by the YAML file.
 - If the -\\\-force option is used, the data in existing catalog tables is erased and re-registered. All HDFS-related catalog contents in `pg_aoseg.pg_paqseg_$relid` are cleared. The original files on HDFS are retained.
+- The -\\\-repair option rolls data back to a specified checkpoint. If the table already exists, both the file folder and the `pg_aoseg.pg_paqseg_$relid` catalog entry are rolled back to the checkpoint configuration in the YAML file. Files generated after the timestamp of the checkpoint will be deleted. Hash table redistribution, table truncate, and table drop are not supported. Using the -\\\-repair option with redistributed table data will result in an error. Tables using random distribution are preferred for registering into HAWQ. If hash-distributed tables are to be registered, the distribution policy in the YAML file must match that of the table being registered into.
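
The checkpoint/repair workflow documented in the diff above can be sketched as the following command sequence. This is a minimal illustration, not output of the commit: the database name (`postgres`), table name (`public.my_table`), and checkpoint file name are hypothetical; `--config`, `--force`, and `--repair` are the `hawq register` options described in the updated page, and `hawq extract` is the command named there as the producer of the YAML metadata. Connection options for your cluster may differ.

```shell
# All names below are hypothetical examples.

# 1. Checkpoint a table's metadata to a YAML file with hawq extract.
hawq extract -d postgres -o checkpoint.yaml public.my_table

# 2. Re-create and register the table from that YAML metadata,
#    e.g. during disaster recovery on a fresh cluster.
hawq register -d postgres --config checkpoint.yaml public.my_table

# 3. Roll the table's HDFS file folder and pg_aoseg.pg_paqseg_$relid
#    catalog entry back to the checkpoint recorded in the YAML file;
#    files generated after the checkpoint timestamp are deleted.
hawq register -d postgres --config checkpoint.yaml --repair public.my_table
```

Note that per the documentation change, step 3 will fail if the table's data has been redistributed since the checkpoint, and randomly distributed tables are the preferred case for registration.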
