Repository: incubator-hawq-docs
Updated Branches:
  refs/heads/develop da24a44fc -> 25242858c

HAWQ-1209 - hawq overview page (closes #80)

Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/commit/25242858
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/tree/25242858
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/diff/25242858

Branch: refs/heads/develop
Commit: 25242858cdf072801ebd161721e726b814161b40
Parents: da24a44
Author: Lisa Owen <[email protected]>
Authored: Fri Jan 6 09:01:56 2017 -0800
Committer: David Yozie <[email protected]>
Committed: Fri Jan 6 09:01:56 2017 -0800

----------------------------------------------------------------------
 admin/RunningHAWQ.html.md.erb    | 51 ++++++++++++++--------
 admin/setuphawqopenv.html.md.erb | 81 +++++++++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+), 18 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/25242858/admin/RunningHAWQ.html.md.erb
----------------------------------------------------------------------
diff --git a/admin/RunningHAWQ.html.md.erb b/admin/RunningHAWQ.html.md.erb
index 65b3f31..c7de1d5 100644
--- a/admin/RunningHAWQ.html.md.erb
+++ b/admin/RunningHAWQ.html.md.erb
@@ -2,21 +2,36 @@
 title: Running a HAWQ Cluster
 ---
 
-This section provides information for system administrators and database superusers responsible for administering a HAWQ system.
-
-This guide provides information and instructions for configuring, maintaining and using a HAWQ system. This guide is intended for system and database administrators responsible for managing a HAWQ system.
-
-This guide assumes knowledge of Linux/UNIX system administration, database management systems, database administration, and structured query language \(SQL\). Because HAWQ is based on PostgreSQL 8.2.15, this guide assumes some familiarity with PostgreSQL. This guide calls out similarities between HAWQ and PostgreSQL features throughout. It contains the topics:
-
-* <a class="subnav" href="./ambari-admin.html">Managing HAWQ Using Ambari</a>
-* <a class="subnav" href="./startstop.html">Starting and Stopping HAWQ</a>
-* <a class="subnav" href="./ClusterExpansion.html">Expanding a Cluster</a>
-* <a class="subnav" href="./ClusterShrink.html">Removing a Node</a>
-* <a class="subnav" href="./BackingUpandRestoringHAWQDatabases.html">Backing Up and Restoring HAWQ</a>
-* <a class="subnav" href="./HighAvailability.html">High Availability in HAWQ</a>
-* <a class="subnav" href="./MasterMirroring.html">Master Mirroring</a>
-* <a class="subnav" href="./HAWQFilespacesandHighAvailabilityEnabledHDFS.html">HAWQ Filespaces and High Availability Enabled HDFS</a>
-* <a class="subnav" href="./FaultTolerance.html">Understanding the Fault Tolerance Service</a>
-* <a class="subnav" href="./RecommendedMonitoringTasks.html">Recommended Monitoring and Maintenance Tasks</a>
-* <a class="subnav" href="./maintain.html">Routine System Maintenance Tasks</a>
-* <a class="subnav" href="./monitor.html">Monitoring a HAWQ System</a>
+This section provides information for system administrators responsible for administering a HAWQ deployment.
+
+You should have some knowledge of Linux/UNIX system administration, database management systems, database administration, and structured query language \(SQL\) to administer a HAWQ cluster. Because HAWQ is based on PostgreSQL, you should also have some familiarity with PostgreSQL. The HAWQ documentation calls out similarities between HAWQ and PostgreSQL features throughout.
+
+## <a id="hawq_users"></a>HAWQ Users
+
+HAWQ supports users with both administrative and operating privileges. The HAWQ administrator may choose to manage the HAWQ cluster using either Ambari or the command line. [Managing HAWQ Using Ambari](../admin/ambari-admin.html) provides Ambari-specific HAWQ cluster administration procedures. [Starting and Stopping HAWQ](startstop.html), [Expanding a Cluster](ClusterExpansion.html), and [Removing a Node](ClusterShrink.html) describe specific command-line-managed HAWQ cluster administration procedures. Other topics in this guide are applicable to both Ambari- and command-line-managed HAWQ clusters.
+
+The default HAWQ administrator user is named `gpadmin`. The HAWQ admin may choose to assign administrative and/or operating HAWQ privileges to additional users. Refer to [Configuring Client Authentication](../clientaccess/client_auth.html) and [Managing Roles and Privileges](../clientaccess/roles_privs.html) for additional information about HAWQ user configuration.
+
+## <a id="hawq_systems"></a>HAWQ Deployment Systems
+
+A typical HAWQ deployment includes single HDFS and HAWQ master and standby nodes and multiple HAWQ segment and HDFS data nodes. The HAWQ cluster may also include systems running the HAWQ Extension Framework (PXF) and other Hadoop services. Refer to [HAWQ Architecture](../overview/HAWQArchitecture.html) and [Select HAWQ Host Machines](../install/select-hosts.html) for information about the different systems in a HAWQ deployment and how they are configured.
+
+
+## <a id="hawq_env_databases"></a>HAWQ Databases
+
+[Creating and Managing Databases](../ddl/ddl-database.html) and [Creating and Managing Tables](../ddl/ddl-table.html) describe HAWQ database and table creation commands.
+
+You manage HAWQ databases at the command line using the [psql](../reference/cli/client_utilities/psql.html) utility, an interactive front-end to the HAWQ database. Configuring client access to HAWQ databases and tables may require information related to [Establishing a Database Session](../clientaccess/g-establishing-a-database-session.html).
+
+[HAWQ Database Drivers and APIs](../clientaccess/g-database-application-interfaces.html) identifies supported HAWQ database drivers and APIs for additional client access methods.
+
+## <a id="hawq_env_data"></a>HAWQ Data
+
+HAWQ internal data resides in HDFS. You may require access to data in different formats and locations in your data lake. You can use HAWQ and the HAWQ Extension Framework (PXF) to access and manage both internal and this external data:
+
+- [Managing Data with HAWQ](../datamgmt/dml.html) discusses the basic data operations and details regarding the loading and unloading semantics for HAWQ internal tables.
+- [Using PXF with Unmanaged Data](../pxf/HawqExtensionFrameworkPXF.html) describes PXF, an extensible framework you may use to query data external to HAWQ.
+
+## <a id="hawq_env_setup"></a>HAWQ Operating Environment
+
+Refer to [Introducing the HAWQ Operating Environment](setuphawqopenv.html) for a discussion of the HAWQ operating environment, including a procedure to set up the HAWQ environment. This section also provides an introduction to the important files and directories in a HAWQ installation.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/25242858/admin/setuphawqopenv.html.md.erb
----------------------------------------------------------------------
diff --git a/admin/setuphawqopenv.html.md.erb b/admin/setuphawqopenv.html.md.erb
new file mode 100644
index 0000000..9d9b731
--- /dev/null
+++ b/admin/setuphawqopenv.html.md.erb
@@ -0,0 +1,81 @@
+---
+title: Introducing the HAWQ Operating Environment
+---
+
+Before invoking operations on a HAWQ cluster, you must set up your HAWQ environment. This setup is required for both administrative and non-administrative HAWQ users.
+
+## <a id="hawq_setupenv"></a>Procedure: Setting Up Your HAWQ Operating Environment
+
+HAWQ installs a script that you can use to set up your HAWQ cluster environment. The `greenplum_path.sh` script, located in your HAWQ root install directory, sets `$PATH` and other environment variables to find HAWQ files. Most importantly, `greenplum_path.sh` sets the `$GPHOME` environment variable to point to the root directory of the HAWQ installation. If you installed HAWQ from a product distribution, the HAWQ root is typically `/usr/local/hawq`. If you built HAWQ from source or downloaded the tarball, you will have selected an install root directory on your own.
+
+Perform the following steps to set up your HAWQ operating environment:
+
+1. Log in to the HAWQ node as the desired user. For example:
+
+    ``` shell
+    $ ssh gpadmin@<master>
+    gpadmin@master$
+    ```
+
+    Or, if you are already logged in to \<node\-type\> as a different user, switch to the desired user. For example:
+
+    ``` shell
+    gpadmin@master$ su - <hawq-user>
+    Password:
+    hawq-user@master$
+    ```
+
+2. Set up your HAWQ operating environment by sourcing the `greenplum_path.sh` file:
+
+    ``` shell
+    hawq-node$ source /usr/local/hawq/greenplum_path.sh
+    ```
+
+    If you built HAWQ from source or downloaded the tarball, substitute the path to the installed or extracted `greenplum_path.sh` file \(for example `/opt/hawq-2.1.0.0/greenplum_path.sh`\).
+
+
+3. Edit your `.bash_profile` or other shell initialization file to source `greenplum_path.sh` on login. For example, add:
+
+    ``` shell
+    source /usr/local/hawq/greenplum_path.sh
+    ```
+
+4. Set HAWQ-specific environment variables relevant to your deployment in your shell initialization file. These include `PGAPPNAME`, `PGDATABASE`, `PGHOST`, `PGPORT`, and `PGUSER`. For example:
+
+    1. If you use a custom HAWQ master port number, make this port number the default by setting the `PGPORT` environment variable in your shell initialization file; add:
+
+        ``` shell
+        export PGPORT=10432
+        ```
+
+        Setting `PGPORT` simplifies `psql` invocation by providing a default for the `-p` (port) option.
+
+    1. If you will routinely operate on a specific database, make this database the default by setting the `PGDATABASE` environment variable in your shell initialization file:
+
+        ``` shell
+        export PGDATABASE=<database-name>
+        ```
+
+        Setting `PGDATABASE` simplifies `psql` invocation by providing a default for the `-d` (database) option.
+
+    You may choose to set additional HAWQ deployment-specific environment variables. See [Environment Variables](../reference/HAWQEnvironmentVariables.html#optionalenvironmentvariables).
+
+## <a id="hawq_env_files_and_dirs"></a>HAWQ Files and Directories
+
+The following table identifies some files and directories of interest in a default HAWQ installation. Unless otherwise specified, the table entries are relative to `$GPHOME`.
+
+|File/Directory | Contents |
+|---------------------------------|---------------------|
+| $HOME/hawqAdminLogs/ | Default HAWQ management utility log file directory |
+| greenplum_path.sh | HAWQ environment set-up script |
+| bin/ | HAWQ admin, client, database, and administration utilities |
+| etc/ | HAWQ configuration files, including `hawq-site.xml` |
+| include/ | HDFS, PostgreSQL, `libpq` header files |
+| lib/ | HAWQ libraries |
+| lib/postgresql/ | PostgreSQL shared libraries and JAR files |
+| share/postgresql/ | PostgreSQL and procedural languages samples and scripts |
+| /data/hawq/[master|segment]/ | Default location of HAWQ master and segment data directories |
+| /data/hawq/[master|segment]/pg_log/ | Default location of HAWQ master and segment log file directories |
+| /etc/pxf/conf/ | PXF service and configuration files |
+| /usr/lib/pxf/ | PXF service and plug-in shared libraries |
+| /usr/hdp/current/ | HDP runtime and configuration files |
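
Reviewer note: the setup procedure added in `setuphawqopenv.html.md.erb` could be sanity-checked with a small helper along the following lines. This is only a sketch under stated assumptions: `hawq_env_summary` is a hypothetical function name that is not part of HAWQ, and it assumes `greenplum_path.sh` sits directly under `$GPHOME`, as the new page describes. It reports whether the environment script appears to have been sourced and which of the optional `PG*` variables from step 4 are set.

```shell
# Hypothetical helper (not shipped with HAWQ): summarize the environment
# that the documented setup steps are expected to establish.
hawq_env_summary() {
    # greenplum_path.sh sets GPHOME; an empty value suggests it was not sourced.
    if [ -z "${GPHOME:-}" ]; then
        echo "GPHOME is not set; source greenplum_path.sh first" >&2
        return 1
    fi
    # Assumption: the script lives in the HAWQ root install directory.
    if [ ! -f "${GPHOME}/greenplum_path.sh" ]; then
        echo "greenplum_path.sh not found under ${GPHOME}" >&2
        return 1
    fi
    echo "GPHOME=${GPHOME}"
    # Print each optional variable named in step 4, marking unset ones.
    for hawq_name in PGAPPNAME PGDATABASE PGHOST PGPORT PGUSER; do
        eval "hawq_value=\${${hawq_name}:-}"
        echo "${hawq_name}=${hawq_value:-<unset>}"
    done
}
```

A user would call `hawq_env_summary` after sourcing `greenplum_path.sh`; any variable reported as `<unset>` falls back to the client's built-in default when `psql` is invoked without the corresponding option.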
