Repository: incubator-hawq-docs
Updated Branches:
  refs/heads/develop 86ef7009f -> 00a2a3684


make references to DataNode consistent


Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/commit/00a2a368
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/tree/00a2a368
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/diff/00a2a368

Branch: refs/heads/develop
Commit: 00a2a3684b9074a11f720c72be61fd1672d5aa1f
Parents: 86ef700
Author: Lisa Owen <[email protected]>
Authored: Thu Oct 20 10:59:58 2016 -0700
Committer: Lisa Owen <[email protected]>
Committed: Thu Oct 20 10:59:58 2016 -0700

----------------------------------------------------------------------
 ddl/ddl-table.html.md.erb                                 | 2 +-
 install/aws-config.html.md.erb                            | 2 +-
 install/select-hosts.html.md.erb                          | 4 ++--
 overview/TableDistributionStorage.html.md.erb             | 2 +-
 pxf/TroubleshootingPXF.html.md.erb                        | 4 ++--
 query/query-performance.html.md.erb                       | 2 +-
 reference/HDFSConfigurationParameterReference.html.md.erb | 6 +++---
 7 files changed, 11 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/00a2a368/ddl/ddl-table.html.md.erb
----------------------------------------------------------------------
diff --git a/ddl/ddl-table.html.md.erb b/ddl/ddl-table.html.md.erb
index 62ece36..d0220d7 100644
--- a/ddl/ddl-table.html.md.erb
+++ b/ddl/ddl-table.html.md.erb
@@ -66,7 +66,7 @@ Foreign key constraints specify that the values in a column or a group of column
 
 All HAWQ tables are distributed. The default is `DISTRIBUTED RANDOMLY` \(round-robin distribution\) to determine the table row distribution. However, when you create or alter a table, you can optionally specify `DISTRIBUTED BY` to distribute data according to a hash-based policy. In this case, the `bucketnum` attribute sets the number of hash buckets used by a hash-distributed table. Columns of geometric or user-defined data types are not eligible as HAWQ distribution key columns.
 
-Randomly distributed tables have benefits over hash distributed tables. For example, after expansion, HAWQ's elasticity feature lets it automatically use more resources without needing to redistribute the data. For extremely large tables, redistribution is very expensive. Also, data locality for randomly distributed tables is better, especially after the underlying HDFS redistributes its data during rebalancing or because of data node failures. This is quite common when the cluster is large.
+Randomly distributed tables have benefits over hash distributed tables. For example, after expansion, HAWQ's elasticity feature lets it automatically use more resources without needing to redistribute the data. For extremely large tables, redistribution is very expensive. Also, data locality for randomly distributed tables is better, especially after the underlying HDFS redistributes its data during rebalancing or because of DataNode failures. This is quite common when the cluster is large.
 
 However, hash distributed tables can be faster than randomly distributed tables. For example, for TPCH queries, where there are several queries, HASH distributed tables can have performance benefits. Choose a distribution policy that best suits your application scenario. When you `CREATE TABLE`, you can also specify the `bucketnum` option. The `bucketnum` determines the number of hash buckets used in creating a hash-distributed table or for PXF external table intermediate processing. The number of buckets also affects how many virtual segments will be created when processing this data. The bucket number of a gpfdist external table is the number of gpfdist locations, and the bucket number of a command external table is `ON #num`. PXF external tables use the `default_hash_table_bucket_number` parameter to control virtual segments.
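
To make the two policies concrete, here is a minimal sketch (the table and column names are hypothetical; the `WITH (bucketnum=...)` form follows the `bucketnum` option described above):

```sql
-- Hash-distributed table: rows are assigned to one of 8 hash buckets
-- based on the value of customer_id.
CREATE TABLE sales (
    customer_id integer,
    amount      numeric
)
WITH (bucketnum=8)
DISTRIBUTED BY (customer_id);

-- Randomly distributed table (the default, round-robin policy).
CREATE TABLE events (
    event_id bigint,
    payload  text
)
DISTRIBUTED RANDOMLY;
```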
 

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/00a2a368/install/aws-config.html.md.erb
----------------------------------------------------------------------
diff --git a/install/aws-config.html.md.erb b/install/aws-config.html.md.erb
index e4106b1..21cadf5 100644
--- a/install/aws-config.html.md.erb
+++ b/install/aws-config.html.md.erb
@@ -34,7 +34,7 @@ Virtual devices for instance store volumes for HAWQ EC2 instance store instances
 
 A placement group is a logical grouping of instances within a single availability zone that together participate in a low-latency, 10 Gbps network. Your HAWQ master and segment cluster instances should support enhanced networking and reside in a single placement group (and subnet) for optimal network performance.
 
-If your Ambari node is not a data node, locating the Ambari node instance in a subnet separate from the HAWQ master/segment placement group enables you to manage multiple HAWQ clusters from the single Ambari instance.
+If your Ambari node is not a DataNode, locating the Ambari node instance in a subnet separate from the HAWQ master/segment placement group enables you to manage multiple HAWQ clusters from a single Ambari instance.
 
 Amazon recommends that you use the same instance type for all instances in the placement group and that you launch all instances within the placement group at the same time.
 

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/00a2a368/install/select-hosts.html.md.erb
----------------------------------------------------------------------
diff --git a/install/select-hosts.html.md.erb b/install/select-hosts.html.md.erb
index c49f184..c2fbdff 100644
--- a/install/select-hosts.html.md.erb
+++ b/install/select-hosts.html.md.erb
@@ -8,10 +8,10 @@ Complete this procedure for all HAWQ deployments:
 
 1.  **Choose the host machines that will host a HAWQ segment.** Keep in mind these restrictions and requirements:
     -   Each host must meet the system requirements for the version of HAWQ you are installing.
-    -   Each HAWQ segment must be co-located on a host that runs an HDFS data node.
+    -   Each HAWQ segment must be co-located on a host that runs an HDFS DataNode.
     -   The HAWQ master segment and standby master segment must be hosted on separate machines.
 2.  **Choose the host machines that will run PXF.** Keep in mind these restrictions and requirements:
-    -   PXF must be installed on the HDFS NameNode *and* on all HDFS data nodes.
+    -   PXF must be installed on the HDFS NameNode *and* on all HDFS DataNodes.
     -   If you have configured Hadoop with high availability, PXF must also be installed on all HDFS nodes including all NameNode services.
     -   If you want to use PXF with HBase or Hive, you must first install the HBase client \(hbase-client\) and/or Hive client \(hive-client\) on each machine where you intend to install PXF. See the [HDP installation documentation](http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/index.html) for more information.
 3.  **Verify that required ports on all machines are unused.** By default, a HAWQ master or standby master service configuration uses port 5432. Hosts that run other PostgreSQL instances cannot be used to run a default HAWQ master or standby service configuration because the default PostgreSQL port \(5432\) conflicts with the default HAWQ port. You must either change the default port configuration of the running PostgreSQL instance or change the HAWQ master port setting during the HAWQ service installation to avoid port conflicts.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/00a2a368/overview/TableDistributionStorage.html.md.erb
----------------------------------------------------------------------
diff --git a/overview/TableDistributionStorage.html.md.erb b/overview/TableDistributionStorage.html.md.erb
index aa03b59..58f20f2 100755
--- a/overview/TableDistributionStorage.html.md.erb
+++ b/overview/TableDistributionStorage.html.md.erb
@@ -12,7 +12,7 @@ For all HAWQ table storage formats, AO \(Append-Only\) and Parquet, the data fil
 
 The default table distribution policy in HAWQ is random.
 
-Randomly distributed tables have some benefits over hash distributed tables. For example, after cluster expansion, HAWQ can use more resources automatically without redistributing the data. For huge tables, redistribution is very expensive, and data locality for randomly distributed tables is better after the underlying HDFS redistributes its data during rebalance or data node failures. This is quite common when the cluster is large.
+Randomly distributed tables have some benefits over hash distributed tables. For example, after cluster expansion, HAWQ can use more resources automatically without redistributing the data. For huge tables, redistribution is very expensive, and data locality for randomly distributed tables is better after the underlying HDFS redistributes its data during rebalance or DataNode failures. This is quite common when the cluster is large.
 
 On the other hand, for some queries, hash distributed tables are faster than randomly distributed tables. For example, hash distributed tables have some performance benefits for some TPC-H queries. You should choose the distribution policy that is best suited for your application's scenario.
 

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/00a2a368/pxf/TroubleshootingPXF.html.md.erb
----------------------------------------------------------------------
diff --git a/pxf/TroubleshootingPXF.html.md.erb b/pxf/TroubleshootingPXF.html.md.erb
index 7b53065..d59e361 100644
--- a/pxf/TroubleshootingPXF.html.md.erb
+++ b/pxf/TroubleshootingPXF.html.md.erb
@@ -49,8 +49,8 @@ The following table lists some common errors encountered while using PXF:
 <td>Cannot find PXF Jar</td>
 </tr>
 <tr class="even">
-<td>ERROR:  PXF API encountered a HTTP 404 error. Either the PXF service (tomcat) on data node was not started or PXF webapp was not started.</td>
-<td>Either the required data node does not exist or PXF service (tcServer) on data node is not started or PXF webapp was not started</td>
+<td>ERROR:  PXF API encountered a HTTP 404 error. Either the PXF service (tomcat) on the DataNode was not started or the PXF webapp was not started.</td>
+<td>Either the required DataNode does not exist, the PXF service (tcServer) on the DataNode is not started, or the PXF webapp was not started</td>
 </tr>
 <tr class="odd">
 <td>ERROR:  remote component error (500) from '&lt;x&gt;':  type  Exception report   message   java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/client/HTableInterface</td>

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/00a2a368/query/query-performance.html.md.erb
----------------------------------------------------------------------
diff --git a/query/query-performance.html.md.erb b/query/query-performance.html.md.erb
index 4515575..b4f88fe 100644
--- a/query/query-performance.html.md.erb
+++ b/query/query-performance.html.md.erb
@@ -99,7 +99,7 @@ The following table describes the metrics related to data locality. Use these me
 </tr>
 <tr class="odd">
 <td>continuity</td>
-<td>reading a HDFS file discontinuously will introduce additional seek, which will slow the table scan of a query. A low value of continuity indicates that the blocks of a file are not continuously distributed on a datanode.</td>
+<td>reading an HDFS file discontinuously will introduce additional seeks, which will slow the table scan of a query. A low value of continuity indicates that the blocks of a file are not continuously distributed on a DataNode.</td>
 </tr>
 <tr class="even">
 <td>DFS metadatacache</td>

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/00a2a368/reference/HDFSConfigurationParameterReference.html.md.erb
----------------------------------------------------------------------
diff --git a/reference/HDFSConfigurationParameterReference.html.md.erb b/reference/HDFSConfigurationParameterReference.html.md.erb
index 8199de2..aef4ed2 100644
--- a/reference/HDFSConfigurationParameterReference.html.md.erb
+++ b/reference/HDFSConfigurationParameterReference.html.md.erb
@@ -13,13 +13,13 @@ This table describes the configuration parameters and values that are recommende
 | Parameter | Description | Recommended Value for HAWQ Installs | Comments |
 |-----------|-------------|-------------------------------------|----------|
 | `dfs.allow.truncate` | Allows truncate. | true | HAWQ requires that you enable `dfs.allow.truncate`. The HAWQ service will fail to start if `dfs.allow.truncate` is not set to `true`. |
-| `dfs.block.access.token.enable` | If `true`, access tokens are used as capabilities for accessing datanodes. If `false`, no access tokens are checked on accessing datanodes. | *false* for an unsecured HDFS cluster, or *true* for a secure cluster | |
+| `dfs.block.access.token.enable` | If `true`, access tokens are used as capabilities for accessing DataNodes. If `false`, no access tokens are checked on accessing DataNodes. | *false* for an unsecured HDFS cluster, or *true* for a secure cluster | |
 | `dfs.block.local-path-access.user` | Comma separated list of the users allowed to open block files on legacy short-circuit local read. | gpadmin | |
 | `dfs.client.read.shortcircuit` | This configuration parameter turns on short-circuit local reads. | true | In Ambari, this parameter corresponds to **HDFS Short-circuit read**. The value for this parameter should be the same in `hdfs-site.xml` and HAWQ's `hdfs-client.xml`. |
 | `dfs.client.socket-timeout` | The amount of time before a client connection times out when establishing a connection or reading. The value is expressed in milliseconds. | 300000000 | |
 | `dfs.client.use.legacy.blockreader.local` | Setting this value to false specifies that the new version of the short-circuit reader is used. Setting this value to true means that the legacy short-circuit reader would be used. | false | |
-| `dfs.datanode.data.dir.perm` | Permissions for the directories on on the local filesystem where the DFS data node store its blocks. The permissions can either be octal or symbolic. | 750 | In Ambari, this parameter corresponds to **DataNode directories permission** |
-| `dfs.datanode.handler.count` | The number of server threads for the datanode. | 60 | |
+| `dfs.datanode.data.dir.perm` | Permissions for the directories on the local filesystem where the DFS DataNode stores its blocks. The permissions can either be octal or symbolic. | 750 | In Ambari, this parameter corresponds to **DataNode directories permission** |
+| `dfs.datanode.handler.count` | The number of server threads for the DataNode. | 60 | |
 | `dfs.datanode.max.transfer.threads` | Specifies the maximum number of threads to use for transferring data in and out of the DataNode. | 40960 | In Ambari, this parameter corresponds to **DataNode max data transfer threads** |
 | `dfs.datanode.socket.write.timeout` | The amount of time before a write operation times out, expressed in milliseconds. | 7200000 | |
 | `dfs.domain.socket.path` | (Optional.) The path to a UNIX domain socket to use for communication between the DataNode and local HDFS clients. If the string "\_PORT" is present in this path, it is replaced by the TCP port of the DataNode. | | If set, the value for this parameter should be the same in `hdfs-site.xml` and HAWQ's `hdfs-client.xml`. |
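
For reference, a minimal sketch of how a few of these recommended values might appear in `hdfs-site.xml` (property names and values are taken from the table above; verify them against your own cluster before applying):

```xml
<!-- Sketch only: recommended HAWQ values from the table above. -->
<property>
  <name>dfs.allow.truncate</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.datanode.max.transfer.threads</name>
  <value>40960</value>
</property>
```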
