GitHub user dyozie commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/60#discussion_r87922081
  
    --- Diff: ddl/ddl-table.html.md.erb ---
    @@ -93,14 +93,14 @@ For any specific query, the first four factors are 
fixed values, while the confi
     
     The `bucketnum` for a hash table specifies the number of hash buckets to 
be used in creating virtual segments. A HASH distributed table is created with 
`default_hash_table_bucket_number` buckets. The default bucket value can be 
changed in session level or in the `CREATE TABLE` DDL by using the `bucketnum` 
storage parameter.
     
    -When initializing a cluster, you can use the `hawq init --bucket_number` 
parameter to explcitly set the default bucket number 
\(`default_hash_table_bucket_number`\).
    +In an Ambari-managed HAWQ cluster, the default bucket number 
\(`default_hash_table_bucket_number`\) is derived from the number of segment 
nodes. In command-line-managed HAWQ environments, you can use the 
`--bucket_number` option of `hawq init` to explicitly set 
`default_hash_table_bucket_number` during cluster initialization.
     
    -**Note:** For best performance with large tables, the number of buckets 
should not exceed the value of the `default_hash_table_bucket_number` 
parameter. Small tables can use one segment node, `with bucketnum=1`. For 
larger tables, the bucketnum is set to a multiple of the number of segment 
nodes, for the best load balancing on different segment nodes. The elastic 
runtime will attempt to find the optimal number of buckets for the number of 
nodes being processed. Larger tables need more virtual segments , and hence use 
larger numbers of buckets.
    +**Note:** For best performance with large tables, the number of buckets 
should not exceed the value of the `default_hash_table_bucket_number` 
parameter. Small tables can use one segment node, `WITH bucketnum=1`. For 
larger tables, the `bucketnum` is set to a multiple of the number of segment 
nodes, for the best load balancing on different segment nodes. The elastic 
runtime will attempt to find the optimal number of buckets for the number of 
nodes being processed. Larger tables need more virtual segments , and hence use 
larger numbers of buckets.
    --- End diff --
    
    Might as well fix (remove) the space before the comma in "virtual segments , and hence" here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
