[Hadoop Wiki] Update of "Hive/LanguageManual/DDL" by Ning Zhang

Apache Wiki Fri, 09 Oct 2009 16:49:21 -0700

The "Hive/LanguageManual/DDL" page has been changed by Ning Zhang:
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL?action=diff&rev1=18&rev2=19

  
  Table names and column names are case insensitive but SerDe and property 
names are case sensitive.
  
- Tables can also be created and populated by the results of a query in one 
CTAS (create-table-as-select) statement. There are two parts in CTAS, the 
select part can be any select statement supported by HiveQL. The create part of 
the CTAS takes the result schema (column names are aliases in the select clause 
and data types are derived from the select expressions) from the select part 
and create the target table with other properties (such as SerDe and storage 
format). The only restrictions in the CTAS is that the target table cannot be a 
partitioned table nor an external table. In addition, the table created by CTAS 
is atomic, meaning that the table is not seen by other users until all the 
result of the SELECT part is finished and populated. So other users will either 
see the table with the total results or will not see the table at all.
+ Tables can also be created and populated by the results of a query in one 
CTAS (create-table-as-select) statement. The table created by CTAS is atomic, 
meaning that the table is not seen by other users until all the query results 
are populated. So other users will either see the table with the complete 
results of the query or will not see the table at all.
+ 
+ There are two parts in CTAS. The SELECT part can be any 
[[Hive/LanguageManual/Select|SELECT statement]] supported by HiveQL. The CREATE 
part takes the resulting schema from the SELECT part and creates the target 
table with other table properties such as the SerDe and storage format. The 
only restriction in CTAS is that the target table can be neither a partitioned 
table nor an external table. 
  
  Examples:
  
@@ -132, +134 @@

  SORT BY new_key, key_value_pair;
  }}}
  
- The above CTAS statement create the target table new_key_value_store with the 
schema derived from the results of the SELECT statement. So the schema of the 
table new_key_value_store will be (new_key DOUBLE, key_value_pair STRING). In 
addition, the new target table is using a specific SerDe and a storage format 
independent of the source tables in the SELECT statement. 
+ The above CTAS statement creates the target table new_key_value_store with 
the schema (new_key DOUBLE, key_value_pair STRING) derived from the results of 
the SELECT statement. If the SELECT statement does not specify column aliases, 
the columns are automatically named _col0, _col1, _col2, and so on. In 
addition, the new target table is created using a specific SerDe and a storage 
format independent of the source tables in the SELECT statement. 
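
As a minimal sketch of the aliasing rule described above, the following CTAS omits column aliases. The source table key_value_store (with columns key and value) and the target name unaliased_ctas are hypothetical, chosen only for illustration. DESCRIBE shows which names a given Hive version actually generates.

```sql
-- Hypothetical source table: key_value_store(key, value).
-- No aliases are given, so Hive has to generate the column names itself.
CREATE TABLE unaliased_ctas AS
SELECT (key % 1024), concat(key, value)
FROM key_value_store;

-- Inspect the generated column names for the two expressions.
DESCRIBE unaliased_ctas;
```

Adding explicit aliases in the SELECT clause, as the new_key_value_store example does, avoids depending on generated names.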
  
  ==== Inserting Data Into Bucketed Tables ====
  The CLUSTERED BY and SORTED BY creation commands do not affect how data is 
inserted into a table -- only how it is read.  This means that users must 
actively insert data correctly by setting the number of reducers equal to the 
number of buckets, and by using CLUSTER BY and SORT BY commands in their 
query.
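
The requirement above can be sketched as follows. This is an illustrative example only. The tables user_info and user_info_bucketed, their columns, and the bucket count of 32 are assumptions, not taken from the page. The sketch relies on mapred.reduce.tasks being the Hadoop property that controls the number of reducers.

```sql
-- Hypothetical target table, declared with 32 buckets on user_id.
CREATE TABLE user_info_bucketed (user_id BIGINT, firstname STRING, lastname STRING)
CLUSTERED BY (user_id) SORTED BY (user_id) INTO 32 BUCKETS;

-- Use as many reducers as there are buckets.
set mapred.reduce.tasks = 32;

-- CLUSTER BY distributes and sorts rows by user_id, so each reducer
-- writes the rows for one bucket. user_info is a hypothetical source table.
FROM user_info
INSERT OVERWRITE TABLE user_info_bucketed
SELECT user_id, firstname, lastname
CLUSTER BY user_id;
```

If the reducer count and the bucket count differ, the files written will not line up with the declared buckets, which is why the two values are kept equal here.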
