The "Hive/LanguageManual/DDL" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL?action=diff&rev1=24&rev2=25

--------------------------------------------------

  [LOCATION hdfs_path]
  [AS select_statement] (Note: this feature is only available on the latest trunk or versions higher than 0.4.0.)
+ CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
+ LIKE existing_table_name
+ [LOCATION hdfs_path]
+
  data_type
  : primitive_type
  | array_type
@@ -46, +50 @@

  }}}
  CREATE TABLE creates a table with given name. An error is thrown if a table with the same name exists. You can use IF NOT EXISTS to skip the error.
- EXTERNAL keyword lets you create a table and provide a LOCATION so that Hive does not use default location for this table. This comes in handy if you already have data generated. When dropping an EXTERNAL table, data in the table is NOT deleted from the file system.
+ The EXTERNAL keyword lets you create a table and provide a LOCATION so that Hive does not use default location for this table. This comes in handy if you already have data generated. When dropping an EXTERNAL table, data in the table is NOT deleted from the file system.
+
+ The LIKE form of CREATE TABLE allows you to copy an existing table definition exactly (without copying its data).
  You can create tables with custom SerDe or using native SerDe. A native SerDe is used if ROW FORMAT is not specified or ROW FORMAT DELIMITED is specified. You can use DELIMITED clause to read delimited files. Use SERDE clause to create a table with custom SerDe. Refer to SerDe section of User Guide for more information on SerDe.
@@ -74, +80 @@

  PARTITIONED BY(dt STRING, country STRING)
  STORED AS SEQUENCEFILE;
  }}}
- The statement above creates page_view table with viewTime, userid, page_url, referrer_url, up columns with a comment. The table is also partitioned and data is stored in sequence files. The data in the files assumed to be field delimited by ctrl-A and row delimited by newline.
+ The statement above creates page_view table with viewTime, userid, page_url, referrer_url, up columns with a comment. The table is also partitioned and data is stored in sequence files. The data format in the files is assumed to be field-delimited by ctrl-A and row-delimited by newline.
  {{{
  CREATE TABLE page_view(viewTime INT, userid BIGINT,
@@ -127, +133 @@

  }}}
  The above CTAS statement creates the target table new_key_value_store with the schema, (new_key DOUBLE, key_value_pair STRING), derived from the results of the SELECT statement. If the SELECT statement does not specify column aliases, the column names will be automatically assigned to _col0, _col1, and _col2 etc. In addition, the new target table is created using a specific SerDe and a storage format independent of the source tables in the SELECT statement.
+ {{{
+ CREATE TABLE empty_key_value_store
+ LIKE key_value_store
+ }}}
+
+ In contrast, the statement above creates a new empty_key_value_store table whose definition exactly matches the existing key_value_store in all particulars other than table name. The new table contains no rows.
+
  ==== Inserting Data Into Bucketed Tables ====
- The CLUSTER BY and SORTED BY creation commands do not effect how data is inserted into a table -- only how it is read. This means that users must actively insert data correctly by specifying the number of reducers to be equal to the number of buckets, and using CLUSTER BY and SORT BY commands in their query.
+ The CLUSTER BY and SORTED BY creation commands do not affect how data is inserted into a table -- only how it is read. This means that users must actively insert data correctly by specifying the number of reducers to be equal to the number of buckets, and using CLUSTER BY and SORT BY commands in their query.
  There is also an [[Hive/LanguageManual/DDL/BucketedTables|example of creating and populating bucketed tables]].
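
The bucketed-insert rule described in the last hunk could look roughly like the sketch below. It is an illustration only, not text from the page: the tables user_info and user_info_bucketed, their columns, and the bucket count of 32 are hypothetical, and the target table is assumed to have been declared with CLUSTERED BY(userid) INTO 32 BUCKETS. The reducer count is set through the Hadoop property mapred.reduce.tasks so that it equals the number of buckets.

{{{
-- Hypothetical tables and columns, used for illustration only.
-- Assumption: user_info_bucketed was created with
--   CLUSTERED BY(userid) INTO 32 BUCKETS
-- Match the number of reducers to the number of buckets.
set mapred.reduce.tasks = 32;

-- CLUSTER BY sends rows with the same userid to the same reducer,
-- so each reducer writes the file for one bucket.
INSERT OVERWRITE TABLE user_info_bucketed
SELECT userid, firstname, lastname
FROM user_info
CLUSTER BY userid;
}}}

If the target table were also declared with SORTED BY on a different column, the query would presumably need DISTRIBUTE BY on the bucketing column plus SORT BY on the sort column instead of CLUSTER BY alone.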
