Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by RodrigoSchmidt:
http://wiki.apache.org/hadoop/Hive/GettingStarted

------------------------------------------------------------------------------
  * $ ant package
  * $ cd build/dist
  * $ ls
+ * README.txt
- o README.txt bin conf examples lib
-
- * bin/ (all the shell scripts)
+ * bin/ (all the shell scripts)
- * lib/ (required jar files)
+ * lib/ (required jar files)
- * conf/ (configuration files)
+ * conf/ (configuration files)
- * examples/ (sample input and query files)
+ * examples/ (sample input and query files)
+ In the rest of the page, we use build/dist and <install-dir> interchangeably.
  [wiki:/EclipseSetup Instructions] to setup eclipse for hive development.

@@ -43, +43 @@
  To use hive command line interface (cli) from the shell:
- ''$ bin/hive''
+ * $ bin/hive
  == Using Hive ==
  === Configuration management overview ===
- - hive configuration is stored in <install-dir>/conf/hive-default.xml
+ - hive default configuration is stored in <install-dir>/conf/hive-default.xml
- and log4j in hive-log4j.properties
- - hive configuration is an overlay on top of hadoop - meaning the
- hadoop configuration variables are inherited by default.
+ Configuration variables can be changed by (re-)defining them in <install-dir>/conf/hive-site.xml
+ - log4j configuration is stored in <install-dir>/conf/hive-log4j.properties
+
+ - hive configuration is an overlay on top of hadoop - meaning the hadoop configuration variables are inherited by default.
+ - hive configuration can be manipulated by:
+ * editing hive-site.xml and defining any desired variables (including hadoop variables) in it
- o editing hive-default.xml and defining any desired variables
- (including hadoop variables) in it
- o from the cli using the set command (see below)
+ * from the cli using the set command (see below)
-
- o by invoking hive using the syntax:
+ * by invoking hive using the syntax:
- * bin/hive -hiveconf x1=y1 -hiveconf x2=y2
+ * $ bin/hive -hiveconf x1=y1 -hiveconf x2=y2
- this sets the variables x1 and x2 to y1 and y2
+ this sets the variables x1 and x2 to y1 and y2 respectively
  === Error Logs ===
  Hive uses log4j for logging. By default logs are not emitted to the

@@ -83, +83 @@
  hive> CREATE TABLE pokes (foo INT, bar STRING);
- Creates a table called pokes with two columns, first being an
+ Creates a table called pokes with two columns, the first being an integer and the other a string
- integer and other a string columns
  hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
- Creates a table called pokes with two columns and a partition column
+ Creates a table called invites with two columns and a partition column
- called ds. The partition column is a virtual column it is not part
+ called ds. The partition column is a virtual column. It is not part
  of the data itself but is derived from the partition that a particular dataset is loaded into.
- By default tables are assumed to be of text input format and the
+ By default, tables are assumed to be of text input format and the
- delimiters are assumed to be ^A(ctrl-a). We will be soon publish additional
+ delimiters are assumed to be ^A(ctrl-a). We will soon publish additional
- commands/recipes to add binary (sequencefiles) data and configurable
+ commands/recipes to add binary data (sequence files) and configurable
  delimiters etc.
  hive> SHOW TABLES;

@@ -113, +112 @@
  shows the list of columns
- Altering tables.
- Table name can be changed and additional columns can be dropped
+ As for altering tables, table names can be changed and additional columns can be dropped:
  hive> ALTER TABLE pokes ADD COLUMNS (new_col INT);

@@ -121, +120 @@
  hive> ALTER TABLE events RENAME TO 3koobecaf;
- Dropping tables
+ Dropping tables:
+ hive> DROP TABLE pokes;
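
The configuration overview in this change says variables can be (re-)defined in <install-dir>/conf/hive-site.xml. As a minimal sketch, assuming hive-site.xml follows the usual Hadoop configuration file layout, an override might look like the fragment below. The names x1 and y1 are the placeholders from the -hiveconf example in this notification, not real Hive variables.

```xml
<?xml version="1.0"?>
<!-- Sketch of <install-dir>/conf/hive-site.xml; x1 and y1 are placeholders, not real Hive settings -->
<configuration>
  <property>
    <!-- name of the hive (or inherited hadoop) variable to redefine -->
    <name>x1</name>
    <!-- value to use in place of the default from hive-default.xml -->
    <value>y1</value>
  </property>
</configuration>
```

The page's per-invocation alternative for the same placeholder is `$ bin/hive -hiveconf x1=y1`, which leaves the files under conf/ untouched.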
