[Hadoop Wiki] Update of "Hive/Tutorial" by VijendarGanta

Apache Wiki Wed, 21 Jan 2009 20:38:52 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The following page has been changed by VijendarGanta:
http://wiki.apache.org/hadoop/Hive/Tutorial

------------------------------------------------------------------------------
  
  == Language capabilities ==
  Hive query language provides the basic SQL like operations. These operations work on tables or partitions. These operations are:
- * Ability to filter rows from a table using a where clause.
+     * Ability to filter rows from a table using a where clause.
- * Ability to select certain columns from the table using a select clause.
+     * Ability to select certain columns from the table using a select clause.
- * Ability to do equi-joins between two tables.
+     * Ability to do equi-joins between two tables.
- * Ability to evaluate aggregations on multiple "group by" columns for the data stored in a table.
+     * Ability to evaluate aggregations on multiple "group by" columns for the data stored in a table.
- * Ability to store the results of a query into another table.
+     * Ability to store the results of a query into another table.
- * Ability to download the contents of a table to a local (e.g., nfs) directory.
+     * Ability to download the contents of a table to a local (e.g., nfs) directory.
- * Ability to store the results of a query in a hadoop dfs directory.
+     * Ability to store the results of a query in a hadoop dfs directory.
- * Ability to manage tables and partitions (create, drop and alter).
+     * Ability to manage tables and partitions (create, drop and alter).  
- * Ability to plug in custom scripts in the language of choice for custom map/reduce jobs.
+     * Ability to plug in custom scripts in the language of choice for custom map/reduce jobs.
  
  = Usage and Examples =
  The following examples highlight some salient features of the system. A detailed set of query test cases can be found at [http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/test/queries/clientpositive/ Hive Query Test Cases] and the corresponding results can be found at [http://svn.apache.org/viewvc/hadoop/hive/trunk/ql/src/test/results/clientpositive/ Query Test Case Results]
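
  As a rough illustration of the capabilities listed under "Language capabilities" (filtering, column selection, equi-joins, group-by aggregation, and storing query results in another table), the sketch below runs equivalent standard SQL against an in-memory SQLite database from Python. This is only a stand-in and not Hive itself: Hive's own statements for storing or exporting results differ, and the users table, the sample rows, and the home_views table are hypothetical names introduced for this example. Only the page_view column names come from the tutorial.

```python
import sqlite3


def run_demo():
    """Run a few SQL-like operations on a toy page_view table (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # page_view columns follow the tutorial; the users table is hypothetical.
    cur.execute(
        "CREATE TABLE page_view (viewTime INTEGER, userid INTEGER, "
        "page_url TEXT, referrer_url TEXT, ip TEXT)"
    )
    cur.execute("CREATE TABLE users (id INTEGER, gender TEXT)")
    cur.executemany(
        "INSERT INTO page_view VALUES (?, ?, ?, ?, ?)",
        [
            (1, 10, "/home", "/search", "1.1.1.1"),
            (2, 10, "/about", "/home", "1.1.1.1"),
            (3, 20, "/home", "/search", "2.2.2.2"),
        ],
    )
    cur.executemany("INSERT INTO users VALUES (?, ?)", [(10, "F"), (20, "M")])

    # Filter rows with a WHERE clause and select only certain columns.
    filtered = cur.execute(
        "SELECT page_url, ip FROM page_view WHERE userid = 10 ORDER BY viewTime"
    ).fetchall()

    # Equi-join two tables and aggregate on multiple GROUP BY columns.
    counts = cur.execute(
        "SELECT u.gender, pv.page_url, COUNT(*) "
        "FROM page_view pv JOIN users u ON pv.userid = u.id "
        "GROUP BY u.gender, pv.page_url "
        "ORDER BY u.gender, pv.page_url"
    ).fetchall()

    # Store the results of a query in another table.
    cur.execute(
        "CREATE TABLE home_views AS "
        "SELECT userid, viewTime FROM page_view WHERE page_url = '/home'"
    )
    stored = cur.execute(
        "SELECT userid, viewTime FROM home_views ORDER BY userid"
    ).fetchall()

    conn.close()
    return filtered, counts, stored


if __name__ == "__main__":
    for result in run_demo():
        print(result)
```

  The same queries would need to be adapted to Hive's syntax before being run against a real page_view table.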
@@ -183, +183 @@

  == Creating Tables ==
  An example statement that would create the page_view table mentioned above would be like:
  {{{     
- 
      CREATE TABLE page_view(viewTime INT, userid BIGINT,
                      page_url STRING, referrer_url STRING, 
                      ip STRING COMMENT 'IP Address of the User') 
@@ -212, +211 @@

  It is also a good idea to bucket the tables on certain columns so that efficient sampling queries can be executed against the data set (note: If bucketing is absent, random sampling can still be done on the table). The following example illustrates the case of the page_view table which is bucketed on userid column: 
  {{{     
      CREATE TABLE page_view(viewTime INT, userid BIGINT,
- 
                      page_url STRING, referrer_url STRING, 
                      ip STRING COMMENT 'IP Address of the User') 
      COMMENT 'This is the page view table'
