Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by RaghothamMurthy:
http://wiki.apache.org/hadoop/Hive/LanguageManual/Select

------------------------------------------------------------------------------
  == Select Syntax ==
  {{{
- SELECT [DISTINCT] select_expr, select_expr, ...
+ SELECT [ALL | DISTINCT] select_expr, select_expr, ...
  FROM table_reference
  [WHERE where_condition]
  [GROUP BY col_list]
@@ -17, +17 @@

  {{{
  SELECT * FROM t1
  }}}
+ * Where clause - The where condition is a [wiki:Self:Hive/LanguageManual/Types boolean] [wiki:Self:Hive/LanguageManual/Expressions expression]. For example, the following query returns only those sales records which have an amount greater than 10 from the US region. Hive does not support IN, EXISTS or subqueries in the WHERE clause.
  {{{
  SELECT * FROM sales WHERE amount > 10 AND region = "US"
+ }}}
+
+ * The ALL and DISTINCT options specify whether duplicate rows should be returned. If none of these options are given, the default is ALL (all matching rows are returned). DISTINCT specifies removal of duplicate rows from the result set.
+ {{{
+ hive> SELECT col1, col2 FROM t1
+ 1 3
+ 1 3
+ 1 4
+ 2 5
+ hive> SELECT DISTINCT col1, col2 FROM t1
+ 1 3
+ 1 4
+ 2 5
+ hive> SELECT DISTINCT col1 FROM t1
+ 1
+ 2
  }}}

  * Partition based queries. In general, a SELECT query scans the entire table (other than for [wiki:Self:Hive/LanguageManual/Sampling sampling]). If a table created using the [wiki:Self:Hive/LanguageManual/DDL PARTITIONED BY] clause, a query can do '''input pruning''' and scan only a fraction of the table relevant to the query. For example, if table page_views is partitioned on column date, the following query retrieves rows for just one day 2008-03-31.
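
Separately from the partition example, the ALL and DISTINCT behaviour added in this change can be checked without a Hive installation. The sketch below is only an illustration: it uses Python's standard sqlite3 module as a stand-in for Hive (an assumption; SQLite is not Hive, but it accepts the same SELECT [ALL | DISTINCT] forms for this case). The function name distinct_demo is hypothetical, the table t1 and its four rows mirror the example in the diff, and the ORDER BY clauses are added only to make the row order deterministic.

```python
import sqlite3


def distinct_demo():
    """Run the three example queries against an in-memory copy of t1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (col1 INTEGER, col2 INTEGER)")
    # Same rows as the wiki example, including the duplicate (1, 3).
    conn.executemany(
        "INSERT INTO t1 VALUES (?, ?)",
        [(1, 3), (1, 3), (1, 4), (2, 5)],
    )

    # ALL is the default, so duplicates are kept.
    all_rows = conn.execute(
        "SELECT ALL col1, col2 FROM t1 ORDER BY col1, col2"
    ).fetchall()

    # DISTINCT applies to the whole selected row (col1, col2).
    distinct_pairs = conn.execute(
        "SELECT DISTINCT col1, col2 FROM t1 ORDER BY col1, col2"
    ).fetchall()

    # With a single column selected, DISTINCT collapses on col1 alone.
    distinct_col1 = conn.execute(
        "SELECT DISTINCT col1 FROM t1 ORDER BY col1"
    ).fetchall()

    conn.close()
    return all_rows, distinct_pairs, distinct_col1


if __name__ == "__main__":
    for label, rows in zip(("ALL", "DISTINCT pairs", "DISTINCT col1"), distinct_demo()):
        print(label, rows)
```

Note that DISTINCT compares entire result rows, which is why (1, 3) and (1, 4) both survive the second query but collapse to a single 1 in the third.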
