Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/Select" page has been changed by Ning Zhang.
http://wiki.apache.org/hadoop/Hive/LanguageManual/Select?action=diff&rev1=10&rev2=11

--------------------------------------------------

  }}}
  === Partition Based Queries ===
- In general, a SELECT query scans the entire table (other than for [[Hive/LanguageManual/Sampling|sampling]]). If a table created using the [[Hive/LanguageManual/DDL|PARTITIONED BY]] clause, a query can do '''input pruning''' and scan only a fraction of the table relevant to the query. Hive currently does input pruning only if the partition predicates are specified in the WHERE clause closest to the table_reference in the FROM clause. For example, if table page_views is partitioned on column date, the following query retrieves rows for just one day 2008-03-31.
+ In general, a SELECT query scans the entire table (other than for [[Hive/LanguageManual/Sampling|sampling]]). If a table created using the [[Hive/LanguageManual/DDL|PARTITIONED BY]] clause, a query can do '''partition pruning''' and scan only a fraction of the table relevant to the partitions specified by the query. Hive currently does partition pruning if the partition predicates are specified in the WHERE clause or the ON clause in a JOIN. For example, if table page_views is partitioned on column date, the following query retrieves rows for just days between 2008-03-01 and 2008-03-31.
  {{{
  SELECT page_views.*
  FROM page_views
  WHERE page_views.date >= '2008-03-01' AND page_views.date <= '2008-03-31'
  }}}
+
+ If a table page_views is joined with another table dim_users, you can specify a range of partitions in the ON clause as follows:
+ {{{
+ SELECT page_views.*
+ FROM page_views JOIN dim_users
+ ON (page_views.user_id = dim_users.id AND page_views.date >= '2008-03-01' AND page_views.date <= '2008-03-31')
+ }}}
+
+ * See also [[Hive/LanguageManual/GroupBy|Group By]]
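
The revised paragraph says partition pruning applies when the partition predicates appear in the WHERE clause or in the ON clause of a JOIN. A minimal sketch of the WHERE-clause form of the join, reusing the page_views and dim_users tables and columns from the examples in the change, might look like the following. For an inner join the two forms return the same rows; whether a particular Hive release prunes partitions identically for both is an assumption worth confirming against the query plan.

{{{
-- Sketch only: same inner join, with the partition range moved from ON to WHERE.
-- Table and column names are taken from the examples in the page change.
SELECT page_views.*
FROM page_views JOIN dim_users
  ON (page_views.user_id = dim_users.id)
WHERE page_views.date >= '2008-03-01' AND page_views.date <= '2008-03-31'
}}}

For outer joins the ON and WHERE placements are not interchangeable, so this rewrite should be limited to inner joins.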
