On Wed, Dec 15, 2010 at 4:52 PM, Mark <[email protected]> wrote:
> Can someone explain what partitioning is and why it would be used.. example?
> Thanks
>

Partitioning is a physical and logical division of a table's data.
Each partition is stored in its own directory, and the query planner
can use partition columns that appear in the WHERE clause to prune
data that Hive does not need to process.

For example, if you partition your table by day, you can write queries
such as SELECT count(1) FROM table WHERE day=20100101. Hive will only
use that single partition as input, rather than the entire table.
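
As a rough sketch of how that looks end to end (the table name
page_views and its columns are made up for illustration, they are not
from your schema):

```sql
-- Hypothetical table: 'day' is the partition column, so it is declared
-- in PARTITIONED BY and not in the regular column list.
CREATE TABLE page_views (user_id STRING, url STRING)
PARTITIONED BY (day INT);

-- Register one partition for a given day. Its files live in a
-- subdirectory of the table's directory named day=20100101.
ALTER TABLE page_views ADD PARTITION (day=20100101);

-- The filter is on the partition column, so only the day=20100101
-- partition should be read as input.
SELECT count(1) FROM page_views WHERE day = 20100101;
```

A filter on a regular column such as url would not prune anything;
only predicates on the partition column let Hive skip data.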

Generally, you want to avoid both too many small partitions and too
few partitions.

http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Add_Partitions
