Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "Hive/PartitionedViews" page has been changed by JohnSichi. http://wiki.apache.org/hadoop/Hive/PartitionedViews?action=diff&rev1=1&rev2=2 -------------------------------------------------- = Use Cases = - * An administrator wants to create a set of views as a table/column renaming layer on top of an existing set of base tables, without disturbing the ETL processes which load those tables. To read-only users, the views should behave exactly the same as the underlying tables in every way. Among other things, this means users should be able to browse available partitions. + 1. An administrator wants to create a set of views as a table/column renaming layer on top of an existing set of base tables, without disturbing the ETL processes which load those tables. To read-only users, the views should behave exactly the same as the underlying tables in every way. Among other things, this means users should be able to browse available partitions. - * A base table is partitioned on columns (ds,hr) for date and hour. Besides this fine-grained partitioning, users would also like to see a virtual table of coarse-grained (date-only) partitioning in which the partition for a given date only appears once all of the hour-level partitions of that day have been fully loaded. + 1. A base table is partitioned on columns (ds,hr) for date and hour. Besides this fine-grained partitioning, users would also like to see a virtual table of coarse-grained (date-only) partitioning in which the partition for a given date only appears after all of the hour-level partitions of that day have been fully loaded. - * A view is defined on a complex join+union+aggregation of a number of underlying base tables and other views, all of which are themselves partitioned. The top-level view should also be partitioned accordingly, with a new partition not appearing until corresponding partitions have been loaded for all of the underlying tables. + 1. A view is defined on a complex join+union+aggregation of a number of underlying base tables and other views, all of which are themselves partitioned. The top-level view should also be partitioned accordingly, with a new partition not appearing until corresponding partitions have been loaded for all of the underlying tables. + = Approaches = + + 1. One possible approach mentioned in [[https://issues.apache.org/jira/browse/HIVE-1079|HIVE-1079]] is to infer view partitions automatically based on the partitions of the underlying tables. A command such as SHOW PARTITIONS could then synthesize virtual partition descriptors on the fly. This is fairly easy to do for use case #1, but potentially very difficult for use cases #2 and #3. So for now, we are punting on this approach. + 1. Instead, we will require users to explicitly declare view partitioning as part of CREATE VIEW, and explicitly manage partition metadata via ALTER VIEW {ADD|DROP} PARTITION. This allows all of the use cases to be satisfied (while placing more burden on the user, and taking up more metastore space). +
