[ https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993783#comment-12993783 ]
He Yongqiang commented on HIVE-1941: ------------------------------------ How about the question raised in our internal review: the relation between the view's partition key and the underlying tables' partition keys? For example: Is there a use case to support a view def like: CREATE VIEW vp1 PARTITIONED ON (value) AS SELECT key, value FROM src WHERE key=86; Because the view is partitioned on value, which is not the partition key of the underlying table. So the Partition on the view seems not very useful. But if in an example: CREATE VIEW vp2 PARTITIONED ON (ds) AS SELECT src.key, srcpart.value, srcpart.ds FROM src join srcpart on src.key = srcpart.key ; The vp2's ds originate from srcpart. So if there is a partition ds='2011' existing in srcpart, it makes sense to add a partition ds='2011' in vp2. Another example: create table srcpart_1 (key int, value string) partitioned by (ds string); create table srcpart_2 (key int, value string) partitioned by (ds string, hr int); CREATE VIEW vp2 PARTITIONED ON (ds, hr) AS SELECT srcpart_1.key, srcpart_2.value, srcpart_1.ds, srcpart_2.hr FROM srcpart_1 join srcpart_2 on srcpart_1.key = srcpart_2.key and srcpart_1.ds=srcpart_2.ds ; >From this case the ds originate from srcpart_1, but the join condition put the >same ds on srcpart_2. So a query like "select * from vp2 where ds='2000' and >hr=11" is very smooth to run. Some negative examples: create table srcpart_1 (key int, value string) partitioned by (ds string); create table srcpart_2 (key int, value string) partitioned by (ds string, hr int); CREATE VIEW vp2 PARTITIONED ON (ds, hr) AS SELECT srcpart_1.key, srcpart_2.value, srcpart_1.ds, srcpart_2.hr FROM srcpart_1 join srcpart_2 on srcpart_1.key = srcpart_2.key ; We should not support this because it is hard to a partition pruning. Another one: CREATE VIEW vp2 PARTITIONED ON (ds, hr) AS SELECT srcpart_1.key, srcpart_2.value, srcpart_1.ds, srcpart_2.hr FROM srcpart_1 join srcpart_2 on srcpart_1.key = srcpart_2.key and srcpart_1.ds=srcpart_2.hr ; we should probably also throw an error this definition. I am not sure. But i think it makes sense to include the view's partition keys are from the underlying tables. Even not, we should make sure a prefix of the partition keys are from underlying tables. And for partition keys from base tables, we should also make sure they cover a prefix of the base tables' partition keys (not randomly chosen). A big problem of this implicit partition relationship between a partitioned view and the base tables is how to manage the relationships to keep consistent. For example, what if the base table's partition got dropped and the view's partition is still there? I think we should be fine here. > support explicit view partitioning > ---------------------------------- > > Key: HIVE-1941 > URL: https://issues.apache.org/jira/browse/HIVE-1941 > Project: Hive > Issue Type: New Feature > Components: Query Processor > Affects Versions: 0.6.0 > Reporter: John Sichi > Assignee: John Sichi > Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch, > HIVE-1941.4.patch > > > Allow creation of a view with an explicit partitioning definition, and > support ALTER VIEW ADD/DROP PARTITION for instantiating partitions. > For more information, see > http://wiki.apache.org/hadoop/Hive/PartitionedViews -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira