Zoltan Chovan created HIVE-17334: ------------------------------------ Summary: Problem in interpreting structure columns in view. Key: HIVE-17334 URL: https://issues.apache.org/jira/browse/HIVE-17334 Project: Hive Issue Type: Improvement Reporter: Zoltan Chovan Priority: Minor
Reproduction steps, to set up the env: {code:java} drop view if exists test_db.view_a; drop view if exists test_db.view_b; drop table if exists test_db.table_a; create database if not exists test_db; create external table if not exists test_db.table_a ( `id` bigint, `users` array<struct<user_id:int,counter:int>> ) partitioned by (p_date int) stored as parquet location '/user/hive/'; {code} With this, he following scenario will fail: {code:java} select t.p_date, t.id, p.user_id, sum(p.counter) as counter, sum(sum(p.counter)) over (partition by t.id) as total_actions from test_db.table_a as t lateral view explode(t.users) users_two as p group by t.p_date, t.id, p.user_id; create view if not exists test_db.view_a as select t.p_date, t.id, p.user_id, sum(p.counter) as counter, sum(sum(p.counter)) over (partition by t.id) as total_actions from test_db.table_a as t lateral view explode(t.users) users_two as p group by t.p_date, t.id, p.user_id; select * from test_db.view_a where p_date = 20170711; {code} The following scenario will succeed: {code:java} create view if not exists test_db.view_b as select base.*, sum(base.counter) over (partition by base.id) as total_actions from ( select t.p_date, t.id, p.user_id, sum(p.counter) as counter from test_db.table_a as t lateral view explode(t.users) users_two as p group by t.p_date, t.id, p.user_id ) as base; select * from test_db.view_b where p_date = 20170711; {code} As you can see the only difference is that the addition of "sum(sum(p.counter)) over (partition by t.id) as total_actions" If the view is created as follows then the query that was breaking works. {code:java} create view if not exists test_db.view_c as select dt.p_date, dt.id, users_two.p.user_id, sum(p.counter) as counter, sum(sum(p.counter)) over (partition by dt.id) as total_actions from test_db.table_a as dt lateral view explode(dt.users) users_two as p group by dt.p_date, dt.id, users_two.p.user_id; select * from test_db.view_c where p_date = 20170711; {code} The workaround is to use the table name along with the column name e.g users_two.p.user_id. Please advise if this is a bug and if it could be fixed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)