[
https://issues.apache.org/jira/browse/HIVE-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pengcheng Xiong updated HIVE-9228:
----------------------------------
Fix Version/s: 1.0.2 (was: 1.2.0)
> Problem with subquery using windowing functions
> -----------------------------------------------
>
> Key: HIVE-9228
> URL: https://issues.apache.org/jira/browse/HIVE-9228
> Project: Hive
> Issue Type: Bug
> Components: PTF-Windowing
> Affects Versions: 0.14.0, 0.13.1, 1.0.0
> Reporter: Aihua Xu
> Assignee: Navis
> Fix For: 1.0.2
>
> Attachments: HIVE-9228.1.patch.txt, HIVE-9228.2.patch.txt,
> HIVE-9228.3.patch.txt, create_table_tab1.sql, tab1.csv
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> The following query with window functions fails, although the inner query
> works fine on its own:
> select col1, col2, col3 from (
>   select col1, col2, col3,
>     count(case when col4 = 1 then 1 end) over (partition by col1, col2) as col5,
>     row_number() over (partition by col1, col2 order by col4) as col6
>   from tab1) t;
> Hive generates an execution plan with 2 jobs:
> 1. The first job calculates the window function for col5.
> 2. The second job calculates the window function for col6 and writes the output.
> According to the plan, the first job writes only the columns (col1, col2, col3,
> col4) to a temp file, since only these columns are used in the later stage.
> However, the PTF operator in the first job actually outputs (_wcol0, col1, col2,
> col3, col4), where _wcol0 is the result of the window function, even though it
> is not used.
> In the second job, the map operator still reads only the 4 columns (col1, col2,
> col3, col4) from the temp file, as the plan specifies. This mismatch between
> what was written and what is read causes the exception.
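
The mismatch described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the function names `write_row` and `read_row`, the `strict` flag, and the positional list format are hypothetical and are not Hive internals. The report also does not say which exception Hive raises, so the `ValueError` here is a stand-in.

```python
# Illustrative model only: these names are hypothetical, not Hive internals.
PLANNED_COLUMNS = ["col1", "col2", "col3", "col4"]            # schema in the plan
EMITTED_COLUMNS = ["_wcol0", "col1", "col2", "col3", "col4"]  # what the PTF operator emits


def write_row(values_by_name, emitted_columns):
    """Job 1: write values positionally, in the order the operator emits them."""
    return [values_by_name[name] for name in emitted_columns]


def read_row(serialized, planned_columns, strict=True):
    """Job 2: read values positionally, trusting the planned schema."""
    if strict and len(serialized) != len(planned_columns):
        raise ValueError(
            f"expected {len(planned_columns)} columns, found {len(serialized)}"
        )
    # Without the check, zip() silently pairs names with the wrong values.
    return dict(zip(planned_columns, serialized))


row = {"_wcol0": 2, "col1": "a", "col2": "b", "col3": "c", "col4": 1}
serialized = write_row(row, EMITTED_COLUMNS)
shifted = read_row(serialized, PLANNED_COLUMNS, strict=False)
```

In this toy model, the lenient read assigns the unused `_wcol0` value to `col1` and shifts every later column by one, while the strict read fails outright. Either way, the reader and writer disagree about the file layout, which is the condition the report identifies.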