Harish Butani created HIVE-10585:
------------------------------------

             Summary: Range based Windowing is handled incorrectly for String types
                 Key: HIVE-10585
                 URL: https://issues.apache.org/jira/browse/HIVE-10585
             Project: Hive
          Issue Type: Bug
          Components: PTF-Windowing
            Reporter: Harish Butani

Thanks to [~yhuai] for pointing this out.

I think the intent for ordinal datatypes (like string) was to measure distance as the number of changed values. So '2 preceding' would mean: go back until you have reached the 2nd different value from the value in the 'current' row.

But this is not the way it is implemented. StringValueBoundaryScanner simply ignores the preceding amount.

Here is an example from windowing.q that is not handled correctly:
{noformat}
-- 31. testWindowCrossReference
select p_mfgr, p_name, p_size,
  sum(p_size) over w1 as s1,
  sum(p_size) over w2 as s2
from part
window w1 as (partition by p_mfgr order by p_name
              range between 2 preceding and 2 following),
       w2 as w1;
{noformat}
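
To make the intended "number of changed values" semantics concrete, here is a minimal sketch in Python. It is not Hive code and does not reflect how StringValueBoundaryScanner is written; the function name {{distinct_value_range}} and its parameters are hypothetical. It assumes the rows of one partition are already sorted by the ORDER BY column, and that the row holding the Nth different value, together with all of its peers, is included in the window.

```python
from typing import List, Tuple


def distinct_value_range(values: List[str], current: int,
                         preceding: int, following: int) -> Tuple[int, int]:
    """Return the inclusive (start, end) row indexes of the window for row `current`.

    `values` holds the ORDER BY column for one partition, already sorted.
    Distance is counted in value changes, not in rows (an assumption about
    the intended semantics, not Hive's actual behavior).
    """
    # Walk backward, counting each time the value changes.
    start = current
    changes = 0
    while start > 0:
        if values[start - 1] != values[start]:
            if changes == preceding:
                break  # one more step would cross into the (N+1)th different value
            changes += 1
        start -= 1

    # Walk forward the same way.
    end = current
    changes = 0
    while end < len(values) - 1:
        if values[end + 1] != values[end]:
            if changes == following:
                break
            changes += 1
        end += 1

    return start, end


if __name__ == "__main__":
    names = ["a", "a", "b", "c", "c", "d", "e"]
    # Row 3 holds "c"; with 1 preceding and 0 following the window covers "b" and both "c" rows.
    print(distinct_value_range(names, 3, 1, 0))
```

Under this reading, {{range between 2 preceding and 2 following}} on p_name would span up to two distinct names on either side of the current row's name, plus all rows that share the current name.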