Prasanth J created HIVE-6883:
--------------------------------

             Summary: Dynamic partitioning optimization does not honor sort order or order by
                 Key: HIVE-6883
                 URL: https://issues.apache.org/jira/browse/HIVE-6883
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.13.0
            Reporter: Prasanth J
            Assignee: Prasanth J
            Priority: Critical


The HIVE-6455 patch does not honor the sort order of the output table or the 
ORDER BY of the select statement. The first problem occurs because 
numDistributionKey in ReduceSinkDesc is set incorrectly: it does not take the 
sort columns into account, so RSOp sets the sort columns to null in the Key. 
Because nulls take the place of the sort columns in the Key, the sort columns 
in the Value are not sorted. 
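
The effect of nulls in the Key can be illustrated with a small toy model. This 
is only a sketch, not Hive's actual code: build_key, shuffle_sort, 
populated_fields and the two-column row layout are hypothetical names chosen 
for illustration. It assumes the shuffle key is (partition column, sort 
column) and that only the first populated_fields positions receive real 
values.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, int]  # (sort_col, partition_col); hypothetical toy schema


def build_key(row: Row, populated_fields: int) -> Tuple[Optional[int], ...]:
    """Build a (partition_col, sort_col) shuffle key, filling only the first
    `populated_fields` positions; the remaining positions stay None."""
    sort_col, partition_col = row
    full_key = (partition_col, sort_col)
    return tuple(v if i < populated_fields else None
                 for i, v in enumerate(full_key))


def shuffle_sort(rows: List[Row], populated_fields: int) -> List[Row]:
    # sorted() is stable: rows whose keys compare equal keep their input order,
    # so a None in the sort position leaves the sort column unordered.
    return sorted(rows, key=lambda r: build_key(r, populated_fields))


rows = [(30, 1), (10, 2), (20, 1), (40, 2), (5, 1)]

# Sort column replaced by None in the key: grouped by partition only.
broken = shuffle_sort(rows, populated_fields=1)

# Sort column present in the key: ordered within each partition.
fixed = shuffle_sort(rows, populated_fields=2)
```

In this model, broken groups rows by partition but leaves the sort column in 
input order, while fixed is ordered on the sort column within each partition.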

The other issue is that ORDER BY columns are not honored during insertion. For 
example:
{code}
insert overwrite table over1k_part_orc partition(ds="foo", t) select si,i,b,f,t 
from over1k_orc where t is null or t=27 order by si;
{code}

The select query performs the ORDER BY on column 'si' in the first MR job. The 
following MR job (inserted by HIVE-6455) then sorts the input data on the 
dynamic partition column 't' without taking into account the already sorted 
'si' column. This results in out-of-order insertion for the 'si' column.
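
A rough simulation of the two jobs shows how the 'si' order can be lost. It is 
a sketch under stated assumptions, not a description of Hive or Hadoop 
internals: second_job, t_key, t_si_key and si_values are hypothetical helpers, 
the sample rows are made up, and the reducer is assumed to receive map outputs 
in no guaranteed order (modeled here by reversing them).

```python
import heapq
from typing import List, Optional, Tuple

Row = Tuple[int, Optional[int]]  # (si, t); t=None models a NULL partition value


def t_key(row: Row):
    # Order by t only; NULLs first (the NULL placement is an arbitrary choice).
    return (row[1] is not None, row[1] or 0)


def t_si_key(row: Row):
    # Order by t, then by si within each t.
    return t_key(row) + (row[0],)


def second_job(first_job_output: List[Row], key, num_mappers: int = 2) -> List[Row]:
    # Split the first job's output across mappers; each mapper sorts its share.
    size = -(-len(first_job_output) // num_mappers)  # ceiling division
    runs = [sorted(first_job_output[i:i + size], key=key)
            for i in range(0, len(first_job_output), size)]
    # Assumption: map outputs reach the reducer in no guaranteed order,
    # modeled here by reversing them before the merge.
    return list(heapq.merge(*reversed(runs), key=key))


def si_values(rows: List[Row], t: Optional[int]) -> List[int]:
    # Collect the si values written to the partition with the given t.
    return [si for si, row_t in rows if row_t == t]


# Output of the first job: already ordered by si.
first_job_output = [(1, 27), (2, None), (3, 27), (4, None), (5, 27), (6, 27)]

by_t_only = second_job(first_job_output, key=t_key)
by_t_then_si = second_job(first_job_output, key=t_si_key)
```

Comparing si_values(by_t_only, 27) with si_values(by_t_then_si, 27) shows the 
difference: sorting on 't' alone lets rows from different map outputs 
interleave, while including 'si' in the sort key keeps each partition ordered.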


