[ https://issues.apache.org/jira/browse/HIVE-5358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779645#comment-13779645 ]
Chun Chen commented on HIVE-5358:
---------------------------------

Sorry for misunderstanding the intention of checkExprs in ReduceSinkDeDuplication.

[~ashutoshc] I will try to preserve the order of the key columns on the RS in those test cases.

{code}
select c3, c2 from (select c1, c2, c3 from t1 order by c1, c2, c3) t group by c3, c2;
{code}

[~yhuai] I don't understand what you mean about the SQL above. If we use [c3, c2] as the key columns, what is the problem with that?

> ReduceSinkDeDuplication should ignore column orders when check overlapping
> part of keys between parent and child
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-5358
>                 URL: https://issues.apache.org/jira/browse/HIVE-5358
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Chun Chen
>            Assignee: Chun Chen
>         Attachments: D13113.1.patch, HIVE-5358.2.patch, HIVE-5358.patch
>
>
> {code}
> select key, value from (select key, value from src group by key, value) t
> group by key, value;
> {code}
> This can be optimized by ReduceSinkDeDuplication
> {code}
> select key, value from (select key, value from src group by key, value) t
> group by value, key;
> {code}
> However the sql above can't be optimized by ReduceSinkDeDuplication currently
> due to different column orders of parent and child operator.
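
To make the distinction in the issue description concrete, here is a toy sketch of the two ways the parent and child key lists could be compared. It is not Hive's code: the function names `same_keys_ordered` and `same_keys_unordered` are hypothetical, and plain lists of column names stand in for the ReduceSink key expressions.

```python
def same_keys_ordered(parent_keys, child_keys):
    """Order-sensitive comparison: the key lists must match position by position."""
    return list(parent_keys) == list(child_keys)


def same_keys_unordered(parent_keys, child_keys):
    """Order-insensitive comparison: the key lists must hold the same columns,
    in any order (sorting both sides makes the comparison ignore position)."""
    return sorted(parent_keys) == sorted(child_keys)


# Hypothetical stand-ins for the two queries in the issue description.
parent = ["key", "value"]          # inner:  group by key, value
child_same = ["key", "value"]      # outer:  group by key, value
child_swapped = ["value", "key"]   # outer:  group by value, key

print(same_keys_ordered(parent, child_same))       # True
print(same_keys_ordered(parent, child_swapped))    # False
print(same_keys_unordered(parent, child_swapped))  # True
```

The sketch only models the key comparison. It does not say whether merging is safe when the parent is an `order by`, as in the `c1, c2, c3` query above, where the order of the key columns may itself matter.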