[ https://issues.apache.org/jira/browse/HIVE-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151227#comment-14151227 ]
Prasanth J commented on HIVE-8151:
----------------------------------

[~wzc1989] Two issues with this optimization were found recently. One is this issue (HIVE-8151), where the last record of a particular group-by group ends up in the next partition. The other is HIVE-8162, where, for group-by queries, additional columns are added to the reducer key, which should have been done only for order-by queries. Does either of these issues match your case? I am still not sure how the ClassCastException is related to either of them. Would it be possible for you to check whether HIVE-8151 or HIVE-8162 solves your issue? Alternatively, you can provide a small reproducible case that I can use to make sure this feature works properly.

> Dynamic partition sort optimization inserts record wrongly to partition when
> used with GroupBy
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8151
>                 URL: https://issues.apache.org/jira/browse/HIVE-8151
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0, 0.13.1
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>            Priority: Blocker
>         Attachments: HIVE-8151.1.patch, HIVE-8151.2.patch, HIVE-8151.3.patch
>
>
> HIVE-6455 added the dynamic partition sort optimization. It added a
> startGroup() method to the FileSink operator that looks for changes in the
> reduce key in order to create partition directories. This method, however,
> is not reliable, because the key passed to startGroup() is different from
> the key passed to processOp(). startGroup() is called with the newly changed
> key, whereas processOp() is called with the previously aggregated key. As a
> result, processOp() writes the last row of the previous group as the first
> row of the next group. This happens only when the optimization is used with
> the group-by operator.
> The fix is to not rely on startGroup() and to do the partition directory
> creation in processOp() itself.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
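
The key mismatch described in the issue can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions, not Hive code: the `Sink` class, the `run` driver, and the string-based rows are hypothetical stand-ins for the FileSink operator, the reducer, and the group-by operator. It assumes, as the description says, that `startGroup()` sees the new key before `processOp()` receives the aggregated row of the previous group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionSinkSketch {

    // Hypothetical stand-in for a file sink; NOT Hive's FileSinkOperator.
    static class Sink {
        final boolean pickDirInStartGroup; // true = behaviour described as buggy
        final Map<String, List<String>> partitions = new LinkedHashMap<>();
        String currentDir;

        Sink(boolean pickDirInStartGroup) {
            this.pickDirInStartGroup = pickDirInStartGroup;
        }

        // Called when the reduce key changes, with the NEW key.
        void startGroup(String newKey) {
            if (pickDirInStartGroup) {
                currentDir = newKey;
            }
        }

        // Called with a row that carries its own partition key.
        void processOp(String partitionKey, String row) {
            if (!pickDirInStartGroup) {
                currentDir = partitionKey; // fixed variant: derive directory from the row
            }
            partitions.computeIfAbsent(currentDir, k -> new ArrayList<>()).add(row);
        }
    }

    // Simulated reducer plus group-by: the aggregate of the previous group is
    // forwarded only after the key change has already been signalled.
    static Map<String, List<String>> run(boolean buggy, String[][] input) {
        Sink sink = new Sink(buggy);
        String prevKey = null;
        int count = 0;
        for (String[] rec : input) {
            String key = rec[0];
            if (!key.equals(prevKey)) {
                sink.startGroup(key);
                if (prevKey != null) {
                    sink.processOp(prevKey, prevKey + ":count=" + count);
                }
                prevKey = key;
                count = 0;
            }
            count++;
        }
        if (prevKey != null) {
            sink.processOp(prevKey, prevKey + ":count=" + count); // flush last group
        }
        return sink.partitions;
    }

    public static void main(String[] args) {
        String[][] input = {{"a"}, {"a"}, {"b"}, {"b"}, {"b"}, {"c"}};
        System.out.println("startGroup-based: " + run(true, input));
        System.out.println("processOp-based:  " + run(false, input));
    }
}
```

In the startGroup-based variant the aggregate for group `a` lands in partition `b`, and the aggregate for `b` lands in `c`. In the processOp-based variant each aggregate lands in its own partition, which mirrors the direction of the fix described above.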