[
https://issues.apache.org/jira/browse/HIVE-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151227#comment-14151227
]
Prasanth J commented on HIVE-8151:
----------------------------------
[~wzc1989] Two issues with this optimization were found recently. One is this
issue (HIVE-8151), where the last record of a group ends up in the next
partition. The other is HIVE-8162, where for group-by queries additional
columns are added to the reducer key, which should have been done only for
order-by queries. Does either of these fixes resolve your case? I am still not
sure how the ClassCastException is related to either of them. Would it be
possible for you to check whether the patch for HIVE-8151 or HIVE-8162 solves
your issue? Alternatively, you can provide me with a small reproducible case
that I can use to make sure this feature works properly.
> Dynamic partition sort optimization inserts record wrongly to partition when
> used with GroupBy
> ----------------------------------------------------------------------------------------------
>
> Key: HIVE-8151
> URL: https://issues.apache.org/jira/browse/HIVE-8151
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.14.0, 0.13.1
> Reporter: Prasanth J
> Assignee: Prasanth J
> Priority: Blocker
> Attachments: HIVE-8151.1.patch, HIVE-8151.2.patch, HIVE-8151.3.patch
>
>
> HIVE-6455 added the dynamic partition sort optimization. It added a
> startGroup() method to the FileSink operator to detect changes in the reduce
> key and create partition directories. This method is not reliable, however,
> because the key passed to startGroup() differs from the key seen by
> processOp(): startGroup() is called with the newly changed key, whereas
> processOp() is called with the previously aggregated key. As a result,
> processOp() writes the last row of the previous group as the first row of the
> next group. This happens only when the optimization is used with a group-by
> operator.
> The fix is to not rely on startGroup() and to do the partition directory
> creation in processOp() itself.
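
The ordering problem described above can be illustrated with a small toy model. This is only a sketch and not Hive code: the class PartitionSinkSketch, the Event type, and both sink methods are hypothetical names made up for illustration. The event order (the next group's start arriving before the previous group's aggregated row) is an assumption taken from the description. One sink picks the output directory when a group starts, the other derives it from each row as the row is written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are Hive classes.
class PartitionSinkSketch {

    /** Hypothetical event reaching the sink: a group start or an aggregated row. */
    static final class Event {
        final boolean startGroup;
        final String partition; // new group's key, or the row's own partition value
        final String payload;   // null for group-start events

        private Event(boolean startGroup, String partition, String payload) {
            this.startGroup = startGroup;
            this.partition = partition;
            this.payload = payload;
        }

        static Event start(String partition) {
            return new Event(true, partition, null);
        }

        static Event row(String partition, String payload) {
            return new Event(false, partition, payload);
        }
    }

    /** Assumed ordering: the aggregated row of part1 arrives after part2 has started. */
    static List<Event> groupByEvents() {
        return Arrays.asList(
            Event.start("part1"),
            Event.start("part2"),
            Event.row("part1", "agg(part1)"),
            Event.row("part2", "agg(part2)"));
    }

    /** Problematic variant: the directory is switched when a group starts. */
    static Map<String, List<String>> sinkRelyingOnStartGroup(List<Event> events) {
        Map<String, List<String>> dirs = new LinkedHashMap<>();
        String currentDir = null;
        for (Event e : events) {
            if (e.startGroup) {
                currentDir = e.partition; // already points at the next group
            } else {
                dirs.computeIfAbsent(currentDir, k -> new ArrayList<>()).add(e.payload);
            }
        }
        return dirs;
    }

    /** Variant in the spirit of the fix: the directory comes from the row being written. */
    static Map<String, List<String>> sinkUsingRowKey(List<Event> events) {
        Map<String, List<String>> dirs = new LinkedHashMap<>();
        for (Event e : events) {
            if (!e.startGroup) {
                dirs.computeIfAbsent(e.partition, k -> new ArrayList<>()).add(e.payload);
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        System.out.println("startGroup-based: " + sinkRelyingOnStartGroup(groupByEvents()));
        System.out.println("row-key-based:    " + sinkUsingRowKey(groupByEvents()));
    }
}
```

In this model the startGroup-based sink places the aggregated row for part1 under part2, while the row-key-based sink keeps each row in its own partition.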