[
https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927957#comment-16927957
]
Vineet Garg commented on HIVE-20113:
------------------------------------
[~jcamachorodriguez] Gopal should have more details on this, but from reading
the comment, it looks like ONE_TO_ONE_EDGE can only work with sorted cases if
the shuffle writer never splits its output across multiple files. I am not sure
whether there is a follow-up JIRA.
> Shuffle avoidance: Disable 1-1 edges for sorted shuffle
> --------------------------------------------------------
>
> Key: HIVE-20113
> URL: https://issues.apache.org/jira/browse/HIVE-20113
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Reporter: Gopal V
> Assignee: Gopal V
> Priority: Major
> Labels: Branch3Candidate
> Attachments: HIVE-20113.1.patch, HIVE-20113.2.patch,
> HIVE-20113.3.patch, HIVE-20113.4.patch, HIVE-20113.4.patch,
> HIVE-20113.5.patch, HIVE-20113.6.patch, HIVE-20113.7.patch, HIVE-20113.8.patch
>
>
> The sorted shuffle avoidance can run into issues when the shuffle data gets
> broken up into multiple chunks on disk.
> The 1-1 edge cannot skip the Tez final merge. There is no reason for a 1-1
> edge to have a final merge at all; it should open a single compressed file
> and write a single index entry.
> Until the shuffle issue is resolved and a lot more testing has been done, it
> is prudent to disable the optimization for sorted shuffle edges and stop
> rewriting RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD).
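
The guard described above can be pictured with a small sketch. This is an
illustration only, not the code in the attached patches: ReduceSink,
choose_edge and same_partitioning are hypothetical names, and the rule that
a sorted downstream RS alone blocks the rewrite is an assumption made to keep
the example short.

```python
from dataclasses import dataclass
from enum import Enum


class EdgeType(Enum):
    # Names loosely modeled on shuffle vs. 1-1 data movement; illustrative only.
    SCATTER_GATHER = "scatter-gather"  # regular shuffle edge
    ONE_TO_ONE = "one-to-one"          # RS(FORWARD): task i feeds task i


@dataclass(frozen=True)
class ReduceSink:
    """Hypothetical stand-in for a ReduceSink (RS) operator."""
    name: str
    sorted_output: bool


def choose_edge(parent: ReduceSink, child: ReduceSink,
                same_partitioning: bool) -> EdgeType:
    """Decide whether the shuffle after `child` may become a 1-1 edge."""
    if not same_partitioning:
        # Data has to be redistributed, so a shuffle is required anyway.
        return EdgeType.SCATTER_GATHER
    if child.sorted_output:
        # Assumption: sorted output may be split into several chunks on
        # disk, so the rewrite to a 1-1 edge is skipped for sorted edges.
        return EdgeType.SCATTER_GATHER
    return EdgeType.ONE_TO_ONE


if __name__ == "__main__":
    rs1 = ReduceSink("RS1", sorted_output=True)
    rs2 = ReduceSink("RS2", sorted_output=True)
    print(choose_edge(rs1, rs2, same_partitioning=True).value)
```

With both RS operators sorted, the sketch keeps the regular shuffle edge; only
the unsorted, identically partitioned case is turned into a 1-1 edge.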