[ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902198#comment-17902198
 ] 

Stamatis Zampetakis commented on HIVE-26986:
--------------------------------------------

Thanks for the explanations, Seonggon and Sungwoo. I was trying to understand 
the urgency and impact of this ticket to prioritize it among others. The 
seemingly "redundant" reduce sink operators are not causing wrong results, so it 
seems that this is mainly a perf improvement. I will coordinate with some 
people on the compiler team to review this and push it forward.

Other than that, a more succinct summary such as "ParallelEdgeFixer adds 
redundant reduce sink operators" seems sufficient to describe the problem. 
Abbreviations (e.g., SWO, PEF) and lots of tech details (OperatorGraph, Tez 
DAGs, etc.) can in some cases be misleading, especially for end-users 
(non-developers).

> SWO and PEF make wrong decisions (e.g. by inserting unnecessary RSs) due to 
> inconsistency between DAGs produced by OperatorGraph and Tez DAGs.
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-26986
>                 URL: https://issues.apache.org/jira/browse/HIVE-26986
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Major
>              Labels: hive-4.1.0-must, pull-request-available
>         Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is 
> submitted to Tez.
> Because of this problem, ParallelEdgeFixer reports a pair of normal edges as 
> a parallel edge.
> We observe this problem by comparing the OperatorGraph and Tez DAGs when 
> running TPC-DS query 71 on a 1TB ORC-format managed table.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
