[
https://issues.apache.org/jira/browse/HIVE-26968?focusedWorklogId=853567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853567
]
ASF GitHub Bot logged work on HIVE-26968:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Mar/23 06:24
Start Date: 29/Mar/23 06:24
Worklog Time Spent: 10m
Work Description: ngsg commented on PR #3981:
URL: https://github.com/apache/hive/pull/3981#issuecomment-1488012344
Hello @zabetak. I have added a new qfile that validates my PR. In a
nutshell, this qfile submits the same query twice while varying the value of
hive.optimize.shared.work.dppunion. I confirmed that current Hive produces
different results for the two runs, as I described in the JIRA issue
(https://issues.apache.org/jira/browse/HIVE-26968). Could you please review the
changes? Thank you.
Issue Time Tracking
-------------------
Worklog Id: (was: 853567)
Time Spent: 40m (was: 0.5h)
> SharedWorkOptimizer merges TableScan operators that have different DPP parents
> ------------------------------------------------------------------------------
>
> Key: HIVE-26968
> URL: https://issues.apache.org/jira/browse/HIVE-26968
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 4.0.0-alpha-2
> Reporter: Seonggon Namgung
> Assignee: Seonggon Namgung
> Priority: Critical
> Labels: hive-4.0.0-must, pull-request-available
> Attachments: TPC-DS Query64 OperatorGraph.pdf
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> SharedWorkOptimizer merges TableScan operators that have different DPP
> parents, which leads to the creation of a semantically wrong query plan.
> In our environment, running TPC-DS query64 on a 1TB Iceberg-format table
> returns no rows because of this problem. (The correct result has 7094 rows.)
> We use hive.optimize.shared.work=true,
> hive.optimize.shared.work.extended=true, and
> hive.optimize.shared.work.dppunion=false to reproduce the bug.