[
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16283012#comment-16283012
]
liyunzhang commented on HIVE-17486:
-----------------------------------
[~xuefuz]: I have uploaded the [design
doc|https://docs.google.com/document/d/1f4f0oMhN2vKSTCtXbnd3FBYOV02H4QflX1BbkglnC30/edit?usp=sharing].
I described the problems I ran into in the [Problem
Section|https://docs.google.com/document/d/1f4f0oMhN2vKSTCtXbnd3FBYOV02H4QflX1BbkglnC30/edit#heading=h.d0ptagvbv8k3];
please take a look if you have time. Thanks!
> Enable SharedWorkOptimizer in tez on HOS
> ----------------------------------------
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
> Issue Type: Bug
> Reporter: liyunzhang
> Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, explain.28.share.false,
> explain.28.share.true, scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be
> merged so the data is read only once. The optimization is carried out at the
> physical level. Hive on Spark caches the result of a Spark work if that work
> is used by more than one child Spark work. After SharedWorkOptimizer is
> enabled in the physical plan in HoS, identical table scans are merged into one
> table scan. The result of that table scan is then used by more than one child
> Spark work, so the cache mechanism avoids repeating the same computation.
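
For readers new to the idea, the scan merging described in the issue can be
sketched roughly as follows. This is a toy model, not Hive code: TableScan,
Work, merge_identical_scans and scans_worth_caching are hypothetical names
made up for illustration, and the table and column names are sample values.
It only assumes that two scans count as identical when they read the same
table and the same columns.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TableScan:
    """Hypothetical stand-in for a table scan operator (not a Hive class)."""
    table: str
    columns: tuple  # projected columns; equal tuples make equal scans


@dataclass
class Work:
    """Hypothetical stand-in for a unit of work that reads from one scan."""
    name: str
    scan: TableScan


def merge_identical_scans(works):
    """Group works by identical scan so each distinct scan is read once.

    Returns a dict mapping each distinct scan to the names of the works
    that consume it.
    """
    consumers = defaultdict(list)
    for work in works:
        consumers[work.scan].append(work.name)
    return dict(consumers)


def scans_worth_caching(consumers):
    """A merged scan with more than one consumer is a candidate for caching."""
    return [scan for scan, names in consumers.items() if len(names) > 1]


works = [
    Work("reducer_a", TableScan("store_sales", ("ss_item_sk", "ss_quantity"))),
    Work("reducer_b", TableScan("store_sales", ("ss_item_sk", "ss_quantity"))),
    Work("reducer_c", TableScan("item", ("i_item_sk",))),
]
merged = merge_identical_scans(works)
cached = scans_worth_caching(merged)
print(len(works), "scans before,", len(merged), "after;", len(cached), "cached")
```

In this model the two store_sales scans collapse into one entry with two
consumers, which is the case where caching the scan result pays off.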
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)