[ https://issues.apache.org/jira/browse/HIVE-16602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172260#comment-16172260 ]
Jesus Camacho Rodriguez commented on HIVE-16602:
------------------------------------------------
[~kellyzly], this has been tested and it makes a huge difference, especially for
I/O-intensive queries.
bq. ...it appears multiple times in the query.
What do you mean? When you use "explain plan", you should see that the TS
(TableScan) operator is reused for the same table across different tasks.
Otherwise the optimization might not have been triggered. You can see multiple
examples in the commit for this issue.
> Implement shared scans with Tez
> -------------------------------
>
> Key: HIVE-16602
> URL: https://issues.apache.org/jira/browse/HIVE-16602
> Project: Hive
> Issue Type: New Feature
> Components: Physical Optimizer
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16602.01.patch, HIVE-16602.02.patch,
> HIVE-16602.03.patch, HIVE-16602.04.patch, HIVE-16602.patch
>
>
> Given a query plan, the goal is to identify scans on input tables that can be
> merged so the data is read only once. Optimization will be carried out at the
> physical level.
> In the longer term, identification of equivalent expressions and
> reutilization of intermediary results should be done at the logical layer via
> Spool operator.
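
The scan-merging goal described above can be illustrated with a toy model. The
sketch below is not Hive's implementation: the TableScan class, the
merge_shared_scans function, and the table and column names are all
hypothetical, and it assumes that two scans are mergeable whenever they read
the same table. A real optimizer would presumably also have to check that
filters, pruned partitions, and other scan properties are compatible.

```python
from dataclasses import dataclass, field


@dataclass
class TableScan:
    """Hypothetical stand-in for a table scan operator in a query plan."""
    table: str
    columns: frozenset
    consumers: list = field(default_factory=list)


def merge_shared_scans(scans):
    """Merge scans over the same table so each table is read only once.

    Simplifying assumption: scans are mergeable if they target the same
    table. The merged scan reads the union of the requested columns and
    feeds every consumer of the original scans.
    """
    by_table = {}
    for scan in scans:
        merged = by_table.get(scan.table)
        if merged is None:
            # First scan seen for this table: copy it as the shared scan.
            by_table[scan.table] = TableScan(
                scan.table, frozenset(scan.columns), list(scan.consumers)
            )
        else:
            # Later scan of the same table: fold it into the shared scan.
            merged.columns = merged.columns | scan.columns
            merged.consumers.extend(scan.consumers)
    return list(by_table.values())


# Toy plan: store_sales is scanned twice by two different joins.
plan = [
    TableScan("store_sales", frozenset({"ss_item_sk", "ss_quantity"}), ["join_1"]),
    TableScan("item", frozenset({"i_item_sk"}), ["join_1"]),
    TableScan("store_sales", frozenset({"ss_item_sk", "ss_net_paid"}), ["join_2"]),
]
merged_plan = merge_shared_scans(plan)
for scan in merged_plan:
    print(scan.table, sorted(scan.columns), scan.consumers)
```

In this toy plan the two store_sales scans collapse into one that feeds both
joins, which mirrors what a reused TS for the same table would look like in
the explain output.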
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)