[jira] [Updated] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

liyunzhang_intel (JIRA) Wed, 27 Sep 2017 19:18:52 -0700

     [ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


liyunzhang_intel updated HIVE-17486:
------------------------------------
    Description: 
HIVE-16602 implemented shared scans with Tez.

Given a query plan, the goal is to identify scans on input tables that can be 
merged so the data is read only once. The optimization is carried out at the 
physical level. Hive on Spark caches the result of a Spark work if that work 
is used by more than 1 child Spark work. Once SharedWorkOptimizer is enabled 
in the physical plan in HoS, identical table scans are merged into 1 table 
scan, whose result is then used by more than 1 child Spark work. Because of 
the cache mechanism, the same computation does not need to be repeated.

  was:
in HIVE-16602, Implement shared scans with Tez.

Given a query plan, the goal is to identify scans on input tables that can be 
merged so the data is read only once. Optimization will be carried out at the 
physical level.  In Hive on Spark, it caches the result ofsSpark work if the 
spark work is used by more than 1 child spark work. After sharedWorkOptimizer 
is enabled in physical plan in HoS, the identical table scans are merged to 1 
table scan. This result of table scan will be used by more 1 child spark work. 
Thus we need not do the same computation because of cache mechanism.


> Enable SharedWorkOptimizer in tez on HOS
> ----------------------------------------
>
>                 Key: HIVE-17486
>                 URL: https://issues.apache.org/jira/browse/HIVE-17486
>             Project: Hive
>          Issue Type: Bug
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level. Hive on Spark caches the result of a Spark work if that work 
> is used by more than 1 child Spark work. Once SharedWorkOptimizer is enabled 
> in the physical plan in HoS, identical table scans are merged into 1 table 
> scan, whose result is then used by more than 1 child Spark work. Because of 
> the cache mechanism, the same computation does not need to be repeated.
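
The merge-then-cache idea in the description can be illustrated with a minimal
sketch. This is not Hive's SharedWorkOptimizer code: the TableScan class, the
merge_identical_scans and should_cache functions, and the rule that two scans
are "identical" when they read the same table and columns are all assumptions
made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TableScan:
    """Hypothetical stand-in for a table scan operator in a query plan."""
    table: str
    columns: tuple
    consumers: list = field(default_factory=list)  # names of child works


def merge_identical_scans(scans):
    """Merge scans that read the same table and columns into one scan.

    Assumption: "identical" means equal (table, columns). The merged scan
    keeps every consumer of the scans it replaces.
    """
    merged = {}
    for scan in scans:
        key = (scan.table, scan.columns)
        if key not in merged:
            merged[key] = TableScan(scan.table, scan.columns, [])
        merged[key].consumers.extend(scan.consumers)
    return list(merged.values())


def should_cache(scan):
    """Cache a scan result only when more than one child work reads it."""
    return len(scan.consumers) > 1
```

In this sketch, two scans of the same table feeding different child works
collapse into one scan with two consumers, which is the case where caching
the result avoids reading the data twice.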



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
