[ https://issues.apache.org/jira/browse/HIVE-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ravi Teja Chilukuri updated HIVE-16414:
---------------------------------------
Summary: [Hive on Tez] Hive Union queries resource efficiency less on Tez
than Mapreduce (was: [Hive on Tez] Union queries resources efficiency less on
Tez than Mapreduce)
> [Hive on Tez] Hive Union queries resource efficiency less on Tez than
> Mapreduce
> -------------------------------------------------------------------------------
>
> Key: HIVE-16414
> URL: https://issues.apache.org/jira/browse/HIVE-16414
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Affects Versions: 2.1.0
> Reporter: Ravi Teja Chilukuri
>
> When a Hive union query whose subqueries all read the same table is run on
> MapReduce and on Tez, MapReduce reads the table only once, no matter how many
> subqueries read it, but Tez reads the same table multiple times, once per
> subquery, with each read becoming a separate vertex.
> If the table takes X mappers to read, Tez runs kX map tasks, where k is the
> number of subqueries reading from that table, while MapReduce runs X mappers
> regardless of how many subqueries are present.
> For such union queries, we need to fall back to MR instead of Tez.
> *Query:*
> http://pastebin.com/t6n91u6a
> *Tez explain plan:*
> http://pastebin.com/aWwVxhii
> *MR explain plan:*
> http://pastebin.com/iDbWwtKR
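
The task counts described in the issue can be illustrated with a little arithmetic. This is only a sketch of the reporter's claim (X mappers on MR versus kX map tasks on Tez), not a measurement of either engine; the function name `map_task_counts` and the sample values X = 100 and k = 3 are assumptions made up for this example.

```python
def map_task_counts(mappers_per_scan: int, subqueries: int) -> dict:
    """Model the map-task counts described in the report.

    mappers_per_scan: X, the mappers needed to read the table once.
    subqueries: k, the number of union branches reading that table.
    """
    if mappers_per_scan < 0 or subqueries < 1:
        raise ValueError("need mappers_per_scan >= 0 and subqueries >= 1")
    return {
        # Per the report, MR scans the shared table once for all branches.
        "mr": mappers_per_scan,
        # Per the report, Tez scans the table once per branch (one vertex each).
        "tez": subqueries * mappers_per_scan,
    }


# Hypothetical values: a table read by 100 mappers, unioned across 3 subqueries.
counts = map_task_counts(mappers_per_scan=100, subqueries=3)
print(counts)
```

With these assumed numbers, Tez would schedule three times as many map tasks as MR for the same table, which is the resource overhead the report is about.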