[jira] [Updated] (HIVE-8118) Support work that have multiple child works to work around SPARK [Spark Branch]

Xuefu Zhang (JIRA) Sat, 11 Oct 2014 20:07:16 -0700

     [ 
https://issues.apache.org/jira/browse/HIVE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Xuefu Zhang updated HIVE-8118:
------------------------------
    Summary: Support work that have multiple child works to work around SPARK  
[Spark Branch]  (was: SparkMapRecorderHandler and SparkReduceRecordHandler 
should be initialized with multiple result collectors [Spark Branch])

> Support work that have multiple child works to work around SPARK  [Spark 
> Branch]
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-8118
>                 URL: https://issues.apache.org/jira/browse/HIVE-8118
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Chao
>              Labels: Spark-M1
>         Attachments: HIVE-8118.pdf
>
>
> In the current implementation, both SparkMapRecordHandler and 
> SparkReduceRecordHandler take only one result collector, which means that 
> the corresponding map or reduce task can have only one child. It's very 
> common in multi-insert queries for a map/reduce task to have more than one 
> child. A query like the following has two map tasks as parents:
> {code}
> select name, sum(value) from dec group by name union all select name, value 
> from dec order by name
> {code}
> It's possible that in the future an optimization may be implemented so that 
> a map work is followed by two reduce works and then connected to a union work.
> Thus, we should treat this as the general case. Tez currently provides a 
> collector for each child operator in the map-side or reduce-side operator 
> tree. We can take Tez as a reference.
> This is likely a big change, and subtasks are possible. 
> With this, we can have a simpler and cleaner multi-insert implementation. This 
> is also the problem observed in HIVE-7731.
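
The per-child collector idea described above can be sketched roughly as
follows. This is only an illustration in plain Java, not Hive or Tez code:
Collector, ListCollector, MultiChildHandler, addCollector, and emit are all
hypothetical names, and a real handler would presumably route rows from the
operator tree rather than by an explicit child name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiCollectorSketch {
    /** Hypothetical stand-in for a result collector (not a Hive interface). */
    interface Collector {
        void collect(String key, String value);
    }

    /** Buffers records in memory so the routing can be inspected. */
    static class ListCollector implements Collector {
        final List<String> records = new ArrayList<>();

        @Override
        public void collect(String key, String value) {
            records.add(key + "=" + value);
        }
    }

    /** Hypothetical handler that holds one collector per child work. */
    static class MultiChildHandler {
        private final Map<String, Collector> collectors = new LinkedHashMap<>();

        void addCollector(String childName, Collector collector) {
            collectors.put(childName, collector);
        }

        /** Sends a record to the collector registered for the given child. */
        void emit(String childName, String key, String value) {
            Collector collector = collectors.get(childName);
            if (collector == null) {
                throw new IllegalArgumentException("No collector for child: " + childName);
            }
            collector.collect(key, value);
        }
    }

    public static void main(String[] args) {
        ListCollector groupBy = new ListCollector();
        ListCollector orderBy = new ListCollector();

        // One map work feeding two child reduce works, as in the union query.
        MultiChildHandler handler = new MultiChildHandler();
        handler.addCollector("reduce-groupby", groupBy);
        handler.addCollector("reduce-orderby", orderBy);

        handler.emit("reduce-groupby", "alice", "3");
        handler.emit("reduce-orderby", "alice", "3");
        handler.emit("reduce-orderby", "bob", "5");

        System.out.println(groupBy.records);
        System.out.println(orderBy.records);
    }
}
```

Keying collectors by child keeps a handler with a single child as the
degenerate case of the same code path, which is the sense in which the
multi-child case can be treated as the general one.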



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
