[ https://issues.apache.org/jira/browse/HIVE-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xuefu Zhang updated HIVE-8118:
------------------------------
    Labels: Spark-M1  (was: )

> SparkMapRecorderHandler and SparkReduceRecordHandler should be initialized
> with multiple result collectors[Spark Branch]
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8118
>                 URL: https://issues.apache.org/jira/browse/HIVE-8118
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Xuefu Zhang
>              Labels: Spark-M1
>
> In the current implementation, both SparkMapRecordHandler and
> SparkReduceRecordHandler take only one result collector, which means that
> the corresponding map or reduce task can have only one child. It's very
> common in multi-insert queries for a map/reduce task to have more than one
> child. A query like the following has two map tasks as parents:
> {code}
> select name, sum(value) from dec group by name union all select name, value
> from dec order by name
> {code}
> It's possible that a future optimization will let a map work be followed by
> two reduce works that are then connected to a union work. Thus, we should
> accommodate this. Tez currently provides a collector for each child operator
> in the map-side or reduce-side operator tree.
> This is likely a big change, but with it we can have a simpler and cleaner
> multi-insert implementation.
> This is also the problem observed in HIVE-7731.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
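
To make the proposal in the description concrete, here is a minimal sketch of a handler that is initialized with one result collector per child instead of a single collector. Everything in it is hypothetical and for illustration only: ResultCollector, ListCollector, MultiCollectorHandler, and the child names are stand-ins, not Hive, Spark, or Tez classes, and the real handlers may route records quite differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a result collector; not an actual Hive or Hadoop interface.
interface ResultCollector {
    void collect(String key, String value);
}

// Keeps records in memory so the routing can be inspected.
class ListCollector implements ResultCollector {
    final List<String> records = new ArrayList<>();

    @Override
    public void collect(String key, String value) {
        records.add(key + "=" + value);
    }
}

// Hypothetical handler that holds one collector per child work (assumption:
// children are identified by a simple name).
class MultiCollectorHandler {
    private final Map<String, ResultCollector> collectors;

    MultiCollectorHandler(Map<String, ResultCollector> collectors) {
        // Defensive copy so later changes to the caller's map do not affect routing.
        this.collectors = new LinkedHashMap<>(collectors);
    }

    // Route a record to the collector registered for the given child.
    void emit(String child, String key, String value) {
        ResultCollector collector = collectors.get(child);
        if (collector == null) {
            throw new IllegalArgumentException("No collector for child: " + child);
        }
        collector.collect(key, value);
    }
}

class MultiCollectorSketch {
    public static void main(String[] args) {
        ListCollector groupBy = new ListCollector();
        ListCollector orderBy = new ListCollector();

        Map<String, ResultCollector> byChild = new LinkedHashMap<>();
        byChild.put("reduce-groupby", groupBy);   // hypothetical child names
        byChild.put("reduce-orderby", orderBy);

        MultiCollectorHandler handler = new MultiCollectorHandler(byChild);
        handler.emit("reduce-groupby", "alice", "3");
        handler.emit("reduce-orderby", "alice", "3");
        handler.emit("reduce-orderby", "bob", "5");

        System.out.println("group-by child: " + groupBy.records);
        System.out.println("order-by child: " + orderBy.records);
    }
}
```

With a single collector, both children would have to share one output; keying collectors by child is one way a map work could feed two reduce works, as the description anticipates.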