[
https://issues.apache.org/jira/browse/HIVE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721385#action_12721385
]
JArod Wen commented on HIVE-549:
--------------------------------
Intuitively in the query DAG the sub-branches of a given node can be
paralleled. Since the DAG can be generated in several ways, there must be some
of them which may be much more efficient than others considering the possible
parallel tasks, right? So some changes in the optimization when compiling will
be cool. Anyway, I agree with Namit about the first step on the scheduler.
> Parallel Execution Mechanism
> ----------------------------
>
> Key: HIVE-549
> URL: https://issues.apache.org/jira/browse/HIVE-549
> Project: Hadoop Hive
> Issue Type: Wish
> Components: Query Processor
> Reporter: Adam Kramer
>
> In a massively parallel database system, it would be awesome to also
> parallelize some of the mapreduce phases that our data needs to go through.
> One example that just occurred to me is UNION ALL: when you union two SELECT
> statements, effectively you could run those statements in parallel. There's
> no situation (that I can think of, but I don't have a formal proof) in which
> the left statement would rely on the right statement, or vice versa. So, they
> could be run at the same time...and perhaps they should be. Or, perhaps there
> should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.