[jira] Commented: (HIVE-549) UNION ALL statements should be run in parallel

Namit Jain (JIRA) Mon, 08 Jun 2009 10:08:30 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717327#action_12717327
 ]


Namit Jain commented on HIVE-549:
---------------------------------

This is not specific to UNION ALL - Hive maintains the complete task dependency 
tree and can execute tasks in parallel.
It will help latency substantially. It should not help the overall cluster usage, 
but will be especially useful for benchmarks.

I can't think of any reason not to parallelize. I don't see any reason to change 
the plan - while walking the task tree,
execute all tasks whose dependencies have been executed.
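
To make the idea concrete, here is a minimal sketch of that walk, written in 
Python for brevity. It is not Hive's driver code: run_task_graph, the task 
names, and the dict-based graph are all hypothetical, and a real implementation 
would also have to handle job submission, failures, and cleanup.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_task_graph(tasks, deps, max_workers=4):
    """Run each task as soon as all of its dependencies have finished.

    tasks: dict of task name -> zero-argument callable (hypothetical shape).
    deps:  dict of task name -> set of task names it depends on.
    Returns the task names in the order they completed.
    """
    remaining = {name: set(deps.get(name, ())) for name in tasks}
    running = {}   # future -> task name
    finished = []

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or running:
            # Launch every task whose dependencies have all been executed.
            ready = [name for name, d in remaining.items() if not d]
            for name in ready:
                del remaining[name]
                running[pool.submit(tasks[name])] = name

            if not running:
                raise ValueError("cycle or unknown dependency in task graph")

            # Block until at least one running task completes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                future.result()  # re-raise any exception from the task
                finished.append(name)
                for d in remaining.values():
                    d.discard(name)
    return finished


if __name__ == "__main__":
    # Assumed shape of a UNION ALL plan: two independent selects feed one union.
    plan = {name: (lambda: None) for name in ("left_select", "right_select", "union")}
    order = run_task_graph(plan, {"union": {"left_select", "right_select"}})
    print(order)  # the two selects finish in either order; "union" is last
```

Note that the plan itself is untouched: the two selects run concurrently only 
because neither lists the other as a dependency.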



> UNION ALL statements should be run in parallel
> ----------------------------------------------
>
>                 Key: HIVE-549
>                 URL: https://issues.apache.org/jira/browse/HIVE-549
>             Project: Hadoop Hive
>          Issue Type: Wish
>          Components: Query Processor
>            Reporter: Adam Kramer
>
> In a massively parallel database system, it would be awesome to also 
> parallelize some of the mapreduce phases that our data needs to go through.
> One example that just occurred to me is UNION ALL: when you union two SELECT 
> statements, effectively you could run those statements in parallel. There's 
> no situation (that I can think of, but I don't have a formal proof) in which 
> the left statement would rely on the right statement, or vice versa. So, they 
> could be run at the same time...and perhaps they should be. Or, perhaps there 
> should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?

