[ https://issues.apache.org/jira/browse/HIVE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chaitanya Mishra updated HIVE-549:
----------------------------------
Attachment: Hive-549.patch
Attaching a patch for this. I hope it's in the right format.
Summary of changes:
- Created TaskRunner.java, which launches new tasks as threads.
- Created TaskResult.java, which encapsulates the return value of the thread.
- Modified the execute() function in ql/Driver.java to launch tasks as soon as
they are runnable.
- Also, modified the Utilities.gWork variable to be ThreadLocal, so that the
state of multiple threads is kept independently.
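
To illustrate the ThreadLocal point above, here is a minimal sketch (not the
code in the patch). PlanHolder and ThreadLocalDemo are hypothetical names
standing in for Utilities and its gWork field; the String payload is just a
placeholder for whatever per-task state is stored there.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a class holding per-task state in a static field.
class PlanHolder {
    // With ThreadLocal, each thread gets its own slot, initially null.
    private static final ThreadLocal<String> currentWork = new ThreadLocal<>();

    static void setWork(String work) { currentWork.set(work); }
    static String getWork() { return currentWork.get(); }
    static void clear() { currentWork.remove(); }
}

class ThreadLocalDemo {
    // Starts two threads; each stores a value and reads back its own copy.
    static List<String> run() throws InterruptedException {
        final List<String> seen = Collections.synchronizedList(new ArrayList<String>());
        Runnable body = () -> {
            String name = Thread.currentThread().getName();
            PlanHolder.setWork("plan-for-" + name);
            // Reads this thread's value, regardless of what the other thread set.
            seen.add(name + "=" + PlanHolder.getWork());
            PlanHolder.clear();
        };
        Thread a = new Thread(body, "t1");
        Thread b = new Thread(body, "t2");
        a.start();
        b.start();
        a.join();
        b.join();
        // Copy and sort so the result does not depend on thread timing.
        List<String> result = new ArrayList<>(seen);
        Collections.sort(result);
        return result;
    }
}
```

With a plain static field instead of a ThreadLocal, the second setWork() call
could overwrite the first before it is read, which is the interference the
change is meant to avoid.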
The end result of this patch is that a task (which is part of a query plan) is
launched as soon as it is runnable, instead of waiting in a queue.
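
A rough sketch of that scheduling idea, under assumptions of my own, may help
reviewers who have not opened the patch. SketchTask, SketchResult and
SketchDriver are hypothetical names for illustration only; they are not the
TaskRunner, TaskResult or Driver classes in the attachment, and the real code
may be structured differently.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical task node in a query plan: runs only after all parents finish.
class SketchTask {
    final String name;
    final List<SketchTask> parents = new ArrayList<>();
    final List<SketchTask> children = new ArrayList<>();
    SketchTask(String name) { this.name = name; }
    void dependsOn(SketchTask parent) { parents.add(parent); parent.children.add(this); }
    int execute() { return 0; } // placeholder for real work; 0 means success
}

// Hypothetical holder for what a finished task thread reports back.
class SketchResult {
    final SketchTask task;
    final int exitCode;
    SketchResult(SketchTask task, int exitCode) { this.task = task; this.exitCode = exitCode; }
}

class SketchDriver {
    // Runs the plan and returns task names in the order they finished.
    static List<String> execute(Collection<SketchTask> plan) throws InterruptedException {
        BlockingQueue<SketchResult> finished = new LinkedBlockingQueue<>();
        Set<SketchTask> launched = new HashSet<>();
        Set<SketchTask> done = new HashSet<>();
        List<String> order = new ArrayList<>();
        int running = 0;

        // Root tasks have no parents, so all of them start immediately.
        for (SketchTask t : plan) {
            if (t.parents.isEmpty() && launched.add(t)) {
                launch(t, finished);
                running++;
            }
        }
        while (running > 0) {
            SketchResult r = finished.take(); // block until some task completes
            running--;
            if (r.exitCode != 0) {
                throw new IllegalStateException("task failed: " + r.task.name);
            }
            done.add(r.task);
            order.add(r.task.name);
            // A child becomes runnable once every one of its parents is done.
            for (SketchTask c : r.task.children) {
                if (done.containsAll(c.parents) && launched.add(c)) {
                    launch(c, finished);
                    running++;
                }
            }
        }
        return order;
    }

    // One thread per task; the thread posts its result to the shared queue.
    private static void launch(final SketchTask t, final BlockingQueue<SketchResult> out) {
        new Thread(() -> {
            int code;
            try {
                code = t.execute();
            } catch (RuntimeException e) {
                code = 1; // report failure instead of leaving the driver waiting
            }
            out.add(new SketchResult(t, code));
        }, t.name).start();
    }
}
```

For the UNION ALL case in the issue description, the two SELECT tasks have no
parents and so start together, while the union task starts only after both
have reported success.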
Comments?
> Parallel Execution Mechanism
> ----------------------------
>
> Key: HIVE-549
> URL: https://issues.apache.org/jira/browse/HIVE-549
> Project: Hadoop Hive
> Issue Type: Wish
> Components: Query Processor
> Reporter: Adam Kramer
> Assignee: Chaitanya Mishra
> Attachments: Hive-549.patch
>
>
> In a massively parallel database system, it would be awesome to also
> parallelize some of the mapreduce phases that our data needs to go through.
> One example that just occurred to me is UNION ALL: when you union two SELECT
> statements, effectively you could run those statements in parallel. There's
> no situation (that I can think of, but I don't have a formal proof) in which
> the left statement would rely on the right statement, or vice versa. So, they
> could be run at the same time...and perhaps they should be. Or, perhaps there
> should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?