[
https://issues.apache.org/jira/browse/HIVE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769413#action_12769413
]
Chaitanya Mishra commented on HIVE-549:
---------------------------------------
Zheng and I had an email discussion about this. To launch multiple tasks from
the same driver, we'll need to run each task in a separate thread.
The simplest solution is to have Task implement the Runnable interface, but
this might affect other components. I'm not completely sure about this, since
an object that implements Runnable can still be called sequentially.
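
To illustrate the point about sequential calls, here is a minimal sketch. The
names RunnableTask, EchoTask, execute() and getExitCode() are hypothetical
stand-ins, not Hive's actual Task API. It only shows that an object which
implements Runnable can be invoked directly on the calling thread or handed
to a Thread, so existing sequential callers would not have to change.

```java
// Hypothetical stand-in for Hive's Task; the real class has a different shape.
abstract class RunnableTask implements Runnable {
    // -1 until the task has run; volatile so other threads see the update.
    private volatile int exitCode = -1;

    // Subclasses do the actual work and return an exit code (0 = success).
    protected abstract int execute();

    @Override
    public void run() {
        exitCode = execute();
    }

    int getExitCode() {
        return exitCode;
    }
}

// Trivial task used only for the demonstration.
class EchoTask extends RunnableTask {
    @Override
    protected int execute() {
        return 0;
    }
}

class RunnableTaskDemo {
    public static void main(String[] args) throws InterruptedException {
        // Sequential use: run() is an ordinary method call on this thread.
        RunnableTask sequential = new EchoTask();
        sequential.run();

        // Parallel use: the same kind of object is handed to a Thread.
        RunnableTask parallel = new EchoTask();
        Thread worker = new Thread(parallel);
        worker.start();
        worker.join(); // wait for the worker before reading the result

        System.out.println("sequential exit code: " + sequential.getExitCode());
        System.out.println("parallel exit code: " + parallel.getExitCode());
    }
}
```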
Zheng instead proposes adding a new class, TaskRunner, with these functions:
  run(): asynchronous call
  start()
  wait(long timeoutMilli)
  stop()
Most likely, if we get TaskRunner to extend Thread, we get these functions; if
not, we can implement Runnable.
So, do we make Task implement Runnable, or add a layer of indirection in the
form of TaskRunner?
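
For comparison, the TaskRunner option might look roughly like the sketch
below. This is an assumption-laden illustration, not the proposed patch:
SimpleTask, waitFor(), requestStop() and getExitCode() are hypothetical names.
Two details are worth noting. Object.wait(long) is final in Java, so a method
with exactly that signature cannot be redefined; the sketch uses waitFor(),
built on Thread.join(long). Thread.stop() is also final and deprecated, so the
sketch requests cancellation through interrupt() instead.

```java
// Hypothetical minimal task contract; Hive's real Task class differs.
interface SimpleTask {
    int execute() throws InterruptedException; // 0 means success
}

// Wraps a task in its own thread so the driver can launch several at once.
class TaskRunner extends Thread {
    private final SimpleTask task;
    private volatile int exitCode = -1; // -1 until the task finishes

    TaskRunner(SimpleTask task) {
        this.task = task;
    }

    // Invoked on the new thread by start(); calling run() directly
    // would execute the task sequentially on the caller's thread.
    @Override
    public void run() {
        try {
            exitCode = task.execute();
        } catch (InterruptedException e) {
            exitCode = 1; // assumption: an interrupted task counts as failed
        }
    }

    // Waits up to timeoutMilli for the task (0 waits indefinitely, as with
    // Thread.join). Returns true only if the thread has terminated.
    boolean waitFor(long timeoutMilli) throws InterruptedException {
        join(timeoutMilli);
        return getState() == Thread.State.TERMINATED;
    }

    // Cooperative cancellation: the task must respond to interruption.
    void requestStop() {
        interrupt();
    }

    int getExitCode() {
        return exitCode;
    }
}

class TaskRunnerDemo {
    public static void main(String[] args) throws InterruptedException {
        TaskRunner left = new TaskRunner(() -> 0);
        TaskRunner right = new TaskRunner(() -> {
            Thread.sleep(60_000); // simulates a long-running task
            return 0;
        });

        // Both tasks are launched from the same driver thread.
        left.start();
        right.start();

        System.out.println("left finished: " + left.waitFor(1_000));
        right.requestStop();
        System.out.println("right finished: " + right.waitFor(1_000));
    }
}
```

With this layer of indirection, Task itself stays unchanged and only the
driver needs to know about threads.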
> Parallel Execution Mechanism
> ----------------------------
>
> Key: HIVE-549
> URL: https://issues.apache.org/jira/browse/HIVE-549
> Project: Hadoop Hive
> Issue Type: Wish
> Components: Query Processor
> Reporter: Adam Kramer
> Assignee: Chaitanya Mishra
>
> In a massively parallel database system, it would be awesome to also
> parallelize some of the mapreduce phases that our data needs to go through.
> One example that just occurred to me is UNION ALL: when you union two SELECT
> statements, effectively you could run those statements in parallel. There's
> no situation (that I can think of, but I don't have a formal proof) in which
> the left statement would rely on the right statement, or vice versa. So, they
> could be run at the same time...and perhaps they should be. Or, perhaps there
> should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.