[
https://issues.apache.org/jira/browse/HIVE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781643#action_12781643
]
Chaitanya Mishra commented on HIVE-549:
---------------------------------------
This patch differs from the previous patch in the following ways:
(a) mapredWork is no longer stored as a thread-local variable. Instead we
maintain a map from jobname -> mapredWork. This was essential since Hive can
launch tasks in localjobrunner mode. This is pretty much like Zheng's original
suggestion.
(b) There is additional code to ensure that a job always has a randomly
generated name, to ensure that the code doesn't break.
(c) There is also code to ensure that the distributed cache has a unique handle
for plan information. Originally it was always stored as HIVE_PLAN.
(d) SessionState was a thread-local variable, so new code has been added to
initialize SessionState for new threads.
(e) Only map-reduce tasks are launched using new threads. Non map-reduce tasks
are launched within the same driver thread. This is to ensure that simple tasks
like describe function don't pay the cost of launching threads and then
sleeping and polling for them.
(f) At most maxthreads=8 threads are launched.
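
For readers following along, points (a), (b), (e) and (f) can be sketched
roughly as below. This is only an illustration and not the code in the patch:
the class ParallelLaunchSketch, its methods, and the use of a String as a
stand-in for mapredWork are all hypothetical, and a fixed-size thread pool is
assumed here as one simple way to cap the number of threads at 8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from the Hive code base.
final class ParallelLaunchSketch {
    // (f) Upper bound on concurrently running task threads.
    static final int MAX_THREADS = 8;

    // (a) Plans are keyed by job name instead of living in a ThreadLocal.
    // A String stands in for the real plan object (assumption).
    private static final Map<String, String> PLANS = new ConcurrentHashMap<>();

    // (b) Every job gets a randomly generated name so map keys never clash.
    static String registerPlan(String plan) {
        String jobName = "job-" + UUID.randomUUID();
        PLANS.put(jobName, plan);
        return jobName;
    }

    static String lookupPlan(String jobName) {
        return PLANS.get(jobName);
    }

    // Minimal task description: a plan plus a flag for map-reduce work.
    static final class Task {
        final String plan;
        final boolean mapReduce;

        Task(String plan, boolean mapReduce) {
            this.plan = plan;
            this.mapReduce = mapReduce;
        }
    }

    // Looks the plan up by job name, removes it from the map, and reports
    // which thread did the work.
    private static String execute(String jobName) {
        String plan = PLANS.remove(jobName);
        return plan + "@" + Thread.currentThread().getName();
    }

    // (e) Map-reduce tasks go to the pool; everything else runs inline on
    // the calling (driver) thread. Results are returned in input order.
    static List<String> runAll(List<Task> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(MAX_THREADS);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (Task task : tasks) {
                String jobName = registerPlan(task.plan);
                if (task.mapReduce) {
                    Callable<String> job = () -> execute(jobName);
                    pending.add(pool.submit(job));
                } else {
                    pending.add(CompletableFuture.completedFuture(execute(jobName)));
                }
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Task> tasks = Arrays.asList(
                new Task("describe function", false),
                new Task("union-left", true),
                new Task("union-right", true));
        for (String line : runAll(tasks)) {
            System.out.println(line);
        }
    }
}
```

In this sketch the non map-reduce task reports the driver thread's name, while
the two map-reduce tasks report pool thread names, which mirrors the split
described in (e).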
> Parallel Execution Mechanism
> ----------------------------
>
> Key: HIVE-549
> URL: https://issues.apache.org/jira/browse/HIVE-549
> Project: Hadoop Hive
> Issue Type: Wish
> Components: Query Processor
> Reporter: Adam Kramer
> Assignee: Chaitanya Mishra
> Attachments: HIVE-549-v4.patch, HIVE-549-v5.patch
>
>
> In a massively parallel database system, it would be awesome to also
> parallelize some of the mapreduce phases that our data needs to go through.
> One example that just occurred to me is UNION ALL: when you union two SELECT
> statements, effectively you could run those statements in parallel. There's
> no situation (that I can think of, but I don't have a formal proof) in which
> the left statement would rely on the right statement, or vice versa. So, they
> could be run at the same time...and perhaps they should be. Or, perhaps there
> should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?