Thanks for clarifying. I'll look into it too and see if I can find anything.
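
In the meantime, here is a rough sketch of the multi-threaded mapper idea you
describe below, as I understand it. It is plain Java, not Hive or Hadoop code:
the class name ThreadedMapSketch, the mapRecord helper, and the Thread.sleep
call standing in for blocking IO are all made up for illustration. Hadoop does
ship a MultithreadedMapper class in org.apache.hadoop.mapreduce.lib.map, but I
have not checked whether Hive uses it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadedMapSketch {

    // Hypothetical per-record map step: blocks on (simulated) IO, then does
    // a trivial amount of CPU work.
    static String mapRecord(String record, long ioDelayMillis) throws InterruptedException {
        Thread.sleep(ioDelayMillis);   // stand-in for a blocking read or lookup
        return record.toUpperCase();
    }

    // Applies mapRecord to every record on a fixed-size thread pool and
    // returns the results in input order.
    static List<String> mapAll(List<String> records, int threads, long ioDelayMillis)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String r : records) {
                pending.add(pool.submit(() -> mapRecord(r, ioDelayMillis)));
            }
            List<String> out = new ArrayList<>();
            for (Future<String> f : pending) {
                out.add(f.get());      // waits for that record to finish
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> records = List.of("a", "b", "c", "d", "e", "f", "g", "h");
        // Compare one thread against four with the same simulated IO delay.
        for (int threads : new int[] {1, 4}) {
            long start = System.nanoTime();
            List<String> out = mapAll(records, threads, 50);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + out + " in about " + ms + " ms");
        }
    }
}
```

With a nonzero delay the four-thread run should finish sooner, because the
waits overlap. If the delay is set to 0 the extra threads have nothing to hide
and only add scheduling overhead, which I think is the trade-off you mention.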

-Jayanth

On Thu, Jun 21, 2012 at 10:47 PM, Jerome Banks <jer...@klout.com> wrote:

> set hive.exec.parallel=true;
>
> This lets Hive run the jobs of a query in parallel when they don't depend
> on one another.
>
> As for multi-threading in the actual job itself, I don't think so, but I'm
> not sure. The query planner will merge steps together to try to minimize
> the number of MR jobs needed to run a query, but I think those steps are
> chained together in a single thread, in both the mapper and the reducer.
>
> When I was at Quantcast, we had some multi-threading in the mappers and
> reducers to try to increase throughput by utilizing the CPU when the job
> would otherwise be blocked on IO. This helps if your IO is very slow, but
> once IO is no longer the bottleneck, you spend a lot of time
> context-switching, and it is no longer efficient.
>
> Interesting question, I'll look into it some more. Let me know if you find
> out anything.
>
> -- jerome
>
> On Thu, Jun 21, 2012 at 1:16 AM, Jayanth Muthya <jayanthmut...@gmail.com>
> wrote:
>
> > Hi,
> > I was looking into some of the source code for Hive and had a few
> > questions regarding parallelism in Hive. Can a map task in Hive exploit
> > parallelism and run multiple threads? If it can, does it do so by
> > default, or does a user have to configure the settings?
> > This question seems really basic; I just started looking into
> > Hadoop/Hive.
> > Thanks in advance!
> >
> > -Jay
> >
>
