Thanks for clarifying. I'll look into it too and see if I can find anything.
-Jayanth

On Thu, Jun 21, 2012 at 10:47 PM, Jerome Banks <jer...@klout.com> wrote:

> set hive.exec.parallel=true;
>
> This will run Hive jobs in parallel, if they are able to do so.
>
> As for multi-threading in the actual job itself, I don't think so, but I'm
> not sure. The query planner will merge steps together, in order to try to
> minimize the number of MR jobs needed to run a query, but I think those are
> chained together in a single thread, both on the mapper and the reducer.
>
> When I was at Quantcast, we had some multi-threading in the mappers and
> reducers, to try to increase throughput, by utilizing the CPU when the job
> would otherwise be blocked on IO. This helps if your IO is very slow,
> but if the IO is no longer a bottleneck, then you spend a lot of time
> context-switching, and it is no longer efficient.
>
> Interesting question, I'll look into it some more. Let me know if you find
> out anything.
>
> -- jerome
>
> On Thu, Jun 21, 2012 at 1:16 AM, Jayanth Muthya <jayanthmut...@gmail.com>
> wrote:
>
> > Hi,
> > I was looking into some of the source code for Hive, and had a few
> > questions regarding parallelism in Hive. Can a map task in
> > Hive exploit parallelism and run multiple threads? If it can do that, does
> > it do it by default, or does a user have to configure the settings?
> > This question seems really basic; I just started looking into Hadoop/Hive.
> > Thanks in advance!
> >
> > -Jay
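
To make the IO-overlap point in the quoted reply concrete, here is a minimal sketch in plain Python. It is not Hive or Hadoop code, and it does not show how any real mapper was implemented. The names `slow_read`, `process`, `run_serial`, and `run_threaded` are hypothetical, and the blocking read is simulated with a sleep, so treat the numbers as illustrative only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_read(record_id, delay=0.05):
    """Stand-in for a blocking read (assumption: sleeps instead of doing real IO)."""
    time.sleep(delay)
    return record_id


def process(value):
    """Stand-in for the per-record CPU work."""
    return value * 2


def run_serial(ids):
    # One thread: every read blocks before the next record is touched.
    return [process(slow_read(i)) for i in ids]


def run_threaded(ids, workers=4):
    # Several threads: while one waits on a read, the others can proceed.
    # pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: process(slow_read(i)), ids))


def timed(fn, *args):
    # Return the function result together with its wall-clock duration.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    ids = list(range(8))
    serial_out, serial_s = timed(run_serial, ids)
    threaded_out, threaded_s = timed(run_threaded, ids)
    print(f"serial:   {serial_s:.2f}s")
    print(f"threaded: {threaded_s:.2f}s")
```

With the simulated delay, the threaded version should finish sooner because the waits overlap. If `delay` is set to 0, the reads stop being the bottleneck and the extra threads mostly add scheduling overhead, which is the trade-off described above.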