Almost all operations in Hive can exploit MapReduce for parallelism. (It isn't really done at the thread level.) Essentially, if you run a Hive job and there are multiple mappers or reducers, that is the parallelism.
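
For what it's worth, here is a rough sketch of the session settings that control this. The table and column names (page_views, clicks, user_id) are made up for illustration, and the values are only examples, not recommendations. hive.exec.parallel only helps when a query compiles into stages that do not depend on each other, such as the two branches of the UNION ALL below.

```sql
-- Allow independent stages of a single query to run at the same time.
SET hive.exec.parallel=true;
-- Cap on how many stages may run concurrently (8 is the documented default).
SET hive.exec.parallel.thread.number=8;
-- Ask for a fixed number of reducers instead of letting Hive estimate it.
SET mapred.reduce.tasks=4;

-- Hypothetical tables: each branch is its own MR job, so the two
-- aggregations can be launched in parallel.
SELECT u.user_id, u.cnt
FROM (
  SELECT user_id, count(1) AS cnt FROM page_views GROUP BY user_id
  UNION ALL
  SELECT user_id, count(1) AS cnt FROM clicks GROUP BY user_id
) u;
```

Within each of those jobs the parallelism still comes from the number of map and reduce tasks, not from threads inside a task.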

On Fri, Jun 22, 2012 at 5:14 AM, Jayanth Muthya <jayanthmut...@gmail.com> wrote:
> Thanks for clarifying, I'll look into it too and see if I can find anything.
>
> -Jayanth
>
> On Thu, Jun 21, 2012 at 10:47 PM, Jerome Banks <jer...@klout.com> wrote:
>
>> set hive.exec.parallel=true;
>>
>> This will run Hive jobs in parallel, if they are able to do so.
>>
>> As for multi-threading in the actual job itself, I don't think so, but I'm
>> not sure. The query planner will merge steps together, in order to try to
>> minimize the number of MR jobs needed to run a query, but I think those are
>> chained together in a single thread, on both the mapper and the reducer.
>>
>> When I was at Quantcast, we had some multi-threading in the mappers and
>> reducers, to try to increase throughput, by utilizing the CPU when the job
>> would otherwise be blocked on IO. This helps out if your IO is very slow,
>> but once the IO is no longer a bottleneck, you spend a lot of time
>> context-switching, and it is no longer efficient.
>>
>> Interesting question, I'll look into it some more. Let me know if you find
>> out anything.
>>
>> -- jerome
>>
>> On Thu, Jun 21, 2012 at 1:16 AM, Jayanth Muthya <jayanthmut...@gmail.com> wrote:
>>
>> > Hi,
>> > I was looking into some of the source code for Hive, and had a few
>> > questions regarding parallelism in Hive. Can a map task in
>> > Hive exploit parallelism and run multiple threads? If it can do that, does
>> > it do it by default, or does a user have to configure the settings?
>> > This question seems really basic, I just started looking into Hadoop/Hive.
>> > Thanks in advance!
>> >
>> > -Jay