set hive.exec.parallel=true;

This lets Hive run the jobs of a query in parallel, provided they do not depend on one another.
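As a rough sketch of where this setting might matter, here is a query with two branches that do not depend on each other. The table names `page_views` and `clicks` are made up for illustration, and `hive.exec.parallel.thread.number` is a companion setting that caps how many jobs run at once. Whether the two branches actually run concurrently depends on the plan Hive produces for your version and data.

```sql
-- Assumption: page_views and clicks are hypothetical tables, for illustration only.
SET hive.exec.parallel=true;
-- Upper bound on the number of jobs executed at the same time (8 is the usual default).
SET hive.exec.parallel.thread.number=8;

-- The two aggregations are independent of each other, so with the setting
-- above Hive may launch their jobs concurrently instead of one after the other.
SELECT t.src, t.cnt
FROM (
  SELECT 'views' AS src, COUNT(*) AS cnt FROM page_views
  UNION ALL
  SELECT 'clicks' AS src, COUNT(*) AS cnt FROM clicks
) t;
```

Running EXPLAIN on the query shows the stage dependencies, which is one way to check whether there is anything for Hive to parallelize in the first place.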
As for multi-threading within the actual job itself, I don't think so, but I'm not sure. The query planner will merge steps together to minimize the number of MR jobs needed to run a query, but I think those steps are chained together in a single thread, on both the mapper and the reducer.

When I was at Quantcast, we had some multi-threading in the mappers and reducers to increase throughput by using the CPU while the job would otherwise be blocked on IO. This helps if your IO is very slow, but once IO is no longer the bottleneck, you spend a lot of time context-switching and it is no longer efficient.

Interesting question, I'll look into it some more. Let me know if you find out anything.

-- jerome

On Thu, Jun 21, 2012 at 1:16 AM, Jayanth Muthya <jayanthmut...@gmail.com> wrote:

> Hi,
> I was looking into some of the source code for Hive and had a few
> questions regarding parallelism in Hive. Can a map task in
> Hive exploit parallelism and run multiple threads? If it can do that, does
> it do it by default, or does a user have to configure the settings?
> This question seems really basic; I just started looking into Hadoop/Hive.
> Thanks in advance!
>
> -Jay