Hi Min,

We recently added a capability to Hive to merge small output files.
You can disable that feature with:

set hive.merge.mapfiles=false;

Or you can adjust the following parameter to control when the additional merge job runs:

set hive.merge.size.per.task=256000000;

The default is 256MB, which means that if the average output of a mapper is smaller than 256MB, an additional job will run to merge the files. You can lower that number to something like 64MB if you want (for example, set hive.merge.size.per.task=64000000;).

Zheng

On Mon, Aug 3, 2009 at 8:02 PM, Min Zhou<[email protected]> wrote:
> I thought one map only job is ok. try
> hive> explain insert overwrite table tmp partition(dt=1) select bar, foo
> from pokes;
>
>
> Thanks,
> Min
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
>

--
Yours,
Zheng
