Got it. Thanks a lot, Zheng and Ashish!

Min

On Thu, Aug 6, 2009 at 2:59 AM, Ashish Thusoo <[email protected]> wrote:

> Not sure if this got answered. The second MR job in this case is for
> concatenating the outputs, so that the number of files generated is much
> smaller than the mapper parallelism. This has advantages for jobs that
> consume the data. This feature was added recently. You can, however, turn
> it off using the following configuration variable:
>
> hive.merge.mapfiles=false
>
> This is true by default.
>
> Ashish
> ------------------------------
> *From:* Min Zhou [mailto:[email protected]]
> *Sent:* Monday, August 03, 2009 8:02 PM
> *To:* hive-user
> *Subject:* why insert overwrite table tmp partition(dt=1) select bar, foo
> from pokes NEEDS 2 MR JOBS?
>
> I thought one map-only job would be enough. Try:
>
> hive> explain insert overwrite table tmp partition(dt=1) select bar, foo
> from pokes;
>
> Thanks,
> Min
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
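
For anyone reading this thread later, here is a minimal sketch of the session discussed above. It assumes the `tmp` and `pokes` tables from the original question already exist, and it only restates the setting Ashish mentioned; the plan you see may differ by Hive version.

```sql
-- Turn off the extra job that concatenates map-only output files
-- (per this thread, the setting is true by default).
set hive.merge.mapfiles=false;

-- Inspect the plan again for the same statement from the original question.
explain
insert overwrite table tmp partition(dt=1)
select bar, foo from pokes;
```

The `set` command applies only to the current session, so the default still holds for other jobs, which keep the smaller file count that benefits downstream consumers.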
