Re: why insert overwrite table tmp partition(dt=1) select bar, foo from pokes NEEDS 2 MR JOBS?

Min Zhou Wed, 05 Aug 2009 21:28:04 -0700

Got it. Thanks a lot, Zheng and Ashish!

Min


On Thu, Aug 6, 2009 at 2:59 AM, Ashish Thusoo <[email protected]> wrote:

> Not sure if this got answered. The second MR job in this case concatenates
> the outputs so that the number of files generated is much smaller than the
> number of mappers. This has advantages for jobs that consume the data. This
> feature was added recently. You can, however, turn it off using the following
> configuration variable:
>
> hive.merge.mapfiles=false
>
> It is set to true by default.
>
> Ashish
>  ------------------------------
> *From:* Min Zhou [mailto:[email protected]]
> *Sent:* Monday, August 03, 2009 8:02 PM
> *To:* hive-user
> *Subject:* why insert overwrite table tmp partition(dt=1) select bar, foo
> from pokes NEEDS 2 MR JOBS?
>
> I thought one map-only job would be enough. Try:
> hive> explain insert overwrite table tmp partition(dt=1) select bar, foo
> from pokes;
>
>
> Thanks,
> Min
> --
> My research interests are distributed systems, parallel computing and
> bytecode based virtual machine.
>
> My profile:
> http://www.linkedin.com/in/coderplay
> My blog:
> http://coderplay.javaeye.com
>
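
For anyone following the thread, here is a minimal sketch of how the setting Ashish describes could be checked from the Hive CLI. It assumes the same tmp and pokes tables as the original question and uses the CLI's set command to override the variable for the current session only. The expected result (a plan without the extra merge job) is inferred from the explanation above and has not been verified here.

```sql
-- Disable the merge of map-only job outputs for this session
-- (per the thread, the default is true).
set hive.merge.mapfiles=false;

-- Re-run the plan from the original question; with merging off,
-- the second MR job that concatenates output files should be absent.
explain insert overwrite table tmp partition(dt=1) select bar, foo
from pokes;
```

Leaving the setting at its default trades one extra job for fewer, larger output files, which is the advantage for downstream jobs mentioned above.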



