Yes, usually there is a single map-reduce job. The reason Hive says 2 map-reduce jobs is that there is a conditional task which merges many small output files into a smaller number of larger files. The conditional task may or may not run depending on the sizes of the output files, and it can also be disabled.
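(For reference, the merge behavior Zheng describes is controlled by settings along these lines; the property names below are from the Hive configuration of that era and are worth verifying against your version:)

```sql
-- Disable the conditional small-file merge after map-only jobs:
set hive.merge.mapfiles=false;
-- Disable it after map-reduce jobs:
set hive.merge.mapredfiles=false;
```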
Zheng

On Mon, Mar 1, 2010 at 10:34 AM, Weiwei Hsieh <[email protected]> wrote:
> Thank you all for your help! Please bear with me, I have one more
> question here:
>
> If I have table "t1 (id string, c1 string)" and "t2 (c1 string)", and I
> have a statement of "insert overwrite table t1 (c1) select c1 from t2",
> will this be one task? I need id for each record in t1.
>
> From: Carl Steinbach [mailto:[email protected]]
> Sent: Thursday, February 25, 2010 9:11 PM
> To: [email protected]
> Subject: Re: How to generate Row Id in Hive?
>
> Making JobConf accessible to UDFs is part of the plan behind HIVE-1016
> (Distributed Cache access for UDFs). I'll file a JIRA for a rowid() UDF
> and link it to this.
>
> Carl
>
> On Thu, Feb 25, 2010 at 9:00 PM, Zheng Shao <[email protected]> wrote:
>
> Not right now. It should be pretty simple to do though. We can expose
> the current JobConf via a static method in ExecMapper.
>
> Zheng
>
> On Thu, Feb 25, 2010 at 7:52 AM, Todd Lipcon <[email protected]> wrote:
>> Zheng: is there a way to get at the Hadoop conf variables from within a
>> query? If so, you could use mapred.task.id to get a unique string.
>> -Todd
>>
>> On Thu, Feb 25, 2010 at 12:42 AM, Zheng Shao <[email protected]> wrote:
>>>
>>> Since Hive runs many mappers/reducers in parallel, there is no way to
>>> generate a globally unique increasing row id.
>>> If you are OK with that, you can easily write a "non-deterministic"
>>> UDF. See rand() (or UDFRand.java) for an example.
>>>
>>> Please open a JIRA if you plan to work on that.
>>>
>>> Zheng
>>>
>>> On Wed, Feb 24, 2010 at 6:47 PM, Weiwei Hsieh <[email protected]>
>>> wrote:
>>> > All,
>>> >
>>> > Could anyone tell me how to generate a row id for a new record in
>>> > Hive?
>>> >
>>> > Many thanks.
>>> >
>>> > weiwei
>>>
>>> --
>>> Yours,
>>> Zheng
>>
>
> --
> Yours,
> Zheng

--
Yours,
Zheng
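(A minimal sketch of the scheme Todd suggests above: combine the unique per-task string, e.g. the value of mapred.task.id, with a per-task counter to get ids that are unique across parallel tasks, though not globally increasing. This is illustrative Python, not a real Hive UDF; an actual UDF would be written in Java and would read the task id from the JobConf once HIVE-1016 exposes it, and the attempt-id strings below are made up for the example.)

```python
import itertools

def make_rowid_generator(task_id):
    """Return a generator of row ids unique across parallel tasks,
    built from a unique per-task string plus a local counter."""
    counter = itertools.count()
    def rowid():
        return "%s_%d" % (task_id, next(counter))
    return rowid

# Two "parallel" tasks never collide because their task ids differ.
gen_a = make_rowid_generator("attempt_201003011034_0001_m_000000_0")
gen_b = make_rowid_generator("attempt_201003011034_0001_m_000001_0")
ids = [gen_a(), gen_a(), gen_b()]
```

The trade-off Zheng points out still holds: ids are unique but carry no global ordering across tasks.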
