Hi Namit,

Thanks a lot for your reply!
I checked the source code. Given the query (select tmp7.* from tmp7 join tmp2 on (tmp7.c2 = tmp2.c1)), only one MapReduce job is generated. As far as I know, setInputFormat is called in ExecDriver.java to set the job's InputFormat class, so I don't see how two different InputFormat classes could be set in one job. Or did I miss something here?

Thanks,

On Thu, Jul 1, 2010 at 10:00 AM, Namit Jain <[email protected]> wrote:
> That's fine
> The 2 tables can have different inputformats
>
> Sent from my iPhone
>
> On Jul 1, 2010, at 9:51 AM, "yan qi" <[email protected]> wrote:
>
> > Hi,
> >
> > I have a question about the JOIN operation in Hive.
> >
> > For example, I have a query, like
> >
> > select tmp7.* from tmp7 join tmp2 on (tmp7.c2 = tmp2.c1);
> >
> > Clearly, there is a JOIN involved in the statement.
> > 1. tmp2 and tmp7 are two tables.
> > 2. c2 and c1 are columns belonging to tmp7 and tmp2 respectively.
> >
> > I found that this query is executed in Hive with a MapReduce Job.
> > Therefore, I am wondering if tmp2 and tmp7 are both assumed to share
> > the same InputFormat class.
> >
> > What if tmp2 and tmp7 are using different InputFormat classes to
> > read records?
> >
> > Thanks,
> >
> > WS
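
One way a single job with one configured InputFormat class can still read two tables stored in different formats is for that class to be a wrapper that picks the real format per input path. The sketch below only illustrates the idea. It does not use the Hadoop or Hive APIs: ToyInputFormat, DelegatingInputFormat, and the /warehouse/... paths are hypothetical names made up for this example. Hive's own HiveInputFormat class appears to play a comparable wrapping role, but nothing in this thread confirms that.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only; none of these types are Hadoop or Hive classes.
class DelegatingInputFormatDemo {

    // Hypothetical stand-in for an input format: turns an input path into records.
    interface ToyInputFormat {
        List<String> read(String path);
    }

    // The job would be configured with this single class; it dispatches
    // each input path to the format registered for that table's directory.
    static class DelegatingInputFormat implements ToyInputFormat {
        private final Map<String, ToyInputFormat> formatByPathPrefix = new LinkedHashMap<>();

        void register(String pathPrefix, ToyInputFormat format) {
            formatByPathPrefix.put(pathPrefix, format);
        }

        @Override
        public List<String> read(String path) {
            for (Map.Entry<String, ToyInputFormat> entry : formatByPathPrefix.entrySet()) {
                // Treat the prefix as a directory so /warehouse/tmp7 does not match /warehouse/tmp70.
                if (path.startsWith(entry.getKey() + "/")) {
                    return entry.getValue().read(path);
                }
            }
            throw new IllegalArgumentException("No input format registered for " + path);
        }
    }

    // Reads one file from each table through the same wrapper instance.
    static List<String> readAll() {
        DelegatingInputFormat jobFormat = new DelegatingInputFormat();
        // Assumed table locations and formats, chosen only for the example.
        jobFormat.register("/warehouse/tmp7", path -> List.of("text:" + path));
        jobFormat.register("/warehouse/tmp2", path -> List.of("seq:" + path));

        List<String> records = new ArrayList<>();
        records.addAll(jobFormat.read("/warehouse/tmp7/part-0"));
        records.addAll(jobFormat.read("/warehouse/tmp2/part-0"));
        return records;
    }

    public static void main(String[] args) {
        System.out.println(readAll());
    }
}
```

In this sketch the job-level setting stays a single class, and the per-table difference lives in the path-to-format map that the wrapper consults.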
