Take a look at HiveInputFormat and CombineHiveInputFormat; these are the classes we wrap around your input formats so that Hive can read data from multiple input formats in the same job.

JVS

On Jul 1, 2010, at 10:16 AM, yan qi wrote:

Hi, Namit,

Thanks a lot for your reply!

I checked the source code. Given a query, (select tmp7.* from tmp7 join tmp2 on (tmp7.c2 = tmp2.c1)), there is only a MapReduce job generated. As far as I know, the function setInputFormat would be used to set the job's InputFormat class, in the ExecDriver.java. Then I didn't see any chance to set two different InputFormat classes in one job. Or did I miss something here?

Thanks,

On Thu, Jul 1, 2010 at 10:00 AM, Namit Jain <[email protected]> wrote:

That's fine
The 2 tables can have different inputformats

Sent from my iPhone

On Jul 1, 2010, at 9:51 AM, "yan qi" <[email protected]> wrote:

> Hi,
>
> I have a question about the JOIN operation in Hive.
>
> For example, I have a query, like
>
> select tmp7.* from tmp7 join tmp2 on (tmp7.c2 = tmp2.c1);
>
> Clearly, there is a JOIN involved in the statement.
> 1. tmp2 and tmp7 are two tables.
> 2. c2 and c1 are columns belonging to tmp7 and tmp2 respectively.
>
> I found that this query is executed in Hive with a MapReduce Job.
> Therefore, I am wondering if tmp2 and tmp7 are both assumed to share
> the same InputFormat class.
>
> What if tmp2 and tmp7 are using different InputFormat classes to
> read records?
>
> Thanks,
>
> WS
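
The wrapping idea from the reply at the top of this thread can be illustrated with a small sketch. This is not Hive's code (Hive is written in Java, and its actual classes are not reproduced here); it only shows the general pattern of registering one wrapper with the job and letting that wrapper pick a per-table reader based on the input path. Every name below (WrappingReader, read_tab_separated, read_comma_separated, the /warehouse/... paths, and the sample rows) is hypothetical and chosen for illustration only.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# A "reader" stands in for a table-specific input format: it turns the raw
# text of one file into rows. Both readers below are made-up examples.
Reader = Callable[[str], Iterator[List[str]]]


def read_tab_separated(text: str) -> Iterator[List[str]]:
    for line in text.splitlines():
        yield line.split("\t")


def read_comma_separated(text: str) -> Iterator[List[str]]:
    for line in text.splitlines():
        yield line.split(",")


class WrappingReader:
    """Hypothetical wrapper: the job sees only this one class, and it
    delegates to the reader registered for the directory a file lives in."""

    def __init__(self, reader_by_dir: Dict[str, Reader]) -> None:
        self.reader_by_dir = reader_by_dir

    def read(self, path: str, text: str) -> Iterator[Tuple[str, List[str]]]:
        # Try the longest directory first so nested directories resolve
        # to the most specific registration.
        for directory in sorted(self.reader_by_dir, key=len, reverse=True):
            if path.startswith(directory.rstrip("/") + "/"):
                for row in self.reader_by_dir[directory](text):
                    # Tag each row with its source so a later join step
                    # can tell the two tables apart.
                    yield directory, row
                return
        raise KeyError(f"no reader registered for {path}")


# Two tables stored in different formats, read through a single wrapper.
wrapper = WrappingReader({
    "/warehouse/tmp7": read_tab_separated,
    "/warehouse/tmp2": read_comma_separated,
})
rows = list(wrapper.read("/warehouse/tmp7/part-0", "a\tk1\nb\tk2"))
rows += list(wrapper.read("/warehouse/tmp2/part-0", "k1,x"))
```

In this sketch the job needs only one registered class, which is consistent with a single setInputFormat call, while each table's files are still parsed by their own reader.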
