Thanks Nitin, but to take care of that I had already cleaned the CSV files of leading and trailing spaces before putting them into HDFS. I also ran the dos2unix command on the CSV files.
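One way to double-check the cleanup described above is to scan the raw bytes of each CSV for leftover carriage returns and padded fields before uploading. The following is only a minimal sketch: the function name `find_dirty_lines` and the sample rows are hypothetical, and it assumes plain comma-separated fields with no quoting.

```python
def find_dirty_lines(raw: bytes, delimiter: bytes = b","):
    """Return (line_number, reason) pairs for lines that still contain a
    carriage return or a field with leading/trailing whitespace."""
    problems = []
    # Split on LF only, so a leftover CR from a Windows file stays visible.
    for number, line in enumerate(raw.split(b"\n"), start=1):
        if line == b"":
            continue  # skip empty lines, e.g. the piece after the final newline
        if b"\r" in line:
            problems.append((number, "carriage return"))
            # Drop the CR so it is not also reported as field padding.
            line = line.replace(b"\r", b"")
        for field in line.split(delimiter):
            if field != field.strip():
                problems.append((number, "padded field"))
                break  # one report per line is enough
    return problems


if __name__ == "__main__":
    # Hypothetical sample: row 1 keeps a CR, row 2 has a trailing space.
    sample = b"1,2011-02-01,3\r\n1,2011-03-01,4 \n2,2011-04-01,5\n"
    for number, reason in find_dirty_lines(sample):
        print(number, reason)
```

An empty result for every file would suggest that stray whitespace and line endings are not the cause, which fits the observation that the problem persists after cleaning.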
The joins work properly only if I define the external table with every field typed as STRING. Even when I load the data initially into a table with all STRING fields and later copy it to a different table with the proper data types, the joins still give wrong results on the new table.

On Wed, Aug 15, 2012 at 1:14 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:

> it might be the case that there are few empty spaces at the end of
> each row which are being handled when you are reading and writing from
> disc
>
> but when you set autoconvert then looks like one of these tables is
> really small and it is converted into mapside join
> which means the entire table is loaded into map memory and there is no
> need of reduce
>
> On Wed, Aug 15, 2012 at 9:13 PM, Himanish Kushary <himan...@gmail.com>
> wrote:
> > Hi,
> >
> > I have uploaded few csv files from windows into hive and configured few
> > external tables using them. When I am trying to run a join on two tables,
> > one of the int columns gets changed to 0. The structure of the tables is
> > as follows:
> >
> > Table-1      Table-2
> > -------      -------
> > Id(int)      id(int)   datetime     eid(int)
> > -------      -------   ----------   --------
> > 1            1         2011-02-01   3
> > 2            1         2011-03-01   4
> > 3            2         2011-04-01   5
> >              4         2011-05-01   6
> >              6         2011-06-01   7
> >
> > The join query is - select a.* from Table-2 a join Table-1 b on (a.id = b.id);
> >
> > The output is:
> >
> > 1   2011-02-01   0
> > 1   2011-03-01   0
> > 2   2011-04-01   0
> >
> > I checked the logs and noticed the following warning: WARN
> > org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct: Extra bytes
> > detected at the end of the row! Ignoring similar problems. Could this be
> > causing it?
> >
> > When I turn on hive.auto.convert.join=true, the error goes away as
> > there is no reduce phase. The output is:
> >
> > 1   2011-02-01   3
> > 1   2011-03-01   4
> > 2   2011-04-01   5
> >
> > Could somebody please help me figure out why we get the wrong results
> > when running through the reducer.
> >
> > --
> > Thanks
>
> --
> Nitin Pawar

--
Thanks & Regards
Himanish