In case anyone else comes across this: it looks like Sqoop automatically converts datetime columns to long/bigint when it writes Avro files.
I verified by inspecting one of the Avro files generated by Sqoop:

  java -jar avro-tools-1.7.4.jar tojson mydata.avro

On Sun, Sep 29, 2013 at 1:52 PM, Justin <[email protected]> wrote:
> Hello,
>
> I'm importing data from MySQL to HDFS with Sqoop.
>
> The import seems to go well.
>
> $ hadoop fs -ls /user/hive/warehouse/acme/mydb/mytable
> Found 66 items
> ...
> -rw-r--r--   3 cloudera hive  939604786 2013-09-29 11:20 /user/hive/warehouse/acme/mydb/mytable/part-m-00001.avro
> -rw-r--r--   3 cloudera hive  955864250 2013-09-29 08:19 /user/hive/warehouse/acme/mydb/mytable/part-m-00002.avro
> ...
>
> I then proceed to create my schema and create the table in Hive. It's a fairly simple schema. All fields are integers (defined as int or long), except one, and that is a datetime (MySQL data type) field.
>
> Everything (seemingly) goes off without a hitch until I try to query the data in Hive. Here's the error I get when trying to do a simple "select count(*) from mytable":
>
> java.io.IOException: java.io.IOException: org.apache.avro.AvroTypeException: Found long, expecting string
>
> Again, the only column this could be is my datetime field. Well, at least, it's the only string field. The field is defined like so:
>
> {"name": "datefound", "type": "string"}
>
> I have confirmed all the datetime values coming from MySQL are in the standard "yyyy-mm-dd hh:mm:ss" format.
>
> Why would my jobs be encountering an integer?
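
For anyone hitting the same thing, here is a rough workaround sketch (assuming the long values Sqoop writes for the datetime column are epoch milliseconds, and reusing the field/table names from this thread; adjust to your own).

Check the schema Sqoop embedded in the file:

  java -jar avro-tools-1.7.4.jar getschema mydata.avro

If datefound shows up there as long, declare it as long on the Hive side as well instead of string:

  {"name": "datefound", "type": "long"}

and convert back to a readable timestamp at query time (from_unixtime expects seconds, so divide the milliseconds by 1000):

  SELECT from_unixtime(CAST(datefound / 1000 AS BIGINT)) AS datefound_ts
  FROM mytable
  LIMIT 10;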
