Hello,
I'm importing data from MySQL to HDFS with Sqoop.
The import seems to go well.
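For reference, the import command looks roughly like this (host, credentials, and target directory are placeholders from memory; the .avro part files below are why I believe --as-avrodatafile is the relevant flag):

$ sqoop import \
    --connect jdbc:mysql://dbhost/mydb \
    --username sqoop_user -P \
    --table mytable \
    --as-avrodatafile \
    --target-dir /user/hive/warehouse/acme/mydb/mytable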
$ hadoop fs -ls /user/hive/warehouse/acme/mydb/mytable
Found 66 items
...
-rw-r--r--   3 cloudera hive  939604786 2013-09-29 11:20 /user/hive/warehouse/acme/mydb/mytable/part-m-00001.avro
-rw-r--r--   3 cloudera hive  955864250 2013-09-29 08:19 /user/hive/warehouse/acme/mydb/mytable/part-m-00002.avro
...
I then proceed to create my Avro schema and create the table in Hive. It's a
fairly simple schema: all fields are integers (defined as int or long),
except one, which is a datetime (MySQL data type) field.
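Roughly, the schema and table look like this (every field name except datefound is a stand-in, and the schema URL path is also a placeholder):

{
  "type": "record",
  "name": "mytable",
  "fields": [
    {"name": "id", "type": "int"},
    {"name": "total", "type": "long"},
    {"name": "datefound", "type": "string"}
  ]
}

CREATE EXTERNAL TABLE mytable
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/user/hive/warehouse/acme/mydb/mytable'
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/hive/schemas/mytable.avsc');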
Everything (seemingly) goes off without a hitch until I try to query the
data in Hive. Here's the error I get when running a simple "select
count(*) from mytable":
java.io.IOException: java.io.IOException:
org.apache.avro.AvroTypeException: Found long, expecting string
Again, the only column this could be is my datetime field. Well, at least,
it's the only string field. The field is defined like so:
{"name": "datefound", "type": "string" }
I have confirmed all the datetime values coming from MySQL are in the
standard "yyyy-mm-dd hh:mm:ss" format.
Why would my jobs be encountering a long where a string is expected?