Avro support doesn't take Java reserved words into account
----------------------------------------------------------
Key: SQOOP-429
URL: https://issues.apache.org/jira/browse/SQOOP-429
Project: Sqoop
Issue Type: Bug
Affects Versions: 1.4.0-incubating
Reporter: Lars Francke
We have a table with a column named {{class}}, which Sqoop renames to
{{_class}} internally. That works fine until Avro support comes into play.
The generated Avro schema still has a field called {{class}}, but
{{AvroImportMapper#toGenericRecord}} calls {{SqoopRecord#getFieldMap}}, which
returns the renamed column ({{_class}}). This leads to a
{{NullPointerException}} in {{GenericData$Record#put}} because it looks up a
field name that does not exist in the schema.
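To illustrate the failure mode, here is a minimal plain-Java sketch. It is a stand-in, not Avro's actual code: it assumes that a by-name {{put}} resolves the field in the schema and then uses the result without a null check. The class and method names ({{FieldLookupMismatch}}, {{failsWithNpe}}) and the {{id}} column are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FieldLookupMismatch {
    // Stand-in for the generated Avro schema: field name -> position.
    // The schema keeps the original column name "class".
    static final Map<String, Integer> SCHEMA_POSITIONS = new LinkedHashMap<>();
    static {
        SCHEMA_POSITIONS.put("id", 0);
        SCHEMA_POSITIONS.put("class", 1);
    }

    // Stand-in for a by-name put: resolve the position, then store the value.
    static void put(Object[] values, String fieldName, Object value) {
        Integer pos = SCHEMA_POSITIONS.get(fieldName); // null for unknown names
        values[pos] = value; // unboxing a null Integer throws NullPointerException
    }

    // Returns true if putting a value under fieldName fails with an NPE.
    static boolean failsWithNpe(String fieldName) {
        try {
            put(new Object[SCHEMA_POSITIONS.size()], fieldName, "x");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

With the renamed key {{_class}} the lookup finds nothing and the put fails; with the original name {{class}} it succeeds.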
I'm far from understanding Sqoop's internals, but there seem to be two
possible solutions: either change the generated Avro schema (probably an
easy but annoying fix), or check whether a field from the SqoopRecord was
renamed due to a reserved word and rename it back here. I'd love to
provide a patch, as we need this to work, but I don't know which approach
is preferred and I would need to do a bit of digging.
{code}
java.lang.NullPointerException
at org.apache.avro.generic.GenericData$Record.put(GenericData.java:58)
at org.apache.sqoop.mapreduce.AvroImportMapper.toGenericRecord(AvroImportMapper.java:68)
at org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:59)
at org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:43)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:183)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
{code}
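As a rough sketch of the second option described above (renaming a field back before it reaches the Avro record), the following plain-Java example may help the discussion. It is not a patch against Sqoop. All names here ({{ReservedWordFieldNames}}, {{toAvroFieldName}}, {{toAvroFieldMap}}) are hypothetical, the reserved-word set is a small subset for illustration, and the only renaming rule assumed is the one observed in this report ({{class}} becomes {{_class}}).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class ReservedWordFieldNames {
    // Illustrative subset of Java reserved words, not the full list.
    static final Set<String> JAVA_RESERVED = new HashSet<>(Arrays.asList(
        "class", "int", "new", "public", "return", "static", "void"));

    // Undo the assumed renaming: strip a leading underscore only when the
    // remainder is a reserved word; leave every other name untouched.
    static String toAvroFieldName(String javaIdentifier) {
        if (javaIdentifier.startsWith("_")
                && JAVA_RESERVED.contains(javaIdentifier.substring(1))) {
            return javaIdentifier.substring(1);
        }
        return javaIdentifier;
    }

    // Re-key a field map (as returned by something like getFieldMap) so its
    // keys match the field names used in the generated schema.
    static Map<String, Object> toAvroFieldMap(Map<String, Object> fieldMap) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : fieldMap.entrySet()) {
            result.put(toAvroFieldName(e.getKey()), e.getValue());
        }
        return result;
    }
}
```

One caveat with this approach: a table that really has a column named {{_class}} cannot be told apart from a renamed {{class}} by the name alone, so a real fix would probably need to track the original column names rather than guess from the prefix.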