[ 
https://issues.apache.org/jira/browse/SQOOP-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon updated SQOOP-1283:
-------------------------------

    Comment: was deleted

(was: Thanks Harsh! I'd prefer that Sqoop detect the format regardless of the 
file extension; it's one less thing for users to worry about. If the backing 
files already exist without .avro, having to transform a large table is 
annoying.

EDIT: I see you have posted a patch to do just that, thanks!)

> Export doesn't detect Avro files without .avro extension (ie created by Hive)
> -----------------------------------------------------------------------------
>
>                 Key: SQOOP-1283
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1283
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/postgresql, hive-integration
>    Affects Versions: 1.4.3
>         Environment: CDH 4.5
>            Reporter: Hari Sekhon
>
> When exporting to PostgreSQL, Sqoop does not detect Avro files that lack the 
> .avro extension (e.g. files named 000000_0 in HDFS because Hive created 
> them). It falls back to the unknown file type code path and uses the text 
> export mapper, which fails with a parse exception:
> java.io.IOException: Can't export data, please check failed map task logs 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) 
> at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) 
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) 
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) 
> at org.apache.hadoop.mapred.Child.main(Child.java:262) 
> Caused by: java.lang.RuntimeException: Can't parse input data: 
> 'Objavro.codecdeflateavro.schema�{"type":"record","name":"<scrubbed>","namespace":"<scrubbed>.avro","fields":[{"name":"pane
>  
> 14/02/03 17:13:52 INFO mapred.JobClient: Task Id : 
> attempt_201312101527_93532_m_000000_0, Status : FAILED 
> java.io.IOException: Can't export data, please check failed map task logs 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) 
> at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) 
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) 
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) 
> at org.apache.hadoop.mapred.Child.main(Child.java:262) 
> Thanks
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon
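
For illustration only, the content-based detection requested above could look 
something like the following sketch. It relies on the fact that an Avro object 
container file starts with the four bytes 'O', 'b', 'j', 0x01, which is 
consistent with the "Obj" prefix visible in the unparseable input in the trace. 
The class name AvroMagicSniffer and the method looksLikeAvro are hypothetical 
and are not taken from Sqoop or from the attached patch; the sketch also reads 
from a plain java.io.InputStream rather than from HDFS.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Sqoop: decides whether a stream looks like
// an Avro object container file by its leading bytes, ignoring the file name.
final class AvroMagicSniffer {
    // Avro object container files begin with 'O', 'b', 'j', followed by 0x01.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    static boolean looksLikeAvro(InputStream in) throws IOException {
        byte[] header = new byte[AVRO_MAGIC.length];
        int read = 0;
        // read() may return fewer bytes than requested, so loop until full.
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                return false; // stream shorter than the magic: not Avro
            }
            read += n;
        }
        return Arrays.equals(header, AVRO_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for a Hive-written file such as 000000_0.
        byte[] avroLike = {'O', 'b', 'j', 1, 4, 20};
        byte[] text = "1,foo,bar\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikeAvro(new ByteArrayInputStream(avroLike)));
        System.out.println(looksLikeAvro(new ByteArrayInputStream(text)));
    }
}
```

A check of this kind would presumably run once per input file before the 
export mapper is chosen, so that a file named 000000_0 is treated the same way 
as one named part-00000.avro.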



