[ 
https://issues.apache.org/jira/browse/SQOOP-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222235#comment-14222235
 ] 

Josh Wills commented on SQOOP-1780:
-----------------------------------

I think it has something to do with Parquet itself: if you look in 
DataDrivenImportJob, you'll see that Parquet is using the same 
AvroSchemaGenerator object that the Avro data file code uses.
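
To illustrate why a shared schema generator can still give different results 
downstream, here is a minimal sketch that tests the column names from the 
report against two naming rules. AVRO_NAME follows the name rule in the Avro 
specification (first character a letter or underscore). STRICT_NAME is only 
an assumption about what a stricter, Hive-oriented compatibility check might 
look like; it is not copied from Kite or Sqoop, and the class and method 
names are hypothetical.

```java
import java.util.regex.Pattern;

public class FieldNameCheck {
    // Avro specification: a name starts with [A-Za-z_] and continues
    // with [A-Za-z0-9_].
    static final Pattern AVRO_NAME =
        Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // ASSUMPTION: a hypothetical stricter rule that forbids a leading
    // underscore. This is illustrative only, not the real Kite check.
    static final Pattern STRICT_NAME =
        Pattern.compile("[A-Za-z0-9][A-Za-z0-9_]*");

    static boolean isAvroName(String name) {
        return AVRO_NAME.matcher(name).matches();
    }

    static boolean isStrictName(String name) {
        return STRICT_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        // "1QP" is the original MySQL column, "_1QP" the Sqoop-generated name.
        for (String name : new String[] {"1QP", "_1QP"}) {
            System.out.println(name
                + " avro=" + isAvroName(name)
                + " strict=" + isStrictName(name));
        }
    }
}
```

Under these assumptions, _1QP is a legal Avro name but fails the stricter 
rule, which would be consistent with the plain Avro import path accepting 
the generated name while the Parquet path rejects it.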

> Avro/Parquet schemas can't handle Sqoop-generated non-alphanumeric column 
> names
> -------------------------------------------------------------------------------
>
>                 Key: SQOOP-1780
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1780
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.5
>            Reporter: Josh Wills
>
> I was importing a MySQL table whose column names start with a digit 
> (1QP, 2QP, etc.). It looks like Sqoop prepends an underscore to those names 
> to make them compatible with Hive, but Parquet/Avro schemas can't handle the 
> non-alphanumeric character in a field name (or at least at the start of it), 
> and the import throws the following exception:
> {code}
> java.lang.IllegalStateException: Deprecated: field names are not alphanumeric (plus '_'): sqoop_import_team._1QP, sqoop_import_team._2QP, sqoop_import_team._3QP, sqoop_import_team._4QP
>       at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
>       at org.kitesdk.data.spi.Compatibility.checkSchema(Compatibility.java:119)
>       at org.kitesdk.data.spi.Compatibility.checkDescriptor(Compatibility.java:133)
>       at org.kitesdk.data.spi.hive.HiveManagedMetadataProvider.create(HiveManagedMetadataProvider.java:40)
>       at org.kitesdk.data.spi.hive.HiveManagedDatasetRepository.create(HiveManagedDatasetRepository.java:76)
>       at org.kitesdk.data.Datasets.create(Datasets.java:200)
>       at org.kitesdk.data.Datasets.create(Datasets.java:240)
>       at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:81)
>       at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:70)
> {code}



