[ https://issues.apache.org/jira/browse/SQOOP-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Mateus Pires updated SQOOP-3421:
---------------------------------------
    Description:
Hi there,

I'm trying to run the following to import an Oracle table into S3 as Parquet:

{code:bash}
sqoop import \
  --connect jdbc:oracle:thin:@//some.host:1521/ORCL \
  --where="rownum < 100" \
  --table SOME_SCHEMA.SOME_TABLE_NAME \
  --password some_password \
  --username some_username \
  --num-mappers 4 \
  --split-by PRD_ID \
  --target-dir s3n://bucket/destination \
  --temporary-rootdir s3n://bucket/temp/destination \
  --compress \
  --check-column PRD_MODIFY_DT \
  --incremental lastmodified \
  --map-column-java PRD_ATTR_TEXT=String \
  --append
{code}

Version of Kite is: kite-data-s3-1.1.0.jar
Version of Sqoop is: 1.4.7

And I'm getting the following error:

{code:text}
19/01/21 13:20:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM SOME_SCHEMA.SOME_TABLE_NAME t WHERE 1=0
19/01/21 13:20:34 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.dist/hive-site.xml
19/01/21 13:20:35 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.ValidationException: Dataset name 47a2cf963b82475d8eba78c822403204_SOME_SCHEMA.SOME_TABLE_NAME is not alphanumeric (plus '_')
org.kitesdk.data.ValidationException: Dataset name 47a2cf963b82475d8eba78c822403204_SOME_SCHEMA.SOME_TABLE_NAME is not alphanumeric (plus '_')
    at org.kitesdk.data.ValidationException.check(ValidationException.java:55)
    at org.kitesdk.data.spi.Compatibility.checkDatasetName(Compatibility.java:105)
    at org.kitesdk.data.spi.Compatibility.check(Compatibility.java:68)
    at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:209)
    at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:137)
    at org.kitesdk.data.Datasets.create(Datasets.java:239)
    at org.kitesdk.data.Datasets.create(Datasets.java:307)
    at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:156)
    at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:130)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:132)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:264)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
    at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:454)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:520)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
{code}

Importing as text file instead solves the issue

> Importing data from Oracle to Parquet as incremental dataset name fails
> -----------------------------------------------------------------------
>
>                 Key: SQOOP-3421
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3421
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.7
>            Reporter: Daniel Mateus Pires
>            Priority: Minor
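
For anyone reproducing this: judging only by the exception text, the dataset name seems to be a temporary prefix joined to the `--table` value, and the `.` between schema and table is what fails the "alphanumeric (plus '_')" check. The following is a minimal sketch of that rule as the message words it. It is not Kite's actual implementation, and `is_valid_dataset_name` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: approximates the "alphanumeric (plus '_')" rule quoted in the
# error message. This is an assumption, not Kite's real validation code.

# Hypothetical helper: succeeds only if the name is non-empty and contains
# nothing but A-Z, a-z, 0-9 and '_'.
is_valid_dataset_name() {
  case "$1" in
    '' | *[!A-Za-z0-9_]*) return 1 ;;
    *) return 0 ;;
  esac
}

prefix="47a2cf963b82475d8eba78c822403204"  # temporary prefix seen in the log

# First value is the --table argument from the report; the second is the
# same table without the schema qualifier, for comparison.
for table in "SOME_SCHEMA.SOME_TABLE_NAME" "SOME_TABLE_NAME"; do
  name="${prefix}_${table}"
  if is_valid_dataset_name "$name"; then
    echo "accepted: $name"
  else
    echo "rejected: $name"
  fi
done
```

Under this approximation only the schema-qualified name is rejected, which matches the error above. Whether Sqoop can be made to build the name without the schema qualifier is not something this sketch shows.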