Hi guys,

I am using Sqoop 1.4.5 to import some data from MySQL into Hive using this command:

sqoop import --connect jdbc:mysql://some.merck.com:1234/eqtl_gtex_raw 
--username XXX --password YYY --table adipose_subcutaneous --hcatalog-database 
mg_user_middlegate_benesp_mysql1 --hcatalog-table adipose_subcutaneous 
--hive-partition-key mg_version --hive-partition-value 2015-05-28-13-18 -m 1 
--verbose --fetch-size -2147483648

and it fails with this error:

2015-06-01 13:20:39,209 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.lang.NullPointerException
        at 
org.apache.hive.hcatalog.data.schema.HCatSchema.get(HCatSchema.java:105)
        at 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper.convertToHCatRecord(SqoopHCatImportHelper.java:194)
        at 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportMapper.map(SqoopHCatImportMapper.java:52)
        at 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportMapper.map(SqoopHCatImportMapper.java:34)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)


After some investigation it seems to be caused by hyphens in a column name. I
have patched the Sqoop jar to write more information into the log:


2015-06-01 13:15:49,337 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing schema 
fields...
2015-06-01 13:15:49,337 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Adding field 'mg_version'
2015-06-01 13:15:49,337 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Field count: 6
2015-06-01 13:15:49,347 INFO [main] 
org.apache.sqoop.mapreduce.db.DBRecordReader: Working on split: 1=1 AND 1=1
2015-06-01 13:15:49,360 INFO [main] 
org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: SELECT `SNP`, 
`gene`, `beta`, ` t-stat`, `p-value` FROM `adipose_subcutaneous` AS 
`adipose_subcutaneous` WHERE ( 1=1 ) AND ( 1=1 )
2015-06-01 13:15:49,657 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing  HCatRecord, 
listing schema fields ...
2015-06-01 13:15:49,657 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: snp
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: gene
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: beta
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field:  t-stat
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: p-value
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: mg_version
2015-06-01 13:15:49,664 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'SNP'
2015-06-01 13:15:49,664 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'beta'
2015-06-01 13:15:49,664 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'gene'
2015-06-01 13:15:49,664 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'p_value'
2015-06-01 13:20:39,209 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.lang.NullPointerException

According to this, the original DB column names are converted to lowercase and '-'
characters are replaced by Sqoop. The columns without hyphens are resolved
correctly (e.g. 'SNP' -> 'snp'), but the column with a hyphen (i.e. 'p-value' ->
'p_value') is not found in the schema.
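
To make the mismatch easier to see in isolation, here is a small self-contained
Java sketch. It is not the real Sqoop/HCatalog code, only my reading of the log
above, and the exact normalization rule is my assumption:

import java.util.HashMap;
import java.util.Map;

public class HyphenColumnMismatch {
    public static void main(String[] args) {
        // Field names exactly as the HCatalog schema lists them in the log above
        String[] hcatFields = {"snp", "gene", "beta", " t-stat", "p-value", "mg_version"};
        Map<String, Integer> positions = new HashMap<String, Integer>();
        for (int n = 0; n < hcatFields.length; n++) {
            positions.put(hcatFields[n], n);
        }

        // My assumption of the key Sqoop derives from the MySQL column name:
        // lower-cased, with '-' replaced by '_'
        String key = "p-value".toLowerCase().replace('-', '_'); // "p_value"

        Integer pos = positions.get(key); // null - the schema only knows "p-value"
        System.out.println(key + " -> " + pos);

        // Columns without hyphens still resolve, e.g. 'SNP' -> 'snp'
        System.out.println("SNP".toLowerCase() + " -> " + positions.get("SNP".toLowerCase()));

        // I assume HCatSchema.get(String) ends up unboxing the missing position,
        // so the lookup failure surfaces as a NullPointerException instead of a
        // readable "no such field" error.
        int unboxed = pos; // throws java.lang.NullPointerException here
        System.out.println(unboxed);
    }
}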

I am also attaching the Sqoop log and the job log.

Is this a known issue, and is there any workaround for it? This is meant to be a
generic import/ingest, so unfortunately I have no control over the column names
being ingested.
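
The only workaround I can think of myself is an untested sketch (I have not
verified that --query combines cleanly with --hcatalog-table in 1.4.5): alias the
hyphenated columns in a free-form query so that only underscore names reach
HCatalog, and create the HCatalog table with matching underscore column names,
e.g.:

sqoop import --connect jdbc:mysql://some.merck.com:1234/eqtl_gtex_raw
--username XXX --password YYY
--query 'SELECT `SNP`, `gene`, `beta`, ` t-stat` AS t_stat, `p-value` AS p_value FROM adipose_subcutaneous WHERE $CONDITIONS'
--hcatalog-database mg_user_middlegate_benesp_mysql1 --hcatalog-table adipose_subcutaneous
--hive-partition-key mg_version --hive-partition-value 2015-05-28-13-18 -m 1
--verbose --fetch-size -2147483648

Since the aliases would have to be generated per source table, though, this would
not really cover the generic ingest case, so a proper fix or a better workaround
would be much appreciated.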

Thanks,

Pavel


Log Type: syslog
Log Upload Time: Mon Jun 01 13:35:50 +0000 2015
Log Length: 6148
2015-06-01 13:15:46,891 WARN [main] 
org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: 
tried hadoop-metrics2-maptask.properties,hadoop-metrics2.properties
2015-06-01 13:15:46,953 INFO [main] 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 
10 second(s).
2015-06-01 13:15:46,953 INFO [main] 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system 
started
2015-06-01 13:15:46,963 INFO [main] org.apache.hadoop.mapred.YarnChild: 
Executing with tokens:
2015-06-01 13:15:46,963 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: 
mapreduce.job, Service: job_1433145248836_0011, Ident: 
(org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@23428b92)
2015-06-01 13:15:47,049 INFO [main] org.apache.hadoop.mapred.YarnChild: 
Sleeping for 0ms before retrying again. Got null now.
2015-06-01 13:15:47,316 INFO [main] org.apache.hadoop.mapred.YarnChild: 
mapreduce.cluster.local.dir for child: 
/media/ephemeral0/hadoop/yarn/local/usercache/ec2-user/appcache/application_1433145248836_0011,/media/ephemeral1/hadoop/yarn/local/usercache/ec2-user/appcache/application_1433145248836_0011
2015-06-01 13:15:47,909 INFO [main] 
org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
Instead, use dfs.metrics.session-id
2015-06-01 13:15:48,481 INFO [main] 
org.apache.hadoop.conf.Configuration.deprecation: mapred.output.dir is 
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2015-06-01 13:15:48,488 INFO [main] 
org.apache.hadoop.conf.Configuration.deprecation: mapred.work.output.dir is 
deprecated. Instead, use mapreduce.task.output.dir
2015-06-01 13:15:48,513 INFO [main] org.apache.hadoop.mapred.Task:  Using 
ResourceCalculatorProcessTree : [ ]
2015-06-01 13:15:48,978 INFO [main] 
org.apache.sqoop.mapreduce.db.DBInputFormat: Using read commited transaction 
isolation
2015-06-01 13:15:49,153 INFO [main] org.apache.hadoop.mapred.MapTask: 
Processing split: 1=1 AND 1=1
2015-06-01 13:15:49,184 INFO [main] 
org.apache.hadoop.conf.Configuration.deprecation: mapred.output.key.class is 
deprecated. Instead, use mapreduce.job.output.key.class
2015-06-01 13:15:49,188 INFO [main] 
org.apache.hadoop.conf.Configuration.deprecation: mapred.output.value.class is 
deprecated. Instead, use mapreduce.job.output.value.class
2015-06-01 13:15:49,337 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: HCatalog Storer Info1 : 
        Handler = null
        Input format class = org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
        Output format class = org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
        Serde class = org.apache.hadoop.hive.ql.io.orc.OrcSerde
Storer properties 
        transient_lastDdlTime=1432909549
        serialization.format=1

2015-06-01 13:15:49,337 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing schema 
fields...
2015-06-01 13:15:49,337 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Adding field 'mg_version'
2015-06-01 13:15:49,337 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Field count: 6
2015-06-01 13:15:49,347 INFO [main] 
org.apache.sqoop.mapreduce.db.DBRecordReader: Working on split: 1=1 AND 1=1
2015-06-01 13:15:49,360 INFO [main] 
org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: SELECT `SNP`, 
`gene`, `beta`, ` t-stat`, `p-value` FROM `adipose_subcutaneous` AS 
`adipose_subcutaneous` WHERE ( 1=1 ) AND ( 1=1 )
2015-06-01 13:15:49,657 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing converting 
HCatRecord, listing schema fields ...
2015-06-01 13:15:49,657 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: snp
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: gene
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: beta
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field:  t-stat
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: p-value
2015-06-01 13:15:49,663 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper:      Field: mg_version
2015-06-01 13:15:49,664 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'SNP'
2015-06-01 13:15:49,664 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'beta'
2015-06-01 13:15:49,664 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'gene'
2015-06-01 13:15:49,664 INFO [main] 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper: Processing key: 'p_value'
2015-06-01 13:20:39,209 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.lang.NullPointerException
        at 
org.apache.hive.hcatalog.data.schema.HCatSchema.get(HCatSchema.java:105)
        at 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper.convertToHCatRecord(SqoopHCatImportHelper.java:194)
        at 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportMapper.map(SqoopHCatImportMapper.java:52)
        at 
org.apache.sqoop.mapreduce.hcat.SqoopHCatImportMapper.map(SqoopHCatImportMapper.java:34)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

2015-06-01 13:20:39,215 INFO [main] org.apache.hadoop.mapred.Task: Runnning 
cleanup for the task
2015-06-01 13:20:39,230 INFO [main] 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping MapTask metrics 
system...
2015-06-01 13:20:39,231 INFO [main] 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system 
stopped.
2015-06-01 13:20:39,231 INFO [main] 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system 
shutdown complete.
