I found the following bugs related to using incremental import and direct mode together:
* [SQOOP-1078] - incremental import from database in direct mode
* [SQOOP-976] - Incorrect SQL when incremental criteria is text column

Is there any syntax/data you can share about the correct syntax and sequence of switches for using incremental import into hive using direct mode import from mysql databases?

Thanks,
Suhas.

---------- Forwarded message ----------
From: Suhas Satish <[email protected]>
Date: Thu, Oct 10, 2013 at 1:15 PM
Subject: Re: sqoop incremental import fails- Violation of unique constraint SQOOP_SESSIONS_UNQ
To: user <[email protected]>

sqoop1.4.3

sqoop job --create signup_log --import --connect jdbc:mysql://mydb/u1 --table signup_log --username u1 --password <password> --hive-import --hive-table signup_log --incremental append --check-column sid --last-value 3276 --direct

What I notice is that sqoop is not doing an incremental import but trying to do a full import from the beginning and fails because the map-reduce output directory already exists on hadoop file system. Is there a bug in sqoop command parsing of incremental import when the above parameters are used together?

13/10/04 00:10:36 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory signup_log already exists
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:134)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:926)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:885)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:885)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:573)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:603)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:173)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:151)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:221)
    at org.apache.sqoop.manager.DirectMySQLManager.importTable(DirectMySQLManager.java:92)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:403)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
    at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
    at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
13/10/04 00:10:36 DEBUG hsqldb.HsqldbJobStorage: Flushing current transaction
13/10/04 00:10:36 DEBUG hsqldb.HsqldbJobStorage: Closing connection
-----------------------------------------------------------------------------------------------------

Cheers,
Suhas.

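For reference, the Sqoop user guide writes saved-job definitions as "sqoop job --create <name> -- import ...", with a bare "--" and a space before the tool name. The sketch below only prints such a command, using the values from the message above, so it can be reviewed before anything is run. It is an illustration under assumptions, not a confirmed fix: "-P" (prompt for the password) stands in for the inline password, and "--direct" is deliberately left out so the incremental options can be tried on their own, since whether direct mode honors them in 1.4.3 is the open question in this thread (see SQOOP-1078).

```shell
#!/bin/sh
# Values copied from the message above.
JOB_NAME="signup_log"

# Print the job-creation command instead of running it.
# Note the space in "-- import": the job options and the tool name are
# separated by a bare "--".
build_create_cmd() {
    printf '%s ' \
        sqoop job --create "$JOB_NAME" \
        -- import \
        --connect jdbc:mysql://mydb/u1 \
        --table signup_log \
        --username u1 \
        -P \
        --hive-import --hive-table signup_log \
        --incremental append --check-column sid --last-value 3276
    printf '\n'
}

CREATE_CMD=$(build_create_cmd)
echo "$CREATE_CMD"
```

If the job created this way imports only rows with sid greater than 3276, adding "--direct" back as the single change would help isolate whether direct mode is what triggers the full import.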
On Mon, Sep 30, 2013 at 10:42 AM, Jarek Jarcec Cecho <[email protected]> wrote:

> Hi Suhas,
> would you mind sharing with us the Sqoop version, the Sqoop command line you used, and the entire log generated with the --verbose parameter?
>
> Jarcec
>
> On Fri, Sep 27, 2013 at 02:19:39PM -0700, Suhas Satish wrote:
> > Hi Sqoop users,
> > Does anyone know what's the fix for this?
> >
> > I'm trying to have incremental database changes imported into the metastore that hive uses.
> >
> > 3/09/25 16:50:00 INFO hive.HiveImport: Time taken: 2.053 seconds
> > 13/09/25 16:50:00 INFO hive.HiveImport: Hive import complete.
> > 13/09/25 16:50:00 INFO hive.HiveImport: Export directory is empty, removing it.
> > 13/09/25 16:50:00 INFO tool.ImportTool: Saving incremental import state to the metastore
> > 13/09/25 16:50:00 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Error communicating with database
> >     at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:426)
> >     at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.update(HsqldbJobStorage.java:445)
> >     at org.apache.sqoop.tool.ImportTool.saveIncrementalState(ImportTool.java:130)
> >     at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:418)
> >     at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
> >     at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
> >     at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
> >     at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> >     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> >     at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
> >     at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
> > Caused by: java.sql.SQLException: Violation of unique constraint SQOOP_SESSIONS_UNQ: duplicate value(s) for column(s) JOB_NAME,PROPNAME,PROPCLASS in statement [INSERT INTO SQOOP_SESSIONS (propval, job_name, propclass, propname) VALUES (?, ?, ?, ?)]
> >     at org.hsqldb.jdbc.Util.throwError(Unknown Source)
> >     at org.hsqldb.jdbc.jdbcPreparedStatement.executeUpdate(Unknown Source)
> >     at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.setV0Property(HsqldbJobStorage.java:707)
> >     at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:415)
> >     ... 12 more
> >
> > -------------------
>
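
The SQOOP_SESSIONS_UNQ violation above is raised while the saved job's incremental state is written back to the metastore, so a reasonable first step is to look at what the metastore already holds for that job. The sketch below uses the documented "sqoop job --list" and "sqoop job --show" subcommands. The run_or_print helper and the DRY_RUN variable are hypothetical conveniences introduced here, and the job name signup_log is assumed from the later message in this thread. It inspects state only and is not a known fix for the error.

```shell
#!/bin/sh
JOB_NAME="signup_log"      # assumed: the job name used with --create in this thread
DRY_RUN="${DRY_RUN:-1}"    # hypothetical switch: 1 prints commands, 0 runs them

# Print the command in dry-run mode, otherwise execute it unchanged.
run_or_print() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# List all saved jobs known to the metastore.
run_or_print sqoop job --list

# Show the stored parameters of one job, including its saved incremental state.
run_or_print sqoop job --show "$JOB_NAME"
```

Running it with DRY_RUN=0 on a host where sqoop is installed executes the two commands; comparing the stored last value before and after a failed run shows whether the state was saved at all.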
