Sure. Here are my commands; when I run two of them in parallel, I get the
exception shown below.
sqoop import \
  --connect jdbc:oracle:thin:@<<ORACLE DB DETAILS>> \
  --table <Table_name> \
  --where "date between '01-JAN-2013' and '30-JAN-2013'" \
  -m 1 \
  --hive-import \
  --hive-table <hive tablename> \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --null-string '\\N' \
  --null-non-string '\\N' \
  --hive-drop-import-delims
... (10 similar commands for the intervening months) ...
sqoop import \
  --connect jdbc:oracle:thin:@<<ORACLE DB DETAILS>> \
  --table <Table_name> \
  --where "date between '01-DEC-2013' and '31-DEC-2013'" \
  -m 1 \
  --hive-import \
  --hive-table <hive tablename> \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --null-string '\\N' \
  --null-non-string '\\N' \
  --hive-drop-import-delims
Exception:
14/08/14 12:04:57 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory <SCHEMA>.<TABLENAME> already exists
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:987)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:948)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:948)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:582)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:612)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:247)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:614)
    at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:436)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:506)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
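For what it's worth, the workaround I am currently trying is to give each
parallel session its own HDFS staging directory via --target-dir, so the jobs
stop colliding on the shared default output path. This is only a sketch: the
staging path below is made up, and I am assuming --target-dir also controls
the temporary directory Sqoop writes to before the Hive load.

# Sketch of one of the 12 parallel imports, with a per-month staging
# directory (path is hypothetical) so jobs no longer share an output dir.
sqoop import \
  --connect jdbc:oracle:thin:@<<ORACLE DB DETAILS>> \
  --table <Table_name> \
  --where "date between '01-JAN-2013' and '30-JAN-2013'" \
  -m 1 \
  --target-dir /user/<uid>/sqoop_staging/<Table_name>_201301 \
  --hive-import \
  --hive-table <hive tablename> \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --null-string '\\N' \
  --null-non-string '\\N' \
  --hive-drop-import-delims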
-----Original Message-----
From: Jarek Jarcec Cecho [mailto:[email protected]] On Behalf Of Jarek Jarcec Cecho
Sent: Thursday, August 14, 2014 11:41 AM
To: [email protected]
Subject: Re: Sqoop Import parallel sessions - Question
It would be helpful if you could share your complete Sqoop commands and the
exact exception with its stack trace.
Jarcec
On Aug 14, 2014, at 7:57 AM, Sethuramaswamy, Suresh
<[email protected]> wrote:
> Team,
>
> We had to initiate a Sqoop import for one month's worth of records per
> session, so to read a full year of data I need to run 12 such statements in
> parallel. While I do this,
>
> I keep getting the error that the <SCHEMA>.<TABLENAME> folder already
> exists. This is because all of these sessions run under the same uid, and
> each one writes to the same temporary MapReduce HDFS folder under that
> user's home directory until its import completes.
>
> Is there a better way to accomplish this?
>
>
> Thanks
> Suresh Sethuramaswamy
>