If we omit --columns, the source file has to match the database table in number of columns, column types, and column order.
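As a sketch of what relying on that default looks like (the connection string and export directory here are placeholders; the table name and "=" delimiter are taken from the commands later in the thread):

    sqoop export \
      --connect jdbc:oracle:thin:@db-host:1521/SID \
      --table SEARCH_TABLE \
      --username user --password '****' \
      --fields-terminated-by "=" \
      --export-dir /path/to/export/dir
    # No --columns: every line under --export-dir must supply all of
    # SEARCH_TABLE's columns, with matching types, in the table's order.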
On Fri, Jul 18, 2014 at 8:23 AM, Leonardo Brambilla <[email protected]> wrote:

Hello, some more findings: could this bug be related to my problem?
https://issues.apache.org/jira/browse/SQOOP-824
I know it says it's fixed since 1.4.3, but maybe the fix introduced some other case.
I still don't understand how to run the export command without specifying the --columns parameter. Can you tell me what the default behavior is when you omit --columns? Does the source file need to have the same column order as the target table?

Thanks

On Fri, Jul 18, 2014 at 12:46 AM, Leonardo Brambilla <[email protected]> wrote:

I think I found something. The Java class generated when using update-key differs from the one generated without update-key. The one that throws the exception never writes the fields that are not specified in the update-key. I also see (with --verbose) that when using --update-key there is an extra debug line that says:

14/07/17 22:53:27 DEBUG orm.ClassWriter: db write column order:
14/07/17 22:53:27 DEBUG orm.ClassWriter:   SEARCH_DATE

Below is the method generated for the command without --update-key:

    public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
        JdbcWritableBridge.writeTimestamp(SEARCH_DATE, 1 + __off, 93, __dbStmt);
        JdbcWritableBridge.writeString(SEARCH_TYPE, 2 + __off, 12, __dbStmt);
        JdbcWritableBridge.writeString(USER_AGENT, 3 + __off, 12, __dbStmt);
        JdbcWritableBridge.writeString(SRCH_KEYWORD, 4 + __off, 12, __dbStmt);
        JdbcWritableBridge.writeBigDecimal(SRCH_COUNT, 5 + __off, 2, __dbStmt);
        return 5;
    }

Below is the one generated for the command with --update-key:

    public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
        JdbcWritableBridge.writeTimestamp(SEARCH_DATE, 1 + __off, 93, __dbStmt);
        return 1;
    }

I tried to force export to use the properly generated class with the parameters "jar-file" and "class-name", but that didn't work; it is as if those params are not allowed in the export command. This is what I tried in order to force using the properly generated source:

    sqoop export \
      --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
      --update-key "SEARCH_DATE" \
      --columns $columns \
      --table $table --username $user --password $passwd \
      --fields-terminated-by "=" --export-dir $exportDir \
      --jar-file SEARCH_TABLE.jar --class-name SEARCH_TABLE
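That truncated write() lines up with the "Missing IN or OUT parameter at index:: 2" error in the task log below: the prepared UPDATE has more bind slots than the generated class fills. As a standalone illustration (this is not Sqoop's code; the connection values, table shape, and exact SQL are assumptions), the same driver error can be reproduced like this:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;

    public class MissingBindRepro {
        public static void main(String[] args) throws Exception {
            // Hypothetical connection values; any Oracle schema with a table
            // shaped like SEARCH_TABLE(SEARCH_DATE, SEARCH_TYPE, USER_AGENT)
            // will do.
            Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@db-host:1521/SID", "user", "passwd");

            // A plausible shape for the UPDATE Sqoop prepares: one placeholder
            // per non-key column plus one per --update-key column (the exact
            // SQL UpdateOutputFormat builds may differ).
            PreparedStatement ps = conn.prepareStatement(
                    "UPDATE SEARCH_TABLE SET SEARCH_TYPE = ?, USER_AGENT = ? "
                    + "WHERE SEARCH_DATE = ?");

            // Like the truncated write() above, fill only the first slot.
            ps.setTimestamp(1, new Timestamp(System.currentTimeMillis()));

            // The Oracle driver validates the bind row while adding the batch
            // entry (see processCompletedBindRow in the stack traces) and
            // throws: java.sql.SQLException: Missing IN or OUT parameter
            // at index:: 2
            ps.addBatch();
        }
    }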
On Thu, Jul 17, 2014 at 5:04 PM, Leonardo Brambilla <[email protected]> wrote:

Yes, the update-key is a subset of columns.

On Thu, Jul 17, 2014 at 4:16 PM, Gwen Shapira <[email protected]> wrote:

Does the update column appear in $columns? It should be in there.

On Thu, Jul 17, 2014 at 10:48 AM, Leonardo Brambilla <[email protected]> wrote:

Hi Gwen, thank you for replying.

I went to the data node's userlogs, and all I found in the syslog file is what I already posted:

2014-07-17 10:19:09,280 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2014-07-17 10:19:09,700 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2014-07-17 10:19:09,706 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@34c3a7c0
2014-07-17 10:19:10,266 INFO org.apache.sqoop.mapreduce.AutoProgressMapper: Auto-progress thread is finished. keepGoing=false
2014-07-17 10:19:10,476 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-07-17 10:19:10,537 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2014-07-17 10:19:10,537 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName elance for UID 666 from the native implementation
2014-07-17 10:19:10,539 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:elance cause:java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 2
2014-07-17 10:19:10,540 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 2
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 2
        at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
        at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
        at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
        at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
        ... 8 more
2014-07-17 10:19:10,543 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

There isn't more data than that.
Can you please check my sqoop command and validate that I'm using the proper arguments? The argument --columns is used in export to tell Sqoop the order in which it should read the columns from the file, right?
Does the last column need to have a delimiter too?
The source file should be OK; keep in mind that it works for insert but fails when I add the parameter --update-key.

Thanks
Leo
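To make those layout questions concrete (the values below are made up; the column names come from the generated write() method above, and "=" is the delimiter used in this thread): with --columns SEARCH_DATE,SEARCH_TYPE,USER_AGENT,SRCH_KEYWORD,SRCH_COUNT, each input line would need exactly five fields in that order, for example:

    2014-07-16 00:00:00.0=WEB=Mozilla/5.0=blue widgets=17

A line that parses to fewer fields than the statement's bind slots is exactly the kind of mismatch Gwen describes in her reply below.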
On Thu, Jul 17, 2014 at 1:52 PM, Gwen Shapira <[email protected]> wrote:

I can confirm that Sqoop export update works on Oracle, both with and without Oraoop.

The specific exception you are getting indicates that Oracle expects at least 4 columns of data, and the HDFS file may have fewer than that.

Can you double-check that the columns in Oracle and your data file match? And that you are using a correct delimiter?

And as Jarcec said, if you have access to the Task Tracker user logs for one of the mappers, you'll have much more detail to work with, for example the specific line that failed.

Gwen

On Thu, Jul 17, 2014 at 7:44 AM, Leonardo Brambilla <[email protected]> wrote:

Hello Jarek,

I'm getting back to this issue. I'm trying to fix it by using Oraoop, but that doesn't avoid the exception:
java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4

I ran a couple of tests and I can tell that the following command works to insert new rows:

    sqoop export \
      --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
      --columns $columns \
      --table $table --username $user --password $passwd \
      --fields-terminated-by "=" --export-dir $exportDir

But the following command (just added --update-key) throws an exception:

    sqoop export \
      --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
      --update-key "SEARCH_DATE" \
      --columns $columns \
      --table $table --username $user --password $passwd \
      --fields-terminated-by "=" --export-dir $exportDir

DB is Oracle 11.2.0.2.0
Sqoop is 1.4.4
Java 1.7
Oraoop 1.6
Oracle JDBC driver "ojdbc6.jar", implementation version 11.2.0.3.0

Like I said before, all the log I can get from the failed task I already posted here.

Can you confirm that Sqoop export update works on Oracle DBs?
Thanks in advance
Leo
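Rerunning the failing command with --verbose is a cheap diagnostic here (the flags below simply mirror the command above): it makes Sqoop log the generated class and the "db write column order" it will bind, which is how the truncated write() earlier in the thread was spotted:

    sqoop export --verbose \
      --connect jdbc:oracle:thin:@ddb04.local.com:1541/test04 \
      --update-key "SEARCH_DATE" \
      --columns $columns \
      --table $table --username $user --password $passwd \
      --fields-terminated-by "=" --export-dir $exportDir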
On Fri, May 16, 2014 at 4:51 PM, Jarek Jarcec Cecho <[email protected]> wrote:

Hi Leonardo,
sadly, the Sqoop output might not be that helpful in this case. Could you please share the failed map task log with us?

The easiest way to get it on Hadoop 1.x is to open the job tracker web interface, find the failed Sqoop job, and navigate to its failed map tasks.

Jarcec

On Tue, May 13, 2014 at 11:36:34AM -0300, Leonardo Brambilla wrote:

Hi Jarek, find below the full Sqoop-generated log. I went through all the cluster's nodes for this task's logs and there is nothing more than this same error. I really don't know what else to look for.

Thanks

Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
14/05/13 10:26:41 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/05/13 10:26:41 INFO manager.SqlManager: Using default fetchSize of 1000
14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to GMT
14/05/13 10:26:41 INFO tool.CodeGenTool: Beginning code generation
14/05/13 10:26:41 INFO manager.OracleManager: Time zone has been set to GMT
14/05/13 10:26:41 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM etl.EXPT_SPAM_RED_JOB t WHERE 1=0
14/05/13 10:26:41 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/elance/hadoop
Note: /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl_EXPT_SPAM_RED_JOB.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/05/13 10:26:42 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-elance/compile/9f8f413ab105fbe67d985bdb29534d27/etl.EXPT_SPAM_RED_JOB.jar
14/05/13 10:26:42 INFO mapreduce.ExportJobBase: Beginning export of etl.EXPT_SPAM_RED_JOB
14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to process : 1
14/05/13 10:26:43 INFO input.FileInputFormat: Total input paths to process : 1
14/05/13 10:26:44 INFO mapred.JobClient: Running job: job_201404190827_0998
14/05/13 10:26:45 INFO mapred.JobClient:  map 0% reduce 0%
14/05/13 10:26:53 INFO mapred.JobClient:  map 25% reduce 0%
14/05/13 10:26:54 INFO mapred.JobClient:  map 75% reduce 0%
14/05/13 10:26:55 INFO mapred.JobClient: Task Id : attempt_201404190827_0998_m_000001_0, Status : FAILED
java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:184)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
        at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
        at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
        at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
        at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:181)
        ... 8 more
14/05/13 10:27:00 INFO mapred.JobClient: Task Id : attempt_201404190827_0998_m_000001_1, Status : FAILED
java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
        [stack trace identical to attempt _0 above]

14/05/13 10:27:05 INFO mapred.JobClient: Task Id : attempt_201404190827_0998_m_000001_2, Status : FAILED
java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
        [stack trace identical to attempt _0 above]
14/05/13 10:27:13 INFO mapred.JobClient: Job complete: job_201404190827_0998
14/05/13 10:27:13 INFO mapred.JobClient: Counters: 20
14/05/13 10:27:13 INFO mapred.JobClient:   Job Counters
14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=30548
14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/05/13 10:27:13 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/05/13 10:27:13 INFO mapred.JobClient:     Rack-local map tasks=5
14/05/13 10:27:13 INFO mapred.JobClient:     Launched map tasks=7
14/05/13 10:27:13 INFO mapred.JobClient:     Data-local map tasks=2
14/05/13 10:27:13 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
14/05/13 10:27:13 INFO mapred.JobClient:     Failed map tasks=1
14/05/13 10:27:13 INFO mapred.JobClient:   File Output Format Counters
14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Written=0
14/05/13 10:27:13 INFO mapred.JobClient:   FileSystemCounters
14/05/13 10:27:13 INFO mapred.JobClient:     HDFS_BYTES_READ=459
14/05/13 10:27:13 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=189077
14/05/13 10:27:13 INFO mapred.JobClient:   File Input Format Counters
14/05/13 10:27:13 INFO mapred.JobClient:     Bytes Read=0
14/05/13 10:27:13 INFO mapred.JobClient:   Map-Reduce Framework
14/05/13 10:27:13 INFO mapred.JobClient:     Map input records=0
14/05/13 10:27:13 INFO mapred.JobClient:     Physical memory (bytes) snapshot=363053056
14/05/13 10:27:13 INFO mapred.JobClient:     Spilled Records=0
14/05/13 10:27:13 INFO mapred.JobClient:     CPU time spent (ms)=2110
14/05/13 10:27:13 INFO mapred.JobClient:     Total committed heap usage (bytes)=553517056
14/05/13 10:27:13 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2344087552
14/05/13 10:27:13 INFO mapred.JobClient:     Map output records=0
14/05/13 10:27:13 INFO mapred.JobClient:     SPLIT_RAW_BYTES=404
14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Transferred 459 bytes in 30.0642 seconds (15.2673 bytes/sec)
14/05/13 10:27:13 INFO mapreduce.ExportJobBase: Exported 0 records.
14/05/13 10:27:13 ERROR tool.ExportTool: Error during export: Export job failed!

On Mon, May 12, 2014 at 10:44 PM, Jarek Jarcec Cecho <[email protected]> wrote:

The map task logs contain the entire executed query and a lot of additional information, and hence they're super useful in such cases.

Jarcec

On Mon, May 12, 2014 at 02:59:56PM -0300, Leonardo Brambilla wrote:

Hi Jarek,

thanks for replying. I don't have the logs. I'll see if I can run the task again and then keep the logs.

Anyway, I don't remember seeing anything else than this SQLException about the missing parameter.
Leo

On Sun, May 11, 2014 at 10:59 AM, Jarek Jarcec Cecho <[email protected]> wrote:

Hi Leonardo,
would you mind sharing with us the task log from the failed map task?

Jarcec

On Sun, May 11, 2014 at 10:33:11AM -0300, Leonardo Brambilla wrote:

Hello, I am struggling to make this work, and it is a really required feature.

I have a process that generates new data daily; this data needs to be pushed to a table in Oracle, and the table might already have the same data from previous loads. I need to avoid duplicating data in it. Pretty common scenario, right? =)

I am using sqoop export for this, with no special arguments, just columns, fields-terminated-by, table, and db connection, plus the argument "update-mode allowinsert".

Now, when I also include the argument "update-key" with a comma-separated list of fields (which is the same list as for the columns argument), I get the following Oracle driver error:

14/05/07 16:00:03 INFO mapred.JobClient: Task Id : attempt_201404190827_0928_m_000003_2, Status : FAILED
java.io.IOException: Can't export data, please check task tracker logs
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 4
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220)
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84)
        ... 10 more
*Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 4*
        at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1844)
        at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10213)
        at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1362)
        at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:149)
        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:215)
        ... 14 more

I'm using Sqoop 1.4.3 with hadoop1, and also tried 1.4.4 with the same result. I have the standard Oracle JDBC driver 6 with Java 7.
I went through all the documentation; the Sqoop user guide says this is supported for the built-in connector, which I understand I am using.

Here is the full command:

    $sqoopExecutable export \
      --outdir $outdir \
      --connect $connectionString --table $table_client --username $dbUser --password $dbUserPasswd \
      --columns CLIENT_ID,EXP_ID,BUCKET_ID --update-key CLIENT_ID,EXP_ID,BUCKET_ID \
      --fields-terminated-by '\t' --update-mode allowinsert \
      --export-dir $dataSource_client > $sqoopLog 2>&1

Can someone please shed some light on this?
Thank you in advance.

Leo
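For completeness, a hypothetical reworking of that command along the lines of the answer at the top of the thread, dropping --columns so the file layout is matched against the table definition (a sketch of the suggestion, not a verified fix for the exception):

    $sqoopExecutable export \
      --outdir $outdir \
      --connect $connectionString --table $table_client \
      --username $dbUser --password $dbUserPasswd \
      --update-key CLIENT_ID,EXP_ID,BUCKET_ID \
      --fields-terminated-by '\t' --update-mode allowinsert \
      --export-dir $dataSource_client > $sqoopLog 2>&1
    # Without --columns, each input line must carry every column of
    # $table_client, with matching types, in the table's column order.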
