Hi Andre,
contributions are greatly appreciated! Just open a JIRA [1], enhance the 
troubleshooting guide [2] and attach the patch!

Jarcec

Links:
1: https://issues.apache.org/jira/browse/SQOOP
2: https://github.com/apache/sqoop/blob/trunk/src/docs/user/troubleshooting.txt

On Tue, Jul 16, 2013 at 06:47:46AM +1000, Andre Araujo wrote:
> Hi, Jarek,
> 
> I believe it certainly would. Is that something I can help with?
> 
> Andre
> 
> 
> 
> On 16 July 2013 02:55, Jarek Jarcec Cecho <[email protected]> wrote:
> 
> > Hi Andre,
> > thank you for the blog post. Do you think that it would be helpful to put
> > the additional information into the Sqoop Troubleshooting guide?
> >
> >
> > http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_oracle_connection_reset_errors
> >
> > Jarcec
> >
> > On Mon, Jul 15, 2013 at 01:12:27PM +1000, Andre Araujo wrote:
> > > Thanks, David.
> > >
> > > My blog post is pending revision and should be published soon. I'll post
> > > the final link when it does.
> > > For the time being, please see below a copy of it without the formatting.
> > > What worked for me was a combination of two things:
> > >
> > >    - passing the "-D
> > >    mapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom""
> > >    parameter to sqoop
> > >    - setting the java.security.egd parameter in the HADOOP_OPTS variable,
> > >    so that it was passed to "${HADOOP_COMMON_HOME}/bin/hadoop
> > >     org.apache.sqoop.Sqoop"
> > >
> > > Regards,
> > > Andre
> > >
> > > -----------------------------------
> > >
> > > I’ve been using Sqoop to load data into HDFS from Oracle. I’m using
> > version
> > > 1.4.3 of Sqoop, running on a Linux machine and using the Oracle JDBC
> > driver
> > > with JDK 1.6.
> > >
> > > I was getting intermittent connection resets when trying to import data.
> > > After much troubleshooting, I eventually found the problem to be related
> > to
> > > a known issue with the JDBC driver and found a way to work around it,
> > which
> > > is described in the post
> > >
> > >
> > > The problem
> > >
> > > I noticed that when I was importing data at times where the machine I was
> > > running the sqoop client at was mostly idle, everything would run just
> > > fine. However, at times when others started to work on the same machine
> > and
> > > it became a bit busier, I would start to get the errors below
> > > intermittently:
> > >
> > > [araujo@client ~]$ sqoop import --connect jdbc:oracle:thin:user/pwd@host
> > /orcl
> > > -m 1 --query 'select 1 from dual where $CONDITIONS' --target-dir test
> > > 13/07/12 09:35:39 INFO manager.SqlManager: Using default fetchSize of
> > 1000
> > > 13/07/12 09:35:39 INFO tool.CodeGenTool: Beginning code generation
> > > 13/07/12 09:37:53 ERROR manager.SqlManager: Error executing statement:
> > > java.sql.SQLRecoverableException: IO Error: Connection reset
> > >  at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:467)
> > > at oracle.jdbc.driver.PhysicalConnection.(PhysicalConnection.java:546)
> > >         ...
> > > Caused by: java.net.SocketException: Connection reset
> > > at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
> > > at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> > >  ... 24 more
> > > After some troubleshooting and googling, I found that the problem seemed
> > to
> > > be related to the issue described in the following articles:
> > >
> > >
> > http://stackoverflow.com/questions/2327220/oracle-jdbc-intermittent-connection-issue/
> > > https://forums.oracle.com/message/3701989/
> > >
> > > Confirming the problem
> > >
> > > To ensure the problem was the same as the one described in the articles,
> > > and not something else intrinsic to Sqoop, I created a small Java program
> > > that simply connected to the database. I was able to reproduce the issue
> > > using it:
> > >
> > > [araujo@client TestConn]$ time java TestConn
> > > Exception in thread "main" java.sql.SQLRecoverableException: IO Error:
> > > Connection reset
> > > ...
> > > Caused by: java.net.SocketException: Connection reset
> > > ...
> > > ... 8 more
> > >
> > > real 1m20.481s
> > > user 0m0.491s
> > > sys 0m0.051s
> > > The workaround suggested in the articles also worked:
> > >
> > > [araujo@client TestConn]$ time java
> > > -Djava.security.egd=file:/dev/../dev/urandom TestConn
> > > Connection successful!
> > >
> > > real 0m0.419s
> > > user 0m0.498s
> > > sys 0m0.036s
> > > Applying the fix to Sqoop
> > >
> > > It took me a while to figure out how to use the workaround above with
> > > Sqoop. Many tentatives to specify the parameter in the Sqoop command
> > line,
> > > in many different forms, didn’t work as expected.
> > >
> > > The articles mention that the java.security.egd parameter can be
> > centrally
> > > set in the $JAVA_HOME/jre/lib/security/java.security file. Unfortunately,
> > > this didn’t work for me. Using strace, I confirmed that Sqoop was
> > actually
> > > reading the java.security file but the setting just didn’t take effect. I
> > > couldn’t figure out why not and eventually gave up that alternative.
> > >
> > > After a bit more of stracing and troubleshooting, though, I finally
> > figured
> > > a way out.
> > >
> > > Sqoop seems to use the JDBC driver in two different ways:
> > >
> > > First, it connects to the Oracle database directly. It does that to
> > gather
> > > more information about the tables (or query) from where the data is
> > > extracted and generate the map reduce job that it will run.
> > > Second, the map reduce job generated by Sqoop uses the JDBC driver to
> > > connect to the database and perform the actual data import.
> > > I was hitting the problem in the first case above, but I believe in both
> > > cases there’s a potential for the problem to occur. So, ideally, we
> > should
> > > apply the workaround to both cases.
> > >
> > > The Sqoop documentation clearly gives us an option to address the second
> > > case: using the following parameter to Sqoop allows us to pass Java
> > command
> > > line options to the map reduce job:
> > >
> > > sqoop import -D
> > > mapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom" ...
> > > Even though I couldn’t fully prove the above, since I couldn’t
> > consistently
> > > reproduce the problem for the map reduce tasks, I believe (and hope) it
> > > should work well.
> > >
> > > The Sqoop direct connection to Oracle
> > >
> > > The problem with the direct connection from Sqoop to Oracle, though,
> > wasn’t
> > > resolved by that option. Trying to pass the
> > > “-Djava.security.egd=file:/dev/../dev/urandom” option directly to Sqoop
> > > didn’t work either.
> > >
> > > After digging up a bit I found that the sqoop command eventually calls
> > > ${HADOOP_COMMON_HOME}/bin/hadoop to execute the org.apache.sqoop.Sqoop
> > > class. Since the hadoop executable is used, it accepts Java command line
> > > options through the HADOOP_OPTS environment variable.
> > >
> > > A quick test confirmed that the case was closed:
> > >
> > > [araujo@client STimport]$ export
> > > HADOOP_OPTS=-Djava.security.egd=file:/dev/../dev/urandom
> > > [araujo@client STimport]$ sqoop import -D
> > > mapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom"
> > > --connect jdbc:oracle:thin:user/pwd@host/orcl -m 1 --query 'select 1
> > from
> > > dual where $CONDITIONS' --target-dir test
> > > 13/07/12 10:08:17 INFO manager.SqlManager: Using default fetchSize of
> > 1000
> > > 13/07/12 10:08:17 INFO tool.CodeGenTool: Beginning code generation
> > > 13/07/12 10:08:18 INFO manager.OracleManager: Time zone has been set to
> > GMT
> > > 13/07/12 10:08:18 INFO manager.SqlManager: Executing SQL statement:
> > select
> > > 1 from dual where  (1 = 0)
> > > 13/07/12 10:08:18 INFO manager.SqlManager: Executing SQL statement:
> > select
> > > 1 from dual where  (1 = 0)
> > > 13/07/12 10:08:18 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
> > >
> > /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/bin/../lib/hadoop-0.20-mapreduce
> > > 13/07/12 10:08:18 INFO orm.CompilationManager: Found hadoop core jar at:
> > >
> > /opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/bin/../lib/hadoop-0.20-mapreduce/hadoop-core.jar
> > > Note:
> > >
> > /tmp/sqoop-araujo/compile/02ed1ccf04debf4769910b93ca67d2ba/QueryResult.java
> > > uses or overrides a deprecated API.
> > > Note: Recompile with -Xlint:deprecation for details.
> > > 13/07/12 10:08:19 INFO orm.CompilationManager: Writing jar file:
> > >
> > /tmp/sqoop-araujo/compile/02ed1ccf04debf4769910b93ca67d2ba/QueryResult.jar
> > > 13/07/12 10:08:19 INFO mapreduce.ImportJobBase: Beginning query import.
> > > 13/07/12 10:08:19 WARN mapred.JobClient: Use GenericOptionsParser for
> > > parsing the arguments. Applications should implement Tool for the same.
> > > 13/07/12 10:08:21 INFO mapred.JobClient: Running job:
> > job_201306141710_0075
> > > 13/07/12 10:08:22 INFO mapred.JobClient:  map 0% reduce 0%
> > > 13/07/12 10:08:31 INFO mapred.JobClient:  map 100% reduce 0%
> > > 13/07/12 10:08:33 INFO mapred.JobClient: Job complete:
> > job_201306141710_0075
> > > 13/07/12 10:08:33 INFO mapred.JobClient: Counters: 23
> > > 13/07/12 10:08:33 INFO mapred.JobClient:   File System Counters
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of bytes read=0
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of bytes
> > > written=179438
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of read
> > > operations=0
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of large read
> > > operations=0
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of write
> > > operations=0
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of bytes
> > read=87
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of bytes
> > written=2
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of read
> > > operations=1
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of large read
> > > operations=0
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of write
> > > operations=1
> > > 13/07/12 10:08:33 INFO mapred.JobClient:   Job Counters
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Launched map tasks=1
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all maps
> > > in occupied slots (ms)=7182
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all
> > > reduces in occupied slots (ms)=0
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all maps
> > > waiting after reserving slots (ms)=0
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all
> > > reduces waiting after reserving slots (ms)=0
> > > 13/07/12 10:08:33 INFO mapred.JobClient:   Map-Reduce Framework
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Map input records=1
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Map output records=1
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Input split bytes=87
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Spilled Records=0
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     CPU time spent (ms)=940
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Physical memory (bytes)
> > > snapshot=236580864
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Virtual memory (bytes)
> > > snapshot=34998603776
> > > 13/07/12 10:08:33 INFO mapred.JobClient:     Total committed heap usage
> > > (bytes)=1013252096
> > > 13/07/12 10:08:33 INFO mapreduce.ImportJobBase: Transferred 2 bytes in
> > > 14.4198 seconds (0.1387 bytes/sec)
> > > 13/07/12 10:08:33 INFO mapreduce.ImportJobBase: Retrieved 1 records.
> > > - See more at:
> > >
> > http://www.pythian.com/blog/?p=56405&preview=true&preview_id=56405&preview_nonce=7dfd5e1733#sthash.mUjOwAdP.dpuf
> > >
> > >
> > >
> > >
> > >
> > > On 15 July 2013 12:06, David Robson <[email protected]>
> > wrote:
> > >
> > > >  Hi Andre,****
> > > >
> > > > ** **
> > > >
> > > > I sent this reply earlier – but I don’t think it went through to the
> > list
> > > > – so sorry if this ends up being a repost (I also BCCd you).****
> > > >
> > > > ** **
> > > >
> > > > I had a similar issue – you can see more details here:
> > > > https://questmos.jira.com/browse/ORAOOP-2****
> > > >
> > > > ** **
> > > >
> > > > Basically if you use OraOop 1.6 – it should set this for you
> > > > automatically. The only problem is if someone has set the parameter as
> > > > final on the server – then obviously the job can’t override it.****
> > > >
> > > > ** **
> > > >
> > > > I think the reason it’s not working for you is you need to set it as
> > > > follows:****
> > > >
> > > > ** **
> > > >
> > > > -Dmapred.child.java.opts="-Djava.security.egd=file:///dev/urandom"****
> > > >
> > > > ** **
> > > >
> > > > You could try that – or download OraOop from here:
> > > > http://www.toadworld.com/m/freeware/565.aspx****
> > > >
> > > > ** **
> > > >
> > > > It is Apache 2.0 licensed and the source code is available
> > > > https://github.com/QuestSoftwareTCD/OracleSQOOPconnector****
> > > >
> > > > ** **
> > > >
> > > > *David Robson*****
> > > >
> > > > Software Developer****
> > > >
> > > > *Dell** *| R&D, Quest Software****
> > > >
> > > > *office *+61 3 9811 8082****
> > > >
> > > > ** **
> > > >
> > > > Quest Software is now part of Dell****
> > > >
> > > > ** **
> > > >
> > > > *From:* Andre Araujo [mailto:[email protected]]
> > > > *Sent:* Monday, 15 July 2013 10:30 AM
> > > > *To:* [email protected]
> > > > *Subject:* Re: Intermittent problems with sqoop using Oracle JDBC
> > driver**
> > > > **
> > > >
> > > > ** **
> > > >
> > > > Hi, Jarek,****
> > > >
> > > > ** **
> > > >
> > > > I'll fix that and reply to this email when the link is working.****
> > > >
> > > > ** **
> > > >
> > > > Thanks for pointing that out.****
> > > >
> > > > ** **
> > > >
> > > > Andre****
> > > >
> > > > ** **
> > > >
> > > > On 13 July 2013 01:35, Jarek Jarcec Cecho <[email protected]>
> > wrote:****
> > > >
> > > > Hi Andre,
> > > > thank you for sharing the solution with us! The link do not seem to be
> > > > valid, would you mind checking if it's correct?
> > > >
> > > > Jarcec****
> > > >
> > > >
> > > > On Sat, Jul 13, 2013 at 01:05:19AM +1000, Andre Araujo wrote:
> > > > > Hi, all,
> > > > >
> > > > > I've found the answer for my own question.
> > > > > The solution is explained here:
> > > > > http://www.pythian.com/blog/?p=56405&preview=true
> > > > >
> > > > > Regards,
> > > > > Andre
> > > > >
> > > > >
> > > > >
> > > > > On 12 July 2013 15:57, Andre Araujo <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > While using sqoop to import data from an Oracle database I was
> > noticing
> > > > > > intermittent problems with sqoop hanging. After some
> > troubleshooting I
> > > > > > narrowed the problem down to the issue described in the following
> > > > pages:
> > > > > >
> > > > > >
> > > > > >
> > > >
> > http://stackoverflow.com/questions/2327220/oracle-jdbc-intermittent-connect=
> > > > > > ion-issue
> > > > > > https://forums.oracle.com/message/3701989
> > > > > >
> > > > > > The problem seems to be directly related to the Oracle JDBC
> > driver. As
> > > > per
> > > > > > the articles above, I can tell the driver to use /dev/urandom by
> > using
> > > > the
> > > > > > option -Djava.security.egd. I tested it with a standalone Java
> > program
> > > > > > connecting to the database using the JDBC driver and it works well.
> > > > > >
> > > > > > sqoop, though, uses JDBC in two ways. It connects directly to the
> > > > database
> > > > > > thru JDBC to gather details about the table/query and generate the
> > > > mapred
> > > > > > job. It also uses the driver indirectly in the mapred tasks.
> > > > > >
> > > > > > We can use the -D
> > mapred.child.java.opts=3D"-Djava.security.egd=3D..."
> > > > > > opti=
> > > > > > on
> > > > > > of sqoop to tell the mapred tasks to use the option. However, I
> > haven't
> > > > > > found any way to tell sqoop to use it while connecting directly to
> > the
> > > > > > database. Because of this, it hangs soon after I start it and even
> > > > before
> > > > > > it kicks off the mapred job.
> > > > > >
> > > > > > I also tried to pass the java.security.egd option directly to sqoop
> > > > with
> > > > > > the following command line but it didn't work:
> > > > > >
> > > > > > -D java.security.egd=3D/dev/urandom -D mapred.child.java.opts=3D"-D
> > > > > > java.security.egd=3D/dev/urandom"
> > > > > >
> > > > > > I added the option to the java.security file, but for some reason
> > it
> > > > also
> > > > > > didn't sort effect for sqoop.
> > > > > >
> > > > > > Any idea on a workaround/solution for this?
> > > > > >
> > > > > > Thanks,
> > > > > > Andre
> > > > > >
> > > > > > --
> > > > > > André Araújo
> > > > > > Database Administrator / SDM
> > > > > > The Pythian Group - Australia - www.pythian.com
> > > > > >
> > > > > > Office (calls from within Australia): 1300 366 021 x1270****
> > > >
> > > > > > Office (international): +61 2 9191 7427 
> > > > > > x270<%2B61%202%209191%207427%20%20x270>*OR*
> > +1
> > > > 613 565 8696 x1270 <%2B1%20613%20565%208696%20%20%20x1270>****
> > > >
> > > > > > Mobile: +61 410 323 559
> > > > > > Fax: +61 2 9805 0544
> > > > > > IM: pythianaraujo @ AIM/MSN/Y! or [email protected] @ GTalk
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > André Araújo
> > > > > Database Administrator / SDM
> > > > > The Pythian Group - Australia - www.pythian.com
> > > > >
> > > > > Office (calls from within Australia): 1300 366 021 x1270****
> > > >
> > > > > Office (international): +61 2 9191 7427 
> > > > > x270<%2B61%202%209191%207427%20%20x270>*OR*
> > +1
> > > > 613 565 8696 x1270 <%2B1%20613%20565%208696%20%20%20x1270>****
> > > >
> > > > > Mobile: +61 410 323 559
> > > > > Fax: +61 2 9805 0544
> > > > > IM: pythianaraujo @ AIM/MSN/Y! or [email protected] @ GTalk
> > > > >
> > > > > --
> > > > >
> > > > >****
> > > >
> > > > > --
> > > > >
> > > > >
> > > > >****
> > > >
> > > >
> > > >
> > > > ****
> > > >
> > > > ** **
> > > >
> > > > -- ****
> > > >
> > > > André Araújo
> > > > Database Administrator / SDM
> > > > The Pythian Group - Australia - www.pythian.com****
> > > >
> > > > Office (calls from within Australia): 1300 366 021 x1270
> > > > Office (international): +61 2 9191 7427 x270 *OR* +1 613 565 8696
> > x1270
> > > > Mobile: +61 410 323 559
> > > > Fax: +61 2 9805 0544
> > > > IM: pythianaraujo @ AIM/MSN/Y! or [email protected] @ GTalk****
> > > >
> > > > ** **
> > > >
> > > > ** **
> > > >
> > > > --****
> > > >
> > > > ** **
> > > >
> > >
> > >
> > >
> > > --
> > > André Araújo
> > > Database Administrator / SDM
> > > The Pythian Group - Australia - www.pythian.com
> > >
> > > Office (calls from within Australia): 1300 366 021 x1270
> > > Office (international): +61 2 9191 7427  x270 *OR* +1 613 565 8696
> > x1270
> > > Mobile: +61 410 323 559
> > > Fax: +61 2 9805 0544
> > > IM: pythianaraujo @ AIM/MSN/Y! or [email protected] @ GTalk
> > >
> > > --
> > >
> > >
> > > --
> > >
> > >
> > >
> >
> 
> 
> 
> -- 
> André Araújo
> Database Administrator / SDM
> The Pythian Group - Australia - www.pythian.com
> 
> Office (calls from within Australia): 1300 366 021 x1270
> Office (international): +61 2 9191 7427  x270 *OR* +1 613 565 8696   x1270
> Mobile: +61 410 323 559
> Fax: +61 2 9805 0544
> IM: pythianaraujo @ AIM/MSN/Y! or [email protected] @ GTalk
> 
> -- 
> 
> 
> --
> 
> 
> 

Attachment: signature.asc
Description: Digital signature

Reply via email to