[ https://issues.apache.org/jira/browse/HADOOP-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White resolved HADOOP-4013.
-------------------------------

    Resolution: Duplicate

Duplicate of HADOOP-6254.
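
The underlying problem is that the S3 native filesystem's input stream propagates transient socket errors instead of retrying them, which is what the reporter suggests below and what HADOOP-6254 addresses. As a rough illustration of the retry-and-reopen idea (a minimal sketch only, not the actual HADOOP-6254 patch; the StreamOpener interface is a hypothetical stand-in for whatever can reopen the S3 object at a byte offset):

    import java.io.IOException;
    import java.io.InputStream;

    /**
     * Illustrative sketch only: an InputStream wrapper that retries a
     * failed read once by reopening the underlying stream at the current
     * position. StreamOpener is a hypothetical factory, not a Hadoop API.
     */
    public class RetryingInputStream extends InputStream {

        /** Hypothetical factory that can open the source at a byte offset. */
        public interface StreamOpener {
            InputStream open(long pos) throws IOException;
        }

        private final StreamOpener opener;
        private InputStream in;
        private long pos;

        public RetryingInputStream(StreamOpener opener) throws IOException {
            this.opener = opener;
            this.in = opener.open(0);
        }

        @Override
        public int read() throws IOException {
            try {
                int b = in.read();
                if (b >= 0) {
                    pos++;
                }
                return b;
            } catch (IOException e) {
                // Transient socket error: reopen at the current offset
                // and retry once. A second failure propagates to the caller.
                reopen();
                int b = in.read();
                if (b >= 0) {
                    pos++;
                }
                return b;
            }
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            try {
                int n = in.read(buf, off, len);
                if (n > 0) {
                    pos += n;
                }
                return n;
            } catch (IOException e) {
                reopen();
                int n = in.read(buf, off, len);
                if (n > 0) {
                    pos += n;
                }
                return n;
            }
        }

        private void reopen() throws IOException {
            try {
                in.close();
            } catch (IOException ignored) {
                // Best effort: the old connection is already broken.
            }
            in = opener.open(pos);
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }

A production fix would also cap the number of retries and back off between attempts; a single retry is shown here only to keep the sketch short.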

> SocketException with S3 native file system causes job to fail
> -------------------------------------------------------------
>
>                 Key: HADOOP-4013
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4013
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 0.18.0
>            Reporter: Karl Anderson
>
> I'm running Hadoop 0.18.0 with an Amazon S3 native filesystem input (an s3n 
> URL given for the input on the command line). Mapper tasks are dying, which 
> kills the job. The error is "java.net.SocketException: Connection reset". 
> I'm using streaming, but my code doesn't use any S3 classes itself; the 
> input reader handles S3 access for me. The traceback from the task details 
> and my invocation are appended below.
>
> Several mapper tasks complete before this happens, and I've had other jobs 
> work with input from smaller Amazon S3 buckets on the same account. So this 
> looks like a transient connectivity issue: the input reader should 
> recognize that it's calling a web service and retry.
>
> Traceback:
> java.net.SocketException: Connection reset
>       at java.net.SocketInputStream.read(SocketInputStream.java:168)
>       at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
>       at com.sun.net.ssl.internal.ssl.InputRecord.readV3Record(InputRecord.java:405)
>       at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:360)
>       at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:789)
>       at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:746)
>       at com.sun.net.ssl.internal.ssl.AppInputStream.read(AppInputStream.java:75)
>       at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>       at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>       at org.apache.commons.httpclient.ContentLengthInputStream.read(ContentLengthInputStream.java:169)
>       at java.io.FilterInputStream.read(FilterInputStream.java:116)
>       at org.apache.commons.httpclient.AutoCloseInputStream.read(AutoCloseInputStream.java:107)
>       at org.jets3t.service.io.InterruptableInputStream.read(InterruptableInputStream.java:72)
>       at org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.read(HttpMethodReleaseInputStream.java:123)
>       at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:98)
>       at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>       at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>       at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>       at java.io.DataInputStream.read(DataInputStream.java:132)
>       at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>       at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>       at org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamXmlRecordReader.java:248)
>       at org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchEnd(StreamXmlRecordReader.java:123)
>       at org.apache.hadoop.streaming.StreamXmlRecordReader.next(StreamXmlRecordReader.java:91)
>       at org.apache.hadoop.streaming.StreamXmlRecordReader.next(StreamXmlRecordReader.java:46)
>       at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:165)
>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:45)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>       at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
>
> Part of my Hadoop invocation (connection info censored, lots of -file 
> includes removed):
>
> hadoop jar /usr/local/hadoop-0.18.0/contrib/streaming/hadoop-0.18.0-streaming.jar \
>     -mapper ./spinn3r_vector_mapper.py \
>     -input s3n://<key:key>@<bucket>/ \
>     -output vectors \
>     -jobconf mapred.output.compress=false \
>     -inputreader org.apache.hadoop.streaming.StreamXmlRecordReader,begin=<item>,end=</item> \
>     -jobconf mapred.map.tasks=128 \
>     -jobconf mapred.reduce.tasks=0 [...]

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
