[ https://issues.apache.org/jira/browse/HADOOP-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642770#action_12642770 ]
Elias Torres commented on HADOOP-4013:
--------------------------------------
Tom,
Do you know if anybody is going to follow up on this issue? Let me know if I
should work on a patch. I think I see what you're saying.
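
For concreteness, here is the kind of retry wrapper I have in mind. It's a
minimal sketch, assuming a hypothetical StreamOpener callback that reopens the
S3 object at a given offset; it only illustrates retrying on a transient
SocketException and is not an actual patch against NativeS3FileSystem:

import java.io.IOException;
import java.io.InputStream;
import java.net.SocketException;

public class RetryingS3InputStream extends InputStream {

  /** Hypothetical callback that opens the underlying S3 object at an offset. */
  public interface StreamOpener {
    InputStream open(long offset) throws IOException;
  }

  private final StreamOpener opener;
  private final int maxRetries;
  private InputStream in;
  private long pos;

  public RetryingS3InputStream(StreamOpener opener, int maxRetries)
      throws IOException {
    this.opener = opener;
    this.maxRetries = maxRetries;
    this.in = opener.open(0);
  }

  @Override
  public int read() throws IOException {
    for (int attempt = 0; ; attempt++) {
      try {
        int b = in.read();
        if (b >= 0) {
          pos++;  // track position so we can resume after a reconnect
        }
        return b;
      } catch (SocketException e) {
        // Transient network failure: drop the dead connection and
        // reopen the object at the current position, up to maxRetries.
        if (attempt >= maxRetries) {
          throw e;
        }
        try { in.close(); } catch (IOException ignored) { }
        in = opener.open(pos);
      }
    }
  }

  @Override
  public void close() throws IOException {
    in.close();
  }
}

The real fix would presumably live inside NativeS3FsInputStream.read() (line 98
of NativeS3FileSystem.java in the trace below), seeking back to the recorded
position after reconnecting.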
> SocketException with S3 native file system causes job to fail
> -------------------------------------------------------------
>
> Key: HADOOP-4013
> URL: https://issues.apache.org/jira/browse/HADOOP-4013
> Project: Hadoop Core
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 0.18.0
> Reporter: Karl Anderson
>
> I'm running Hadoop 0.18.0 with an Amazon S3 native filesystem input (s3n URL
> given for input on the command line). I'm having mapper tasks die, which is
> killing the job. The error is "java.net.SocketException: Connection reset".
> I'm using streaming, but my code isn't using any S3 classes itself; this is
> being done for me by the input reader. The traceback from the task details
> and my invocation are appended.
> Several mapper tasks complete before this happens, and I've had other jobs
> work with input from smaller Amazon S3 buckets for the same account. So this
> looks like a connectivity issue, where the input reader should realize that
> it's calling a web service and try again.
> Traceback:
> java.net.SocketException: Connection reset
>         at java.net.SocketInputStream.read(SocketInputStream.java:168)
>         at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
>         at com.sun.net.ssl.internal.ssl.InputRecord.readV3Record(InputRecord.java:405)
>         at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:360)
>         at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:789)
>         at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:746)
>         at com.sun.net.ssl.internal.ssl.AppInputStream.read(AppInputStream.java:75)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>         at org.apache.commons.httpclient.ContentLengthInputStream.read(ContentLengthInputStream.java:169)
>         at java.io.FilterInputStream.read(FilterInputStream.java:116)
>         at org.apache.commons.httpclient.AutoCloseInputStream.read(AutoCloseInputStream.java:107)
>         at org.jets3t.service.io.InterruptableInputStream.read(InterruptableInputStream.java:72)
>         at org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.read(HttpMethodReleaseInputStream.java:123)
>         at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:98)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>         at java.io.DataInputStream.read(DataInputStream.java:132)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at org.apache.hadoop.streaming.StreamXmlRecordReader.fastReadUntilMatch(StreamXmlRecordReader.java:248)
>         at org.apache.hadoop.streaming.StreamXmlRecordReader.readUntilMatchEnd(StreamXmlRecordReader.java:123)
>         at org.apache.hadoop.streaming.StreamXmlRecordReader.next(StreamXmlRecordReader.java:91)
>         at org.apache.hadoop.streaming.StreamXmlRecordReader.next(StreamXmlRecordReader.java:46)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:165)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:45)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
> Part of my Hadoop invocation (connection info censored, many -file
> includes removed):
> hadoop jar /usr/local/hadoop-0.18.0/contrib/streaming/hadoop-0.18.0-streaming.jar \
>     -mapper ./spinn3r_vector_mapper.py \
>     -input s3n://<key:key>@<bucket>/ \
>     -output vectors \
>     -jobconf mapred.output.compress=false \
>     -inputreader org.apache.hadoop.streaming.StreamXmlRecordReader,begin=<item>,end=</item> \
>     -jobconf mapred.map.tasks=128 \
>     -jobconf mapred.reduce.tasks=0 [...]