Hi , I am getting the following error while reading the huge data from S3 and
after processing ,writing data to S3 again.
Did you find any solution for this ?
16/02/07 07:41:59 WARN scheduler.TaskSetManager: Lost task 144.2 in stage
3.0 (TID 169, ip-172-31-7-26.us-west-2.compute.internal):
java.io.IOException: exception in uploadSinglePart
at
com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.uploadSinglePart(MultipartUploadOutputStream.java:248)
at
com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.close(MultipartUploadOutputStream.java:469)
at
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
at
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:105)
at
org.apache.hadoop.io.compress.CompressorStream.close(CompressorStream.java:106)
at java.io.FilterOutputStream.close(FilterOutputStream.java:160)
at
org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:109)
at
org.apache.spark.SparkHadoopWriter.close(SparkHadoopWriter.scala:102)
at
org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1080)
at
org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1059)
at
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: exception in putObject
at
com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.storeFile(Jets3tNativeFileSystemStore.java:149)
at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
at com.sun.proxy.$Proxy26.storeFile(Unknown Source)
at
com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.uploadSinglePart(MultipartUploadOutputStream.java:245)
... 15 more
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The
Content-MD5 you specified did not match what we received. (Service: Amazon
S3; Status Code: 400; Error Code: BadDigest; Request ID: 5918216A5901FCC8),
S3 Extended Request ID:
QSxtYln/yXqHYpdr4BWosin/TAFsGlK1FlKfE5PcuJkNrgoblGzTNt74kEhuNcrJCRZ3mXq0oUo=
at
com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
at
com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:770)
at
com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
at
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
at
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3796)
at
com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1482)
at
com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.storeFile(Jets3tNativeFileSystemStore.java:140)
... 22 more
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Bad-Digest-error-while-doing-aws-s3-put-tp10036p26167.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]