[ https://issues.apache.org/jira/browse/HADOOP-17954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-17954.
-------------------------------------
    Resolution: Cannot Reproduce

> org.apache.spark.SparkException: Task failed while writing rows S3
> ------------------------------------------------------------------
>
>                 Key: HADOOP-17954
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17954
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common
>    Affects Versions: 2.6.0
>            Reporter: sudarshan
>            Priority: Major
>
> I am running a Spark (1.6.0) job that reads rows from HBase, applies some
> transformations, and finally writes the results to S3.
> The job fails intermittently with a timeout error: the tasks are able to
> write their data to S3, but the job fails at the last stage, when the output
> is being closed. The issue is intermittent, but I see this error most of the
> time.
> 
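> For context, the write path is essentially the following. This is a minimal
> sketch, not the actual job: the input load and the bucket name are
> hypothetical placeholders, and only the partition columns are taken from the
> year=/month=/submitDate= layout of the task attempt path in the trace below.
> {code:scala}
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.{DataFrame, SQLContext}
> 
> object HbaseToS3 {
>   def main(args: Array[String]): Unit = {
>     val sc = new SparkContext(new SparkConf().setAppName("hbase-to-s3"))
>     val sqlContext = new SQLContext(sc)
> 
>     // The real job scans HBase and transforms the rows; the connector code
>     // is elided here and replaced by a placeholder input.
>     val df: DataFrame = sqlContext.read.parquet("hdfs:///tmp/staged-input")
> 
>     // The failing step: a dynamic-partitioned Parquet write to S3.
>     df.write
>       .partitionBy("year", "month", "submitDate")
>       .parquet("s3a://example-bucket/common/hbaseHistory/metadataSept100621")
>   }
> }
> {code}
> Here are the full error details: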
> {code:java}
> Job aborted due to stage failure: Task 1074 in stage 1.0 failed 4 times, most recent failure: Lost task 1074.3 in stage 1.0 (TID 2162, abcd.ecom.bigdata.int.abcd.com, executor 18): org.apache.spark.SparkException: Task failed while writing rows
>   at org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:417)
>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:148)
>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:148)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.fs.s3a.AWSS3IOException: saving output on common/hbaseHistory/metadataSept100621/_temporary/_attempt_202110060911_0001_m_001074_3/year=2021/month=09/submitDate=2021-09-08T04%3a58%3a47Z/part-r-01074-205c8b21-7840-4985-bb0e-65ed787c337d.snappy.parquet: com.cloudera.com.amazonaws.services.s3.model.AmazonS3Exception: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: 5J85XRNF10W16ZJS), S3 Extended Request ID: 4g08KHEDbFs5jueJqt9Snw7Xlmw5VeS1eCtJyAzp0fzHGinMhBntwIEhddJP7LLaS0teR3EAuOI=: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: 5J85XRNF10W16ZJS)
>   at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:143)
>   at org.apache.hadoop.fs.s3a.S3AOutputStream.close(S3AOutputStream.java:123)
>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>   at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
>   at parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:470)
>   at parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:112)
>   at parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:112)
>   at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.close(ParquetRelation.scala:101)
>   at org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer$$anonfun$writeRows$4.apply$mcV$sp(WriterContainer.scala:387)
>   at org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer$$anonfun$writeRows$4.apply(WriterContainer.scala:343)
>   at org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer$$anonfun$writeRows$4.apply(WriterContainer.scala:343)
>   at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1278)
>   at org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer.writeRows(WriterContainer.scala:409)
>   ... 8 more
>   Suppressed: java.lang.NullPointerException
>     at parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:152)
>     at parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:111)
>     at parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:112)
>     at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.close(ParquetRelation.scala:101)
>     at org.apache.spark.sql.execution.datasources.DynamicPartitionWriterContainer$$anonfun$writeRows$5.apply$mcV$sp(WriterContainer.scala:411)
>     at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1287)
>     ... 9 more
> Caused by: com.cloudera.com.amazonaws.services.s3.model.AmazonS3Exception: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout; Request ID: 5J85XRNF10W16ZJS), S3 Extended Request ID: 4g08KHEDbFs5jueJqt9Snw7Xlmw5VeS1eCtJyAzp0fzHGinMhBntwIEhddJP7LLaS0teR3EAuOI=
>   at com.cloudera.com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
>   at com.cloudera.com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:770)
>   at com.cloudera.com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
>   at com.cloudera.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
>   at com.cloudera.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
>   at com.cloudera.com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1472)
>   at com.cloudera.com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
>   at com.cloudera.com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
>   at com.cloudera.com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
>   at com.cloudera.com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   ... 3 more
> Driver stacktrace:
>  
> {code}
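> The nested traces show where this dies: ParquetFileWriter.end() closes the
> FSDataOutputStream, S3AOutputStream.close() then uploads the whole file in a
> single putObject call (see UploadCallable.uploadInOneChunk), and S3 rejects
> the request with 400 RequestTimeout when that single large upload leaves the
> socket unread for too long. Below is a hedged sketch of S3A tuning that
> typically avoids the single-PUT-on-close pattern; note that
> fs.s3a.fast.upload only exists on Hadoop 2.7+, so that setting is an
> assumption given the affected version here is 2.6.0. Values are illustrative.
> {code:scala}
> // Sketch only: set on the SparkContext's Hadoop configuration before the write runs.
> sc.hadoopConfiguration.set("fs.s3a.fast.upload", "true")            // upload incrementally instead of one PUT in close() (Hadoop 2.7+)
> sc.hadoopConfiguration.set("fs.s3a.multipart.threshold", "134217728") // force multipart for files over 128 MB, even on the classic stream
> sc.hadoopConfiguration.set("fs.s3a.multipart.size", "67108864")     // 64 MB multipart part size
> sc.hadoopConfiguration.set("fs.s3a.connection.timeout", "200000")   // socket timeout, in milliseconds
> sc.hadoopConfiguration.set("fs.s3a.attempts.maximum", "20")         // retry transient failures more aggressively
> {code}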



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
