[
https://issues.apache.org/jira/browse/HADOOP-14161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904888#comment-15904888
]
Steve Loughran commented on HADOOP-14161:
-----------------------------------------
well, you know my stance on direct writes to S3: "don't". Wait for HADOOP-13786.
That message is being thrown because rename() returned false:
{code}
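// From FileOutputCommitter.mergePaths(): all the committer gets back from
// rename() is a boolean, so this exception can't say *why* the rename failed.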
if (!fs.rename(from.getPath(), to)) {
throw new IOException("Failed to rename " + from + " to " + to);
}
{code}
The fact that rename() doesn't return anything meaningful on a failure, other
than a "worked/didn't work" flag, is just one of the many issues with rename;
another being "nobody quite knows what it is meant to do". There's an open
JIRA on making the extended rename call public, which does throw exceptions:
I'd like to do that and switch to it within the Hadoop codebase at the very
least.
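Roughly the shape I have in mind, sketched against the public FileSystem API
(the helper name and the cause-probing here are illustrative, not the actual
patch):
{code}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileAlreadyExistsException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sketch only: a rename that raises instead of returning false,
// probing for the common causes so the exception actually says something.
public final class Renames {
  public static void renameOrThrow(FileSystem fs, Path src, Path dst)
      throws IOException {
    if (fs.rename(src, dst)) {
      return;
    }
    if (!fs.exists(src)) {
      throw new FileNotFoundException("rename source not found: " + src);
    }
    if (fs.exists(dst)) {
      throw new FileAlreadyExistsException("rename destination exists: " + dst);
    }
    throw new IOException("failed to rename " + src + " to " + dst);
  }
}
{code}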
I suspect what's happening is that rename is failing because there is something
at the far end, and even after a DELETE call, a HEAD request is finding it, so
the operation is rejected.
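To check that theory directly, a quick probe around the failing rename, purely
illustrative, would be something like:
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative diagnostic only: ask S3 what it claims about both paths.
// On an eventually-consistent store the destination can keep answering
// HEAD for a while after a DELETE, which makes rename() return false.
public class RenameProbe {
  public static void main(String[] args) throws IOException {
    Path src = new Path("s3a://foo/_temporary/0/_temporary/"
        + "attempt_201703081855_0018_m_000966_0/"
        + "part-r-00966-615ed714-58c1-4b89-be56-e47966737c75.snappy.parquet");
    Path dst = new Path(
        "s3a://foo/part-r-00966-615ed714-58c1-4b89-be56-e47966737c75.snappy.parquet");
    FileSystem fs = src.getFileSystem(new Configuration());
    System.out.println("src exists? " + fs.exists(src));  // expect true
    System.out.println("dst exists? " + fs.exists(dst));  // true => rename rejected
    System.out.println("rename returned " + fs.rename(src, dst));
  }
}
{code}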
Try cranking up the debug logging in the s3a module (see the log4j snippet
below) and see what it says; attach it here and I'll see what can be done.
Renaming the JIRA to describe the problem better.
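For reference, in log4j.properties that's usually just the following (assuming
log4j 1.x, the default in Hadoop 2.x; adjust for your logging backend):
{code}
# Verbose S3A client logging
log4j.logger.org.apache.hadoop.fs.s3a=DEBUG
# Optionally the AWS SDK's own wire-level chatter as well
log4j.logger.com.amazonaws=DEBUG
{code}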
> Failed to rename S3AFileStatus
> ------------------------------
>
> Key: HADOOP-14161
> URL: https://issues.apache.org/jira/browse/HADOOP-14161
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.7.0, 2.7.1, 2.7.2, 2.7.3
> Environment: spark 2.0.2 with mesos
> hadoop 2.7.2
> Reporter: Luke Miner
>
> I'm getting non-deterministic rename errors while writing to S3 using Spark
> and Hadoop. The proper permissions are set and this only happens
> occasionally. It can happen on a job as simple as reading in JSON,
> repartitioning, and then writing out (sketched below). After this failure
> occurs, the overall job hangs indefinitely.
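> For reference, the job shape is essentially this (paths and partition count
> are placeholders, not the real job):
> {code}
> import org.apache.spark.sql.SparkSession;
>
> // Hypothetical minimal reproduction of the pattern described above:
> // read JSON, repartition, write parquet to S3 via s3a.
> public class Repro {
>   public static void main(String[] args) {
>     SparkSession spark = SparkSession.builder().appName("repro").getOrCreate();
>     spark.read().json("s3a://foo/input/")
>         .repartition(1000)
>         .write().parquet("s3a://foo/output/");
>     spark.stop();
>   }
> }
> {code}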
> {code}
> org.apache.spark.SparkException: Task failed while writing rows
>     at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:261)
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>     at org.apache.spark.scheduler.Task.run(Task.scala:86)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Failed to commit task
>     at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.org$apache$spark$sql$execution$datasources$DefaultWriterContainer$$commitTask$1(WriterContainer.scala:275)
>     at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply$mcV$sp(WriterContainer.scala:257)
>     at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:252)
>     at org.apache.spark.sql.execution.datasources.DefaultWriterContainer$$anonfun$writeRows$1.apply(WriterContainer.scala:252)
>     at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1348)
>     at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:258)
>     ... 8 more
> Caused by: java.io.IOException: Failed to rename S3AFileStatus{path=s3a://foo/_temporary/0/_temporary/attempt_201703081855_0018_m_000966_0/part-r-00966-615ed714-58c1-4b89-be56-e47966737c75.snappy.parquet; isDirectory=false; length=111225342; replication=1; blocksize=33554432; modification_time=1488999342000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to s3a://foo/part-r-00966-615ed714-58c1-4b89-be56-e47966737c75.snappy.parquet
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:415)
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:428)
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:539)
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:502)
>     at org.apache.spark.mapred.SparkHadoopMapRedUtil$.performCommit$1(SparkHadoopMapRedUtil.scala:50)
>     at org.apache.spark.mapred.SparkHadoopMapRedUtil$.commitTask(SparkHadoopMapRedUtil.scala:76)
>     at org.apache.spark.sql.execution.datasources.BaseWriterContainer.commitTask(WriterContainer.scala:211)
>     at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.org$apache$spark$sql$execution$datasources$DefaultWriterContainer$$commitTask$1(WriterContainer.scala:270)
>     ... 13 more
> {code}