[ https://issues.apache.org/jira/browse/SPARK-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved SPARK-18883.
------------------------------------
    Resolution: Won't Fix

I'm going to close this as a WONTFIX, because the solution is "don't use 
FileOutputCommitter to write to eventually consistent object stores". The 
committer expects newly created files and directories to be visible as soon 
as they are written, but under eventual consistency the directory listings 
it depends on can break that expectation.
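
Illustration only (the store, bucket and paths here are made up for this 
comment, not taken from the report): this is roughly the assumption that 
breaks.

{code}
// Hedged sketch: create-then-list against an eventually consistent store.
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(new URI("s3a://example-bucket/"), new Configuration())
val taskDir = new Path("/out/_temporary/0/task_000000")
fs.mkdirs(taskDir)                              // the write itself succeeds
val children = fs.listStatus(taskDir.getParent) // HDFS: always sees taskDir
// On an eventually consistent store this listing may not yet include
// taskDir, so FileOutputCommitter's commitJob() can miss task output or
// fail with FileNotFoundException on the _temporary directory.
{code}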

With HADOOP-13345/S3Guard turned on you don't get the listing inconsistency, 
but you still get awful commit times and weak failure recovery: the 
committer's other expectation, "rename() is fast and atomic", is also broken 
on object stores.

Best to wait for a version of the Hadoop JARs containing the HADOOP-13786 
committer, which is explicitly designed to work with S3.
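
Once those JARs are on the classpath, wiring the committer in should look 
something like the sketch below; the class names and keys come from the 
expected Spark cloud-integration bindings, so they may differ in the 
release you eventually pick up.

{code}
// Hedged sketch: binding Spark output to an S3A committer (HADOOP-13786).
// Assumes hadoop-aws plus the Spark hadoop-cloud module are on the classpath.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.hadoop.fs.s3a.committer.name", "directory")
  .config("spark.hadoop.mapreduce.outputcommitter.factory.scheme.s3a",
    "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory")
  .config("spark.sql.sources.commitProtocolClass",
    "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol")
  .config("spark.sql.parquet.output.committer.class",
    "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter")
  .getOrCreate()
{code}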

> FileNotFoundException on _temporary directory 
> ----------------------------------------------
>
>                 Key: SPARK-18883
>                 URL: https://issues.apache.org/jira/browse/SPARK-18883
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.2
>         Environment: We're on CDH 5.7, Hadoop 2.6.
>            Reporter: Mathieu DESPRIEE
>
> I'm experiencing the following exception, usually after some time under heavy load:
> {code}
> 16/12/15 11:25:18 ERROR InsertIntoHadoopFsRelationCommand: Aborting job.
> java.io.FileNotFoundException: File hdfs://nameservice1/user/xdstore/rfs/rfsDB/_temporary/0 does not exist.
>         at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:795)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:853)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:849)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:860)
>         at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
>         at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1557)
>         at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.getAllCommittedTaskPaths(FileOutputCommitter.java:291)
>         at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJobInternal(FileOutputCommitter.java:361)
>         at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:334)
>         at org.apache.parquet.hadoop.ParquetOutputCommitter.commitJob(ParquetOutputCommitter.java:46)
>         at org.apache.spark.sql.execution.datasources.BaseWriterContainer.commitJob(WriterContainer.scala:222)
>         at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:144)
>         at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>         at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>         at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>         at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
>         at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>         at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>         at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>         at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>         at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>         at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
>         at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
>         at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:525)
>         at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
>         at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:194)
>         at org.apache.spark.sql.DataFrameWriter.parquet(DataFrameWriter.scala:488)
>         at com.bluedme.woda.ng.indexer.RfsRepository.append(RfsRepository.scala:36)
>         at com.bluedme.woda.ng.indexer.RfsRepository.insert(RfsRepository.scala:23)
>         at com.bluedme.woda.cmd.ShareDatasetImpl.runImmediate(ShareDatasetImpl.scala:33)
>         at com.bluedme.woda.cmd.ShareDatasetImpl.runImmediate(ShareDatasetImpl.scala:13)
>         at com.bluedme.woda.cmd.ImmediateCommandImpl$$anonfun$run$1.apply(CommandImpl.scala:21)
>         at com.bluedme.woda.cmd.ImmediateCommandImpl$$anonfun$run$1.apply(CommandImpl.scala:21)
>         at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>         at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>         at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}
> Looks similar to [SPARK-18512], although it's not the same environment: no 
> streaming and no S3 here, and the final part of the stack trace differs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
