My guess is that your executor has already crashed, most likely due to OOM: a MetadataFetchFailedException ("Missing an output location for shuffle") usually means the executor that produced that shuffle output is no longer around. You should check the executor log; it may tell you more.
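If it is OOM, the trailing coalesce(1) is a likely cause: it funnels the entire sorted result through a single task, which can easily blow past a 3400MB executor. A rough sketch of an alternative, assuming you can live with multiple output files (spark, tablePath, col and path as in your snippet):

    // Read the parquet table as before.
    final Dataset<Row> df = spark.read().parquet(tablePath);

    // Let the sort keep its natural parallelism and write many part files,
    // so no single task has to hold the whole sorted column in memory.
    df.select(col)
      .na().drop()
      .dropDuplicates(col)
      .sort(df.col(col))
      .write()
      .mode(SaveMode.Ignore)
      .csv(path);

Since sort() range-partitions the data, the part files come out in globally sorted order; if you really need one file, merging them afterwards on the HDFS side (e.g. with hdfs dfs -getmerge) is much cheaper than forcing Spark to do it in a single task.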
Yong

________________________________
From: Rohit Verma <rohit.ve...@rokittech.com>
Sent: Thursday, March 9, 2017 4:41 AM
To: user
Subject: Spark failing while persisting sorted columns.

Hi all,

Please help me with the scenario below. While running the following query on a large dataset (rowCount = 100,000,000):

    // There are other instances of this job being submitted to Spark in a multithreaded app.
    final Dataset<Row> df = spark.read().parquet(tablePath);
    // df storage in HDFS is 5.64 GB with 45 blocks.
    df.select(col).na().drop().dropDuplicates(col).coalesce(20)
      .sort(df.col(col)).coalesce(1)
      .write().mode(SaveMode.Ignore).csv(path);

I am getting the exception below:

    Task failed while writing rows
        at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:261)
        at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
        at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
        at org.apache.spark.scheduler.Task.run(Task.scala:86)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 2991

Here are the Spark env details:

  * Cores in use: 20 Total, 0 Used
  * Memory in use: 72.2 GB Total, 0.0 B Used

And the process configuration is:

    "spark.cores.max", "20"
    "spark.executor.memory", "3400MB"
    "spark.kryoserializer.buffer.max", "1000MB"

Any leads would be highly appreciated.

Regards
Rohit Verma