Sean Owen reassigned SPARK-28340:
---------------------------------

    Assignee: Colin Ma

Noisy exceptions when tasks are killed: "DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file: java.nio.channels.ClosedByInterruptException"
------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: SPARK-28340
                 URL: https://issues.apache.org/jira/browse/SPARK-28340
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.0.0
            Reporter: Josh Rosen
            Assignee: Colin Ma
            Priority: Minor

If a Spark task is killed while writing blocks to disk (due to an intentional job kill, automated killing of redundant speculative tasks, etc.), Spark may log exceptions like:

{code:java}
19/07/10 21:31:08 ERROR storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /<FILENAME>
java.nio.channels.ClosedByInterruptException
	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
	at sun.nio.ch.FileChannelImpl.truncate(FileChannelImpl.java:372)
	at org.apache.spark.storage.DiskBlockObjectWriter$$anonfun$revertPartialWritesAndClose$2.apply$mcV$sp(DiskBlockObjectWriter.scala:218)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1369)
	at org.apache.spark.storage.DiskBlockObjectWriter.revertPartialWritesAndClose(DiskBlockObjectWriter.scala:214)
	at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.stop(BypassMergeSortShuffleWriter.java:237)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:105)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
	at org.apache.spark.scheduler.Task.run(Task.scala:121)
	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}

If {{BypassMergeSortShuffleWriter}} is being used, a single cancelled task can result in hundreds of these stack traces being logged.

Here are some StackOverflow questions asking about this:
* [https://stackoverflow.com/questions/40027870/spark-jobserver-job-crash]
* [https://stackoverflow.com/questions/50646953/why-is-java-nio-channels-closedbyinterruptexceptio-called-when-caling-multiple]
* [https://stackoverflow.com/questions/41867053/java-nio-channels-closedbyinterruptexception-in-spark]
* [https://stackoverflow.com/questions/56845041/are-closedbyinterruptexception-exceptions-expected-when-spark-speculation-kills]

Can we prevent this exception from occurring? If not, can we treat this "expected exception" specially to avoid log spam? My concern is that large numbers of spurious exceptions confuse users who are inspecting Spark logs to diagnose other issues.
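A note on why this exception appears here at all: {{FileChannel}} is an interruptible channel, and Spark may kill a task by interrupting the task thread (depending on configuration), so the interrupt closes the channel and the in-flight (or next) channel operation, in this case the {{truncate()}} call inside {{revertPartialWritesAndClose}}, fails with {{ClosedByInterruptException}}. Below is a minimal, self-contained Scala sketch of that JDK behavior; it is illustrative only and uses no Spark code (all names are local to the demo):

{code:scala}
import java.io.File
import java.nio.ByteBuffer
import java.nio.channels.{ClosedByInterruptException, FileChannel}
import java.nio.file.StandardOpenOption

object ClosedByInterruptDemo {
  def main(args: Array[String]): Unit = {
    val file = File.createTempFile("interrupt-demo", ".bin")
    file.deleteOnExit()

    val writer = new Thread(() => {
      val channel = FileChannel.open(file.toPath, StandardOpenOption.WRITE)
      val buf = ByteBuffer.allocate(1024)
      try {
        // Keep issuing channel operations until the interrupt arrives.
        while (true) {
          buf.clear()
          channel.write(buf, 0L)
        }
      } catch {
        // AbstractInterruptibleChannel closes the channel and raises this
        // when the thread is interrupted during (or just before) an operation.
        case _: ClosedByInterruptException =>
          println("channel closed by interrupt, as in the stack trace above")
      }
    })

    writer.start()
    Thread.sleep(100)   // let a few writes happen first
    writer.interrupt()  // plays the role of the task kill
    writer.join()
  }
}
{code}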
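On the "treat this expected exception specially" option: one shape this could take is to catch {{ClosedByInterruptException}} around the revert path and demote it from ERROR to DEBUG, possibly only when the task is actually known to be killed (for example by consulting {{TaskContext.isInterrupted()}}). A rough sketch, assuming a plain slf4j logger (Spark itself logs through its internal Logging trait); {{quietlyRevert}} and everything else here is hypothetical, not an existing Spark API:

{code:scala}
import java.nio.channels.ClosedByInterruptException
import org.slf4j.LoggerFactory

object QuietRevert {
  private val log = LoggerFactory.getLogger(getClass)

  // Hypothetical wrapper around the revert logic: the interrupt-driven
  // exception is expected when a task is killed, so log it at DEBUG;
  // anything else still surfaces at ERROR, as it does today.
  def quietlyRevert(revert: => Unit): Unit = {
    try {
      revert
    } catch {
      case e: ClosedByInterruptException =>
        log.debug("Partial-write revert interrupted by task kill", e)
      case e: Exception =>
        log.error("Uncaught exception while reverting partial writes to file", e)
    }
  }
}
{code}

In {{DiskBlockObjectWriter}} this would wrap the {{truncate()}} call inside {{revertPartialWritesAndClose}}; whether to demote unconditionally, or only when the task is known to be interrupted, is exactly the judgment call this ticket is asking about.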