[ https://issues.apache.org/jira/browse/SPARK-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559378#comment-14559378 ]

Marcelo Vanzin commented on SPARK-7865:
---------------------------------------

Just for completeness, he's reporting the issue against Hadoop 1. There's no 
clean way to fix this on Hadoop 1, so we chose not to address it there. I'd 
really recommend upgrading to Hadoop 2; it's been out for a long while now.
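
For context, here is a minimal Scala sketch (not part of the ticket; the hook bodies are illustrative stand-ins) of why the ordering cannot be controlled on Hadoop 1: the HDFS client closes all cached filesystems from a plain JVM shutdown hook, and plain hooks run in no guaranteed order relative to Spark's own stop hook.

{noformat}
// Illustrative stand-ins only: neither hook body is the real Hadoop or Spark code.
object ShutdownOrderSketch {
  def main(args: Array[String]): Unit = {
    // Stand-in for Hadoop 1's FileSystem shutdown hook, which closes every
    // cached FileSystem instance when the JVM exits.
    Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
      override def run(): Unit = println("hadoop: closing all cached FileSystems")
    }))

    // Stand-in for Spark's shutdown hook, which stops the SparkContext and
    // flushes the event log through one of those cached FileSystems.
    Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
      override def run(): Unit = println("spark: flushing event log, stopping SparkContext")
    }))

    // The JVM runs plain shutdown hooks in an unspecified order (and possibly
    // concurrently), so the Spark hook can run after HDFS is already closed,
    // producing the "Filesystem closed" errors quoted below. Hadoop 2 adds
    // org.apache.hadoop.util.ShutdownHookManager, which accepts a priority,
    // and that is what makes the ordering fixable there.
  }
}
{noformat}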

> Hadoop Filesystem for eventlog closed before sparkContext stopped
> -----------------------------------------------------------------
>
>                 Key: SPARK-7865
>                 URL: https://issues.apache.org/jira/browse/SPARK-7865
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.4.0
>            Reporter: Zhang, Liye
>
> After [SPARK-3090|https://issues.apache.org/jira/browse/SPARK-3090] (patch 
> [#5696|https://github.com/apache/spark/pull/5696]), the SparkContext is 
> stopped automatically if the user forgets to stop it.
> When the shutdown hook is invoked, the EventLoggingListener throws the 
> following exception while flushing its content:
> {noformat}
> 15/05/26 17:40:38 INFO spark.SparkContext: Invoking stop() from shutdown hook
> 15/05/26 17:40:38 ERROR scheduler.LiveListenerBus: Listener EventLoggingListener threw an exception
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
>         at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
>         at org.apache.spark.scheduler.EventLoggingListener.onApplicationEnd(EventLoggingListener.scala:188)
>         at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:54)
>         at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
>         at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
>         at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:56)
>         at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
>         at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)
>         at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1180)
>         at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
> Caused by: java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:323)
>         at org.apache.hadoop.hdfs.DFSClient.access$1200(DFSClient.java:78)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3877)
>         at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>         ... 16 more
> {noformat}
> And the following exception while stopping:
> {noformat}
> 15/05/26 17:40:39 INFO cluster.SparkDeploySchedulerBackend: Asking each executor to shut down
> 15/05/26 17:40:39 ERROR util.Utils: Uncaught exception in thread Spark Shutdown Hook
> java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:323)
>         at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1057)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:554)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:788)
>         at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:209)
>         at org.apache.spark.SparkContext$$anonfun$stop$5.apply(SparkContext.scala:1515)
>         at org.apache.spark.SparkContext$$anonfun$stop$5.apply(SparkContext.scala:1515)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.SparkContext.stop(SparkContext.scala:1515)
>         at org.apache.spark.SparkContext$$anonfun$3.apply$mcV$sp(SparkContext.scala:527)
>         at org.apache.spark.util.SparkShutdownHook.run(Utils.scala:2211)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Utils.scala:2181)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2181)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2181)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1732)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(Utils.scala:2181)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2181)
>         at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2181)
>         at scala.util.Try$.apply(Try.scala:161)
>         at org.apache.spark.util.SparkShutdownHookManager.runAll(Utils.scala:2181)
>         at org.apache.spark.util.SparkShutdownHookManager$$anon$6.run(Utils.scala:2163)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The Hadoop version is 1.2.1. I'm wondering how the Hadoop filesystem got 
> closed when Spark never explicitly calls the close() API.
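
Regarding that last question, a minimal Scala sketch (not from the ticket; the NameNode address is a placeholder) of the likely mechanism: FileSystem.get() returns a shared, cached client per scheme/authority/user, and the HDFS client registers its own JVM shutdown hook that closes every cached instance, so the stream held by EventLoggingListener is closed underneath it even though Spark never calls close().

{noformat}
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

object SharedFsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    val uri  = new URI("hdfs://namenode:8020")   // placeholder cluster address

    // Both calls return the same cached instance.
    val fsA = FileSystem.get(uri, conf)   // e.g. what the event log writer holds
    val fsB = FileSystem.get(uri, conf)   // e.g. what a shutdown hook obtains
    println(fsA eq fsB)                   // prints: true

    // Closing the shared instance anywhere (Hadoop's own shutdown hook does
    // this for every cached FileSystem) invalidates it everywhere; any later
    // use throws java.io.IOException: Filesystem closed, as in the traces above.
    fsB.close()
  }
}
{noformat}

One known workaround is to disable the cache for the scheme (fs.hdfs.impl.disable.cache=true), at the cost of one client per get() call; the cleaner fix is the hook ordering that Hadoop 2's ShutdownHookManager allows.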


