GitHub user sarutak opened a pull request:

    https://github.com/apache/spark/pull/5341

    spark-sql script ends up throwing Exception when event logging is enabled.

    When event logging is enabled, the spark-sql script ends up throwing an 
exception as follows.
    
    ```
    15/04/03 13:51:49 INFO handler.ContextHandler: stopped 
o.e.j.s.ServletContextHandler{/jobs,null}
    15/04/03 13:51:49 ERROR scheduler.LiveListenerBus: Listener 
EventLoggingListener threw an exception
    java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
        at 
org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
        at scala.Option.foreach(Option.scala:236)
        at 
org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
        at 
org.apache.spark.scheduler.EventLoggingListener.onApplicationEnd(EventLoggingListener.scala:188)
        at 
org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:54)
        at 
org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
        at 
org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
        at 
org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:53)
        at 
org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
        at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)
        at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1171)
        at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
    Caused by: java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:707)
        at 
org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1843)
        at 
org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1804)
        at 
org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:127)
        ... 17 more
    15/04/03 13:51:49 INFO ui.SparkUI: Stopped Spark web UI at 
http://sarutak-devel:4040
    15/04/03 13:51:49 INFO scheduler.DAGScheduler: Stopping DAGScheduler
    Exception in thread "Thread-6" java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:707)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1760)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1398)
        at 
org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:209)
        at 
org.apache.spark.SparkContext$$anonfun$stop$3.apply(SparkContext.scala:1408)
        at 
org.apache.spark.SparkContext$$anonfun$stop$3.apply(SparkContext.scala:1408)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.SparkContext.stop(SparkContext.scala:1408)
        at 
org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.stop(SparkSQLEnv.scala:66)
        at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$$anon$1.run(SparkSQLCLIDriver.scala:107)
    ```
    
    This happens because FileSystem#close races with the shutdown hook 
registered in SparkSQLCLIDriver, which calls SparkSQLEnv.stop():
    
    ```scala
        Runtime.getRuntime.addShutdownHook(
          new Thread() {
            override def run() {
              SparkSQLEnv.stop()
            }
          }
        )
    ```
    
    This issue was once resolved by SPARK-3062, but I think it was 
reintroduced by SPARK-2261.
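
    The underlying race can be illustrated with a minimal, self-contained 
sketch (hypothetical code, not Spark's: the names `ShutdownHookRaceDemo`, 
`fsOpen`, and `filesystemOpen` are invented for illustration). JVM shutdown 
hooks registered via `Runtime#addShutdownHook` are started concurrently in 
no guaranteed order, so a hook that closes a shared resource can run before 
(or during) another hook that still uses it:

    ```scala
    // Hypothetical sketch of the race: two JVM shutdown hooks whose relative
    // execution order at exit is unspecified.
    object ShutdownHookRaceDemo {
      @volatile private var fsOpen = true // stands in for the cached FileSystem

      // Exposed only so the state can be inspected before the JVM exits.
      def filesystemOpen: Boolean = fsOpen

      def main(args: Array[String]): Unit = {
        // Stand-in for Hadoop's hook that closes all cached FileSystem instances.
        Runtime.getRuntime.addShutdownHook(new Thread(() => { fsOpen = false }))
        // Stand-in for SparkSQLCLIDriver's hook calling SparkSQLEnv.stop(),
        // which still needs the FileSystem to flush and close the event log.
        Runtime.getRuntime.addShutdownHook(new Thread(() => {
          if (!fsOpen) Console.err.println("java.io.IOException: Filesystem closed")
        }))
        println("hooks registered; their relative order at JVM exit is unspecified")
      }
    }
    ```

    Whether the error message appears depends entirely on which hook the JVM 
happens to run first, which is the same nondeterminism behind the 
`Filesystem closed` exception above.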

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sarutak/spark SPARK-6690

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/5341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5341
    
----
commit d05f4d1f548e8af805bfe5febdd31d0ffafe2256
Author: Kousuke Saruta <[email protected]>
Date:   2015-04-03T05:37:24Z

    Fixed a race condition related to o.a.h.f.FileSystem

----

