GitHub user sarutak opened a pull request:
https://github.com/apache/spark/pull/5341
spark-sql script ends up throwing Exception when event logging is enabled.
When event logging is enabled, the spark-sql script ends up throwing an
exception like the following:
```
15/04/03 13:51:49 INFO handler.ContextHandler: stopped o.e.j.s.ServletContextHandler{/jobs,null}
15/04/03 13:51:49 ERROR scheduler.LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener.onApplicationEnd(EventLoggingListener.scala:188)
    at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:54)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:53)
    at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1171)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:707)
    at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1843)
    at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1804)
    at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:127)
    ... 17 more
15/04/03 13:51:49 INFO ui.SparkUI: Stopped Spark web UI at http://sarutak-devel:4040
15/04/03 13:51:49 INFO scheduler.DAGScheduler: Stopping DAGScheduler
Exception in thread "Thread-6" java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:707)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1760)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1398)
    at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:209)
    at org.apache.spark.SparkContext$$anonfun$stop$3.apply(SparkContext.scala:1408)
    at org.apache.spark.SparkContext$$anonfun$stop$3.apply(SparkContext.scala:1408)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1408)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.stop(SparkSQLEnv.scala:66)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$$anon$1.run(SparkSQLCLIDriver.scala:107)
```
This is because FileSystem#close is called by the shutdown hook registered
in SparkSQLCLIDriver.
```
Runtime.getRuntime.addShutdownHook(
  new Thread() {
    override def run() {
      SparkSQLEnv.stop()
    }
  }
)
```
This issue was resolved by SPARK-3062, but I think it was reintroduced by
SPARK-2261.
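
For reviewers, here is a minimal sketch of the ordering problem as I
understand it. It is an illustration only, not the change in this pull
request. The JVM starts registered shutdown hooks in an unspecified order,
so a hook that stops the event logger can run after the filesystem has
already been closed. FakeFileSystem, FakeEventLogger, ShutdownRaceSketch and
the path "/tmp/event-log" are hypothetical stand-ins, not Spark or Hadoop
classes, and the two "hooks" are run sequentially to make each ordering
reproducible.

```scala
import java.io.IOException
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for a shared filesystem handle.
final class FakeFileSystem {
  private val closed = new AtomicBoolean(false)

  def close(): Unit = closed.set(true)

  // Fails the same way the trace above does once the handle is closed.
  def exists(path: String): Boolean = {
    if (closed.get()) throw new IOException("Filesystem closed")
    true
  }
}

// Hypothetical stand-in for a listener that touches the filesystem on stop.
final class FakeEventLogger(fs: FakeFileSystem) {
  def stop(): Unit = {
    fs.exists("/tmp/event-log") // hypothetical path
    ()
  }
}

object ShutdownRaceSketch {
  // Emulates the two possible hook orderings and returns the error message
  // if stopping the logger failed, or None if shutdown was clean.
  def run(fsClosesFirst: Boolean): Option[String] = {
    val fs = new FakeFileSystem
    val logger = new FakeEventLogger(fs)
    val closeFs: () => Unit = () => fs.close()
    val stopLogger: () => Unit = () => logger.stop()
    val hooks =
      if (fsClosesFirst) Seq(closeFs, stopLogger) else Seq(stopLogger, closeFs)
    try {
      hooks.foreach(hook => hook())
      None
    } catch {
      case e: IOException => Some(e.getMessage)
    }
  }
}
```

Calling ShutdownRaceSketch.run(fsClosesFirst = true) reproduces the
"Filesystem closed" failure, while run(fsClosesFirst = false) shuts down
cleanly. That is why the exception depends on which hook happens to run
first.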
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sarutak/spark SPARK-6690
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/5341.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5341
----
commit d05f4d1f548e8af805bfe5febdd31d0ffafe2256
Author: Kousuke Saruta <[email protected]>
Date: 2015-04-03T05:37:24Z
Fixed a race condition related to o.a.h.f.FileSystem
----