GitHub user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/204#issuecomment-39454420
First of all, I would like to say great work. Having this history server is
awesome!
I'm trying it out on a non-secure YARN cluster, but when I run something
that accesses HDFS (SparkHdfsLR), the application ends up failing because the
HDFS filesystem has already been closed by the time it goes to logEvent. I'll
file a separate JIRA for that since it's not directly related to this PR. It
also causes exceptions in the history server, though: once the server sees
something it doesn't like, it throws an exception and stops reading the
directories (even the good ones):
14/04/03 14:01:18 ERROR history.HistoryServer: Unable to synchronize HistoryServer with files on disk:
java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:398)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1284)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1269)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:358)
    at org.apache.spark.deploy.history.HistoryServer.org$apache$spark$deploy$history$HistoryServer$$getModificationTime(HistoryServer.scala:171)
    at org.apache.spark.deploy.history.HistoryServer$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(HistoryServer.scala:107)
    at org.apache.spark.deploy.history.HistoryServer$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(HistoryServer.scala:104)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
    at org.apache.spark.deploy.history.HistoryServer$$anonfun$1.apply$mcV$sp(HistoryServer.scala:104)
    at org.apache.spark.deploy.history.HistoryServer$$anonfun$1.apply(HistoryServer.scala:99)
    at org.apache.spark.deploy.history.HistoryServer$$anonfun$1.apply(HistoryServer.scala:99)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
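
To illustrate the behavior I'd expect instead, here is a rough, self-contained
sketch of scanning directories so that one unreadable entry is skipped and
reported rather than aborting the whole pass. This is not the HistoryServer
code: DirScanSketch, scan, and the stubbed getModificationTime below are
hypothetical names, and the stub only simulates an IOException rather than
talking to HDFS. It also would not help if the shared filesystem handle itself
is closed for every directory; it only keeps a single bad entry from hiding the
good ones.

```scala
import java.io.IOException
import scala.util.{Failure, Success, Try}

object DirScanSketch {
  // Hypothetical stand-in for the per-directory lookup. The real server
  // would call into HDFS here; this stub just fails for "bad" directories.
  def getModificationTime(dir: String): Long =
    if (dir.contains("bad")) throw new IOException("Filesystem closed")
    else dir.length.toLong

  // Visit every directory, wrapping each lookup in Try so that a failure
  // is recorded for that directory only and the scan continues.
  def scan(dirs: Seq[String]): (Map[String, Long], Map[String, Throwable]) = {
    val results = dirs.map(d => d -> Try(getModificationTime(d)))
    val ok = results.collect { case (d, Success(t)) => d -> t }.toMap
    val failed = results.collect { case (d, Failure(e)) => d -> e }.toMap
    (ok, failed)
  }

  def main(args: Array[String]): Unit = {
    val (ok, failed) = scan(Seq("app-1", "app-bad", "app-22"))
    println(s"loaded: ${ok.keys.toSeq.sorted.mkString(", ")}")
    failed.foreach { case (d, e) => println(s"skipped $d: ${e.getMessage}") }
  }
}
```

With something along these lines, the good directories would still show up in
the UI and the failures could be logged individually.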