Github user squito commented on the pull request:
https://github.com/apache/spark/pull/6935#issuecomment-115849879
Sorry, I didn't see that comment about the tests ... I wrote something up
which mimics the existing `UISeleniumSuite`.
```diff
From 2dddcd036a4cacf31d1ff5f15417fcdbcfdbc22e Mon Sep 17 00:00:00 2001
From: Imran Rashid <[email protected]>
Date: Fri, 26 Jun 2015 14:16:16 -0500
Subject: [PATCH] failing test case

---
 .../spark/deploy/history/HistoryServerSuite.scala | 74 +++++++++++++++++++++-
 1 file changed, 72 insertions(+), 2 deletions(-)

diff --git a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
index e5b5e1b..a288490 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
@@ -21,14 +21,24 @@ import java.net.{HttpURLConnection, URL}
 import java.util.zip.ZipInputStream
 import javax.servlet.http.{HttpServletRequest, HttpServletResponse}
 
+import scala.concurrent.duration._
+import scala.language.postfixOps
+
 import com.google.common.base.Charsets
 import com.google.common.io.{ByteStreams, Files}
 import org.apache.commons.io.{FileUtils, IOUtils}
+import org.json4s.JsonAST.JArray
+import org.json4s.jackson.JsonMethods._
 import org.mockito.Mockito.when
+import org.openqa.selenium.WebDriver
+import org.openqa.selenium.htmlunit.HtmlUnitDriver
+import org.scalatest.concurrent.Eventually
+import org.scalatest.selenium.WebBrowser
 import org.scalatest.{BeforeAndAfter, Matchers}
 import org.scalatest.mock.MockitoSugar
 
-import org.apache.spark.{JsonTestUtils, SecurityManager, SparkConf, SparkFunSuite}
+import org.apache.spark.util.Utils
+import org.apache.spark._
 import org.apache.spark.ui.SparkUI
 
 /**
@@ -43,7 +53,7 @@ import org.apache.spark.ui.SparkUI
  * are considered part of Spark's public api.
  */
 class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with Matchers with MockitoSugar
-  with JsonTestUtils {
+  with JsonTestUtils with Eventually with WebBrowser {
 
   private val logDir = new File("src/test/resources/spark-events")
   private val expRoot = new File("src/test/resources/HistoryServerExpectations/")
@@ -264,6 +274,66 @@ class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with Matchers
     justHrefs should contain(link)
   }
 
+  test("incomplete apps get refreshed") {
+
+    implicit val webDriver: WebDriver = new HtmlUnitDriver
+    implicit val formats = org.json4s.DefaultFormats
+
+    val testDir = Utils.createTempDir()
+    testDir.deleteOnExit()
+
+    val myConf = new SparkConf()
+      .set("spark.history.fs.logDirectory", testDir.getAbsolutePath)
+      .set("spark.eventLog.dir", testDir.getAbsolutePath)
+      .set("spark.history.fs.update.interval", "1")
+      .set("spark.eventLog.enabled", "true")
+      .set("spark.history.cache.interval", "1")
+      .remove("spark.testing")
+    val provider = new FsHistoryProvider(myConf)
+    val securityManager = new SecurityManager(myConf)
+
+    val server = new HistoryServer(myConf, provider, securityManager, 18080)
+    val sc = new SparkContext("local", "test", myConf)
+    server.initialize()
+    server.bind()
+    val port = server.boundPort
+    try {
+      val d = sc.parallelize(1 to 10)
+      d.count()
+      val appId = eventually(timeout(20 seconds), interval(100 milliseconds)) {
+        val json = getContentAndCode("applications", port)._2.get
+        val apps = parse(json).asInstanceOf[JArray].arr
+        apps.size should be (1)
+        (apps(0) \ "id").extract[String]
+      }
+
+      def getNumJobs(suffix: String): Int = {
+        go to (s"http://localhost:$port/history/$appId$suffix")
+        findAll(cssSelector("tbody tr")).toIndexedSeq.size
+      }
+
+      getNumJobs("") should be (1)
+      d.count()
+      eventually(timeout(10 seconds), interval(100 milliseconds)) {
+        // this fails with patch https://github.com/apache/spark/pull/6935 as well, I'm not sure why
+        getNumJobs("") should be (2)
+      }
+
+      getNumJobs("/jobs") should be (2)
+      // Here is where we still have an error. /jobs is handled by another servlet,
+      // not controlled by the app cache. We need some way for each of those servlets
+      // to know that they can get expired, so they can be detached and then the reloaded
+      // one can get re-attached
+      d.count()
+      getNumJobs("/jobs") should be (3)
+
+    } finally {
+      sc.stop()
+      server.stop()
+    }
+
+  }
+
   def getContentAndCode(path: String, port: Int = port): (Int, Option[String], Option[String]) = {
     HistoryServerSuite.getContentAndCode(new URL(s"http://localhost:$port/api/v1/$path"))
   }
--
2.2.2
```
Patch https://github.com/apache/spark/pull/6545 is closer to working (for
some reason the cache does not seem to be getting refreshed in this PR;
maybe my test setup is wrong). But I do think a cache with a refresh interval
is probably a better way to go than reloading the data on every single
request.
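To be concrete about what I mean by a cache with a refresh interval, here is a rough sketch using Guava's `CacheBuilder` (everything here is made up for illustration: `loadAppUi`, the size, and the interval are not real names or values from either PR):
```scala
import java.util.concurrent.TimeUnit

import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}

import org.apache.spark.ui.SparkUI

object AppUiCacheSketch {
  // Hypothetical stand-in for however the provider would rebuild a UI
  // from the event logs.
  def loadAppUi(appId: String): SparkUI = ???

  // Entries expire after the refresh interval, so a stale incomplete app
  // gets rebuilt on the next access, instead of re-reading the event logs
  // on every single request.
  val appCache: LoadingCache[String, SparkUI] = CacheBuilder.newBuilder()
    .maximumSize(50)
    .expireAfterWrite(10, TimeUnit.SECONDS)
    .build(new CacheLoader[String, SparkUI] {
      override def load(appId: String): SparkUI = loadAppUi(appId)
    })
}
```
The nice property is that the reload cost is bounded by the interval no matter how many requests come in, and completed apps just stop changing once their last refresh happens.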
I have a rough understanding of what is going wrong when we make requests
to `http://localhost:18080/history/<appId>/jobs` -- after the first request,
that path is handled by a completely separate servlet, as @XuTingjun has
pointed out, and that servlet is never detached. We need a way to wrap each
of the UI's servlets so they know when they need to refresh; a sketch of
what I have in mind follows.
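Something like this hypothetical filter, sitting in front of every handler attached for one app's UI (`isStale` and `refreshAndReattach` do not exist anywhere, the names are invented to show the shape):
```scala
import javax.servlet.{Filter, FilterChain, FilterConfig, ServletRequest, ServletResponse}
import javax.servlet.http.{HttpServletRequest, HttpServletResponse}

// Hypothetical sketch: wrap every servlet attached for one app's UI, so the
// /jobs servlet gets the same staleness check as the root page the app
// cache already controls.
class RefreshCheckFilter(appId: String) extends Filter {
  override def init(config: FilterConfig): Unit = {}
  override def destroy(): Unit = {}

  override def doFilter(req: ServletRequest, resp: ServletResponse, chain: FilterChain): Unit = {
    if (isStale(appId)) {
      // detach the stale UI, reload from the event logs, re-attach ...
      refreshAndReattach(appId)
      // ... then redirect so this request is served by the fresh handlers
      val httpReq = req.asInstanceOf[HttpServletRequest]
      val httpResp = resp.asInstanceOf[HttpServletResponse]
      httpResp.sendRedirect(httpReq.getRequestURI)
    } else {
      chain.doFilter(req, resp)
    }
  }

  // Placeholders -- the real versions would consult the history provider.
  private def isStale(appId: String): Boolean = ???
  private def refreshAndReattach(appId: String): Unit = ???
}
```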
Also, I just realized the same problem applies to the REST API at
`http://localhost:18080/api/v1/applications/`, but that should be much simpler
to handle, since it's just one servlet. We should add a test case and fix that
too, though.
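A test for that could just extend the test above; roughly this, appended at the end of the try block (again only a sketch, assuming `d`, `appId`, and the helpers are still in scope, and that the fourth `d.count()` makes the fourth job):
```scala
// After a fourth job runs, the REST api should eventually report it too,
// since the servlet behind /api/v1 also needs to pick up refreshed data.
d.count()
eventually(timeout(10 seconds), interval(100 milliseconds)) {
  val json = getContentAndCode(s"applications/$appId/jobs", port)._2.get
  parse(json).asInstanceOf[JArray].arr.size should be (4)
}
```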