I'm writing unit tests with Spark and need some help.

I've already read this helpful article:
http://blog.quantifind.com/posts/spark-unit-test/

There are a couple of differences between my testing environment and the blog's.
1. I'm using FunSpec instead of FunSuite, so my tests look like:

class MyTestSpec extends FunSpec {
  describe("A suite of tests") {
    it("should do something") {
      // test code
    }
    it("should do something else") {
      // test code
    }
  }
  describe("Another suite of tests") {
    it("should do something") {
      // test code
    }
    it("should do something else") {
      // test code
    }
  }
}
2. Ideally, I'd like to reuse the SparkContext as much as possible.
Currently I'm using fixture.FunSpec's withFixture and the loan
pattern to loan the SparkContext to each test.

So,
trait SparkEnvironment extends fixture.FunSpec {
  // The fixture handed to each test is the SparkContext itself.
  type FixtureParam = SparkContext

  def withFixture(test: OneArgTest) {
    val sc = SparkUtils.createSparkContext("local", "some name")
    try {
      test(sc)
    } finally {
      sc.stop()
      System.clearProperty("spark.driver.port")
    }
  }
}


While that works, it ends up creating a SparkContext per test. Ideally I'd
like to share one across all suites (so, across more than one of my
TestSpec classes); less preferably, across multiple describe blocks within
a MyTestSpec class; and even less preferably, across the tests within a
single suite. But I don't know how. Right now each of my "it" tests
creates a new SparkContext, and that's really slowing things down.

I tried creating a singleton object and loaning that object to multiple
tests, but Spark threw an exception saying it can't find some file. I'm
sure it's something I'm doing (or not doing), as I can't think of a reason
why SparkContexts can't be shared across tests like that.

object SparkEnvironment {
  private var _sc: SparkContext = null

  def sc: SparkContext = {
    if (_sc == null) _sc = SparkUtils.createSparkContext(..)
    _sc
  }
}
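In case it clarifies what I'm after, here's a minimal Spark-free sketch of
the lazy-shared-singleton pattern I'm trying to apply. FakeContext is a
stand-in I made up for SparkContext; the real version would call
SparkUtils.createSparkContext and clear spark.driver.port on stop, and
SharedEnvironment is a hypothetical name, not anything from Spark or
ScalaTest:

```scala
// Stand-in for SparkContext, just so the pattern is self-contained.
class FakeContext(val name: String) {
  var stopped = false
  def stop(): Unit = { stopped = true }
}

object SharedEnvironment {
  private var _ctx: FakeContext = null

  // Lazily create one context and hand the same instance to every
  // caller (i.e. every suite), instead of one per test.
  def ctx: FakeContext = synchronized {
    if (_ctx == null) {
      _ctx = new FakeContext("shared")
      // Tear it down once, when the JVM exits, rather than per test.
      sys.addShutdownHook { stop() }
    }
    _ctx
  }

  def stop(): Unit = synchronized {
    if (_ctx != null) {
      _ctx.stop()
      _ctx = null
      // With a real SparkContext, this is where I'd also do:
      // System.clearProperty("spark.driver.port")
    }
  }
}
```

The synchronized blocks are there because ScalaTest can run suites in
parallel, so two suites could otherwise race to create the context.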

Thanks,
Ameet
