Oh, that's sweet. So, a related question then. Did those tests pick up the performance issue reported in SPARK-3333 <https://issues.apache.org/jira/browse/SPARK-3333>? Does it make sense to add a new test to cover that case?
On Tue, Sep 2, 2014 at 12:29 AM, Matei Zaharia <matei.zaha...@gmail.com> wrote:

> Hi Nicholas,
>
> At Databricks we already run https://github.com/databricks/spark-perf for
> each release, which is a more comprehensive performance test suite.
>
> Matei
>
> On September 1, 2014 at 8:22:05 PM, Nicholas Chammas (nicholas.cham...@gmail.com) wrote:
>
> What do people think of running the Big Data Benchmark
> <https://amplab.cs.berkeley.edu/benchmark/> (repo
> <https://github.com/amplab/benchmark>) as part of preparing every new
> release of Spark?
>
> We'd run it just for Spark and effectively use it as another type of test
> to track any performance progress or regressions from release to release.
>
> Would doing such a thing be valuable? Do we already have a way of
> benchmarking Spark performance that we use regularly?
>
> Nick