I'm looking for recommendations on benchmarks for Spark. I'm familiar with spark-bench[0], but I haven't found much else that suits my needs. The main property I'm looking for is that the benchmark's workload should benefit significantly from non-trivial use of Spark's caching mechanism, since I'm mainly interested in evaluating cache performance under different scenarios.
(By "non-trivial", I mean more than simply caching a single input RDD which is reused a few times.) Any suggestions appreciated!

[0] https://github.com/CODAIT/spark-bench

--
Michael Mior
mm...@apache.org
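To make the distinction concrete, here's a rough sketch (the file path, numbers, and iterative workload are illustrative only, not taken from any particular benchmark): the trivial pattern caches one input RDD and reuses it, whereas a more cache-sensitive workload, e.g. an iterative PageRank-style job, caches a *new* intermediate RDD each iteration and unpersists the old one, so eviction and re-caching behavior actually matter.

```scala
import org.apache.spark.sql.SparkSession

object CachePatterns {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-patterns")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Trivial pattern: cache one input RDD, reuse it a few times.
    // ("data.txt" is a placeholder path.)
    val input = sc.textFile("data.txt").cache()
    val lineCount = input.count()
    val wordCount = input.flatMap(_.split("\\s+")).count()

    // Non-trivial pattern: an iterative job whose cached data changes
    // every iteration, so the cache's contents churn over time.
    var ranks = sc.parallelize(1 to 1000).map(i => (i, 1.0))
    for (_ <- 1 to 10) {
      val newRanks = ranks.mapValues(_ * 0.85 + 0.15).cache()
      newRanks.count()   // materialize the newly cached RDD
      ranks.unpersist()  // previous iteration's cache is dead weight now
      ranks = newRanks
    }

    spark.stop()
  }
}
```

A benchmark dominated by the second pattern would stress cache eviction and replacement far more than one that only exhibits the first.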