I'm looking for recommendations on benchmarks for Spark. I'm familiar
with spark-bench[0], but I haven't found much else that suits my
needs. The main property I'm looking for is a workload that benefits
significantly from non-trivial use of Spark's caching mechanism, since
I'm mainly interested in evaluating cache performance under different
scenarios.

(By "non-trivial", I mean more than simply caching a single input RDD
which is reused a few times.)
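To make the distinction concrete, here's a rough sketch of the two
patterns I have in mind (input path and key layout are hypothetical; a
real benchmark would obviously be more involved):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object CachingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("caching-sketch").getOrCreate()
    val sc = spark.sparkContext

    // Trivial pattern: cache one input RDD and reuse it a few times.
    val input = sc.textFile("hdfs:///some/input").cache() // hypothetical path
    val words = input.flatMap(_.split("\\s+")).count()
    val lines = input.count()

    // Non-trivial pattern: cache intermediate results partway through a
    // multi-stage pipeline, where several downstream jobs share different
    // cached datasets and eviction/recomputation trade-offs actually matter.
    val parsed = input.map(_.split(",")).persist(StorageLevel.MEMORY_AND_DISK)
    val byKey  = parsed.map(a => (a(0), 1L)).reduceByKey(_ + _).cache()

    val top  = byKey.takeOrdered(10)(Ordering.by[(String, Long), Long](-_._2))
    val total = byKey.values.sum()               // second job over byKey
    val keys  = parsed.map(_(0)).distinct.count() // third job back to parsed

    spark.stop()
  }
}
```

A benchmark exercising the second pattern is what I'm after, since
that's where cache sizing and eviction policy start to affect runtime.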

Any suggestions appreciated!

[0] https://github.com/CODAIT/spark-bench
--
Michael Mior
mm...@apache.org
