Spark benchmarks
I'm mainly interested in evaluating cache performance under different scenarios. (By "non-trivial", I mean more than simply caching a single input RDD which is reused a few times.) Any suggestions appreciated! [0] https://github.com/CODAIT/spark-bench -- Michael Mior mm...@
Task partition ID in Spark event logs
field inside TaskInfo? Cheers, -- Michael Mior mm...@apache.org