Re: What do you pay attention to when validating Spark jobs?

2017-11-21 Thread lucas.g...@gmail.com
I don't think these will blow anyone's mind, but:

1) Row counts. Most of our jobs 'recompute the world' nightly, so we can expect to see fairly predictable variance in row counts.
2) Rolling snapshots. We can also expect that for some critical datasets we can compute a rolling average for important
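
As a rough illustration of the row-count idea above (not the poster's actual code), the check below compares tonight's row count against the mean of the last few nightly counts and flags runs that drift too far. The function name, the 7-night window, and the 10% tolerance are all assumptions made up for this sketch; it is plain Python so it runs without a cluster.

```python
from statistics import mean


def row_count_within_tolerance(history, today, window=7, tolerance=0.10):
    """Check today's row count against a rolling baseline.

    history   -- earlier nightly row counts, oldest first (hypothetical input)
    today     -- row count produced by tonight's run
    window    -- how many recent nights form the baseline (assumed: 7)
    tolerance -- allowed relative deviation from the baseline (assumed: 10%)
    """
    recent = history[-window:]
    if not recent:
        raise ValueError("need at least one historical row count")

    baseline = mean(recent)
    if baseline == 0:
        # Avoid dividing by zero: an empty baseline only matches an empty run.
        return today == 0

    # Relative deviation of tonight's count from the rolling mean.
    return abs(today - baseline) / baseline <= tolerance


if __name__ == "__main__":
    nightly_counts = [1000, 1010, 990, 1005, 995, 1000, 1000]
    print(row_count_within_tolerance(nightly_counts, 1050))
    print(row_count_within_tolerance(nightly_counts, 1200))
```

In a real job, `today` would typically come from something like `df.count()` on the output DataFrame, and `history` from wherever previous run metrics are stored; both of those wiring details are left out here.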

What do you pay attention to when validating Spark jobs?

2017-11-21 Thread Holden Karau
Hi Folks, I'm working on updating a talk, and I was wondering if any folks in the community wanted to share their best practices for validating their Spark jobs. Are there any counters folks have found useful for monitoring/validating Spark jobs? Cheers, Holden :) -- Twitter: