Hi,

Sorry for the very slow reply - I am far behind in my mailing list
subscriptions.

You'll find a few slides covering the topic in this presentation:
https://www.slideshare.net/lallea/test-strategies-for-data-processing-pipelines-67244458

Video here: https://vimeo.com/192429554

Regards,

Lars Albertsson
Data engineering entrepreneur
www.scling.com, www.mapflat.com
https://twitter.com/lalleal
+46 70 7687109

On Tue, Feb 25, 2020 at 7:46 PM Ruijing Li <liruijin...@gmail.com> wrote:
>
> Just wanted to follow up on this. If anyone has any advice, I’d be interested 
> in learning more!
>
> On Thu, Feb 20, 2020 at 6:09 PM Ruijing Li <liruijin...@gmail.com> wrote:
>>
>> Hi all,
>>
>> I’m interested in hearing the community’s thoughts on best practices for
>> integration testing of Spark SQL jobs. We run many of our jobs on cloud
>> infrastructure and HDFS, which makes debugging a challenge for us,
>> especially for problems that don’t show up when simply initializing a
>> SparkSession locally or testing with spark-shell. Ideally, we’d like some
>> sort of Docker container emulating HDFS and Spark cluster mode that you can
>> run locally.
>>
>> Any test framework, tips, or examples people can share? Thanks!
>> --
>> Cheers,
>> Ruijing Li
>
> --
> Cheers,
> Ruijing Li

