My previous answers to this question can be found in the archives, along with some other responses:
http://apache-spark-user-list.1001560.n3.nabble.com/testing-frameworks-td32251.html
https://www.mail-archive.com/user%40spark.apache.org/msg48032.html

I have made a couple of presentations on the subject. Slides and video are linked on this page:
http://www.mapflat.com/presentations/

You can find more material in this list of resources:
http://www.mapflat.com/lands/resources/reading-list

Happy testing!

Regards,

Lars Albertsson
Data engineering entrepreneur
www.mimeria.com, www.mapflat.com
https://twitter.com/lalleal
+46 70 7687109


On Thu, Nov 15, 2018 at 6:45 PM <omer.ozsaka...@sony.com> wrote:
> Hi all,
>
> How are you testing your Spark applications?
>
> We are writing features using Cucumber, which tests behaviours. Is this
> called functional testing or integration testing?
>
> We are also planning to write unit tests.
>
> For instance, we have a class like the one below. It has one method, and
> this method implements several things: DataFrame operations, saving a
> DataFrame into a database table, and insert, update, and delete statements.
>
> Our classes generally contain 2 or 3 methods, and each method covers a lot
> of tasks in the same function definition (like the function below).
>
> So I am not sure how I can write unit tests for these classes and methods.
> Do you have any suggestions?
>
> class CustomerOperations {
>
>   def doJob(inputDataFrame: DataFrame) = {
>     // definitions (value/variable)
>     // Spark context, session etc. definition
>
>     // filtering, cleansing on inputDataFrame; save results in a new dataframe
>     // insert the new dataframe into a database table
>     // several insert/update/delete statements on the database tables
>   }
> }
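
To make a method like doJob unit-testable, one common approach is to separate the pure DataFrame transformation from the database I/O, so the logic can be tested with a local SparkSession and the writes are covered by integration tests instead. A minimal sketch of that split, assuming Spark 2.x and ScalaTest (the class, column, and table names here are illustrative, not from the original code):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

class CustomerOperations {

  // Pure transformation: DataFrame in, DataFrame out, no side effects.
  // This part can be unit tested with small in-memory data.
  def cleanse(input: DataFrame): DataFrame =
    input
      .filter(input("customerId").isNotNull)   // drop rows without an id
      .dropDuplicates("customerId")            // keep one row per customer

  // Thin side-effecting part: only writes an already-transformed DataFrame.
  // Cover this with an integration test against a real or embedded database.
  def save(df: DataFrame, jdbcUrl: String, table: String): Unit =
    df.write.mode("append").jdbc(jdbcUrl, table, new java.util.Properties())

  def doJob(input: DataFrame, jdbcUrl: String, table: String): Unit =
    save(cleanse(input), jdbcUrl, table)
}

// Unit test sketch for the pure part only (ScalaTest AnyFunSuite assumed):
class CustomerOperationsSuite extends org.scalatest.funsuite.AnyFunSuite {
  test("cleanse drops null ids and duplicates") {
    val spark = SparkSession.builder()
      .master("local[1]").appName("test").getOrCreate()
    import spark.implicits._
    val input = Seq(Some(1), Some(1), None).toDF("customerId")
    assert(new CustomerOperations().cleanse(input).count() === 1)
    spark.stop()
  }
}
```

With this structure, the Cucumber features can keep exercising the end-to-end behaviour (integration tests), while the fast ScalaTest suite pins down the transformation logic without any database.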