It's not possible to configure Spark to run checks from XML definitions out of the box. You would need to write Spark jobs to perform the validations you need.
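To illustrate what such a job's validation logic could look like, here is a minimal, hypothetical sketch in plain Python: rules are loaded from a JSON config (instead of XML) and applied per record, splitting input into valid and rejected sets. All names (`RULES_JSON`, `passes`, `validate`) and the rule schema are invented for this example; in a real Spark job this logic would sit inside a `map`/`filter` over an RDD or DataFrame.

```python
import json
from datetime import datetime

# Hypothetical rule configuration, e.g. loaded from a JSON file
# rather than hard-coding checks per column.
RULES_JSON = """
[
  {"column": "id",         "check": "not_null"},
  {"column": "created_at", "check": "date_format", "format": "%Y-%m-%d"}
]
"""

def passes(record, rule):
    """Return True if the record satisfies a single rule."""
    value = record.get(rule["column"])
    if rule["check"] == "not_null":
        return value is not None and value != ""
    if rule["check"] == "date_format":
        try:
            datetime.strptime(value, rule["format"])
            return True
        except (TypeError, ValueError):
            return False
    return True  # unknown check types pass by default

def validate(records, rules):
    """Split records into (valid, rejected) lists."""
    valid, rejected = [], []
    for rec in records:
        target = valid if all(passes(rec, r) for r in rules) else rejected
        target.append(rec)
    return valid, rejected

rules = json.loads(RULES_JSON)
records = [
    {"id": "1", "created_at": "2015-03-27"},
    {"id": "",  "created_at": "2015-03-27"},   # fails not_null
    {"id": "2", "created_at": "27/03/2015"},   # fails date_format
]
valid, rejected = validate(records, rules)
# valid has 1 record; rejected has 2
```

In a Spark job, the rejected records would then be written out to a separate file (e.g. via `saveAsTextFile`) for later inspection.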
On Fri, Mar 27, 2015 at 5:13 PM, Sathish Kumaran Vairavelu <vsathishkuma...@gmail.com> wrote:

> Hello,
>
> I want to check if there is any way to verify the integrity of data files. The use case is to perform data integrity checks on large files with 100+ columns and reject records (writing them to another file) that do not meet criteria such as NOT NULL, date format, etc. Since there are many columns/integrity rules, we should be able to drive the data integrity checks through configuration (e.g. XML, JSON). Please share your thoughts.
>
> Thanks
>
> Sathish

--
Arush Kharbanda || Technical Teamlead
ar...@sigmoidanalytics.com || www.sigmoidanalytics.com