Hi there! Does Apache Spark offer a callback or some kind of plugin mechanism that would allow this?
For a bit more context, I am looking to create a Record-Replay environment using Apache Spark as the core processing engine. The goals are:

1. Trace the origin of every file generated by Spark: which application (jar file) generated it, and with which input files and parameters.
2. Re-run a previous experiment easily and, ideally, change its parameters (e.g. input files).

Any advice on how to tackle these goals?

-D