Hi, I have a TPC-DS query that fails in stage 80, which is a ResultStage (Spark SQL). Ideally, I would like to ‘checkpoint’ an earlier stage that completed successfully and replay only the failed stage for debugging purposes. Has anyone managed to do something similar and could offer some hints? Has anyone used a tool like DMTCP [1], and could it be applied in this situation?
[1] http://dmtcp.sourceforge.net/

Best,
Ovidiu