Hi Spark users,

I need some help collecting stage-level information for Spark workflows. I have added a listener to the SparkContext, and the goal is to monitor each stage. There are two issues I am struggling with:

1. Trying to find the input data paths for each stage. Looking at the code, it seems that Stage objects do not maintain this information. Is it possible to obtain it through the RDD dependencies associated with a stage?

2. Getting hold of stage-level information such as numTasks. Some of the member variables in the classes Stage and StageInfo are inaccessible from outside, even though they are not marked as private.

Thanks and regards,
~Mayuresh
