Hi,

Apologies upfront if this should have gone to user@, but it seems to be a
developer question, so here goes.

We are trying to improve a listener to track lineage across our platform. This 
requires tracking where data comes from and where it goes to. E.g.

sc.setLogLevel("INFO")
val data = sc.textFile("hdfs://migration/staffingsec/Mydata.gz")
data.saveAsTextFile("hdfs://datalab/user/xxx")

In this case we would like to know that Spark picked up “Mydata.gz” and wrote 
it to “xxx”. Of course more complex examples are possible.

In the particular case above, Spark (2.3.2) does not seem to trigger any
events, or at least none that we know of, that give us the relevant information.
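
For illustration, here is a minimal sketch of a listener along these lines, 
meant to be pasted into spark-shell. The class name LineageListener is 
hypothetical and this is not the actual listener in use. It only shows what the 
task-level metrics appear to expose: byte counts for input and output, but no 
source or destination paths.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Hypothetical listener for illustration: prints per-task I/O metrics.
class LineageListener extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val metrics = taskEnd.taskMetrics
    // taskMetrics can be null, e.g. when a task fails early.
    if (metrics != null) {
      println(
        s"stage=${taskEnd.stageId} " +
        s"bytesRead=${metrics.inputMetrics.bytesRead} " +
        s"bytesWritten=${metrics.outputMetrics.bytesWritten}")
    }
  }
}

// Register the listener on the existing SparkContext (sc in spark-shell).
sc.addSparkListener(new LineageListener)
```

Running the textFile/saveAsTextFile example with this listener registered 
reports how many bytes were read and written per task, but not which files 
they belong to.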

Is that a correct assessment? What can we do to get that information without 
knowing the code upfront? Should we provide a patch?

Thanks
Bolke

