Hi Fokko,

Spark fires events for many other things. It does so for ML pipelines, and it does make information available for data frames.
We use S3 in this case; I just simplified the example. It is important to know which process took which action. Only Spark knows this, and it does supply this information on other occasions. So I don't think your comment makes sense?

Cheers
Bolke

On Mon 15 Oct 2018 at 19:05, Driesprong, Fokko <fo...@driesprong.frl> wrote:

> Hi Bolke,
>
> I would argue that Spark is not the right level of abstraction for doing
> this. I would create a wrapper around the particular filesystem:
> http://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html
> Therefore you can write a wrapper around the LocalFileSystem if data will
> be written to local disk, DistributedFileSystem when written to HDFS, and
> also many object stores implement this interface. My 2¢
>
> Cheers, Fokko
>
> On Mon 15 Oct 2018 at 18:58, Bolke de Bruin <bdbr...@gmail.com> wrote:
>
>> Hi,
>>
>> Apologies upfront if this should have gone to user@ but it seems a
>> developer question so here goes.
>>
>> We are trying to improve a listener to track lineage across our platform.
>> This requires tracking where data comes from and where it goes to. E.g.
>>
>> sc.setLogLevel("INFO");
>> val data = sc.textFile("hdfs://migration/staffingsec/Mydata.gz")
>> data.saveAsTextFile("hdfs://datalab/user/xxx");
>>
>> In this case we would like to know that Spark picked up “Mydata.gz” and
>> wrote it to “xxx”. Of course more complex examples are possible.
>>
>> In the particular case of the above, Spark (2.3.2) does not seem to trigger
>> any events, or at least none that we know of, that give us the relevant
>> information.
>>
>> Is that a correct assessment? What can we do to get that information
>> without knowing the code upfront? Should we provide a patch?
>>
>> Thanks
>> Bolke
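
For reference, the kind of listener discussed in this thread could look roughly like the sketch below. It is only an illustration, not the listener used on the platform described above: the class name LineageProbeListener is hypothetical, and the sketch simply prints what a SparkListener sees when a job starts. For an RDD created with sc.textFile the RDD name is typically the input path, while the target of saveAsTextFile does not appear to be exposed in these events, which matches the gap described in the original question.

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}

// Hypothetical listener name; not part of Spark itself.
class LineageProbeListener extends SparkListener {

  // Called by Spark whenever a job is submitted.
  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    for {
      stage <- jobStart.stageInfos // stages planned for this job
      rdd   <- stage.rddInfos      // RDDs that take part in the stage
    } {
      // Print the metadata Spark exposes for each RDD in the job.
      println(
        s"job=${jobStart.jobId} stage=${stage.stageId} " +
          s"rddId=${rdd.id} rddName=${rdd.name} callSite=${rdd.callSite}")
    }
  }
}

// Register on an existing SparkContext (assumed here to be `sc`,
// as in spark-shell) before running the job of interest:
// sc.addSparkListener(new LineageProbeListener)
```

Running the example from the original message with this listener registered should show which RDD metadata is available at job start, and makes it easy to check whether the read and write paths are among it.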