Re: [DISCUSS] "latestFirst" option and metadata growing issue in File stream source

2020-07-19 Thread Jungtaek Lim
(Just to add the rationale, you can refer to the original mail thread on the dev@ list to see the efforts to address problems in the file stream source / sink - https://lists.apache.org/thread.html/r1cd548be1cbae91c67e5254adc0404a99a23930f8a6fde810b987285%40%3Cdev.spark.apache.org%3E ) On Mon, Jul 20, 2020

[DISCUSS] "latestFirst" option and metadata growing issue in File stream source

2020-07-19 Thread Jungtaek Lim
Hi devs, As I have been going through the various issues around metadata log growth, it is not only an issue for the sink, but also for the source. Unlike the sink metadata log, whose entries should be available to readers, the source metadata log is only for the streaming query starting from the