Ok thanks! I think that makes sense. I just got into some strange problems when mixing this with KafkaIO. I'll checkout the Watch transform.
// Vilhelm On Wed, Feb 21, 2018 at 6:12 PM, Eugene Kirpichov <[email protected]> wrote: > Hi! Currently it's not configurable - the watermark is min(timestamp of > pending files) where timestamp is simply time when the file was seen. > However, it's implemented as a very thin wrapper on top of the Watch > transform, see implementation of FileIO.matchAll(). You can use the Watch > transform directly and write a PollFn that uses FileSystems.match() to list > the files and computes file timestamps and watermark in the way appropriate > to your use case. > > On Wed, Feb 21, 2018 at 7:29 AM Vilhelm von Ehrenheim < > [email protected]> wrote: > >> Hi! >> I have some problems with watermarks using the watchForNewFiles feature >> in TextIO to read . The watermark is not progressing even though no new >> files are added to the location. Does anyone know what the logic is behind >> this watermark, and if it is possible to supply your own heuristic? I can't >> find it anywhere in the source. >> >> Regards, >> Vilhelm von Ehrenheim >> >> >>
