Hi! I have run into this a few times but have only ever solved it with some ugly hack, so I thought I'd ask this time.
If I have a bunch of files with timestamps in their names but no timestamps in the data, how should I best read them into a PCollection of timestamped values? The files are JSON or CSV files, and if I use TextIO.read I no longer have the filenames available.

Is the best way to do this to write your own source? In that case, how can I most easily get the filename or timestamp into the data while reusing essentially everything else from TextIO? I tried doing this with a file-based source but it didn't pan out too well.

Or is it better to write a DoFn that takes a PCollection of filenames, reads each file itself, and fans out the records? I have had some bad experiences with fan-out, so I'm not sure this is a good option either.

If anyone has solved this, it would be really interesting to know what the best approach is.

Thanks!
Vilhelm von Ehrenheim
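
P.S. To make the DoFn idea concrete, here is a minimal sketch of the per-file logic I have in mind, in plain Java with no Beam dependency. The filename format (a 14-digit UTC timestamp such as events-20170314153000.csv), the class name and the method names are all just assumptions for illustration, not anything from TextIO.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FilenameTimestamps {
    // ASSUMPTION: the name contains a 14-digit UTC timestamp (yyyyMMddHHmmss),
    // e.g. "events-20170314153000.csv". Adjust the pattern to the real scheme.
    private static final Pattern STAMP = Pattern.compile("(\\d{14})");
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuuMMddHHmmss");

    // Extracts the timestamp embedded in a file name.
    static Instant timestampFromName(String fileName) {
        Matcher m = STAMP.matcher(fileName);
        if (!m.find()) {
            throw new IllegalArgumentException("No timestamp in file name: " + fileName);
        }
        return LocalDateTime.parse(m.group(1), FORMAT).toInstant(ZoneOffset.UTC);
    }

    // Pairs every line with the timestamp taken from the file name.
    static List<Map.Entry<Instant, String>> attach(String fileName, List<String> lines) {
        Instant ts = timestampFromName(fileName);
        List<Map.Entry<Instant, String>> out = new ArrayList<>();
        for (String line : lines) {
            out.add(new SimpleEntry<>(ts, line));
        }
        return out;
    }

    // Reads a whole file and returns its lines as (timestamp, line) pairs.
    static List<Map.Entry<Instant, String>> readTimestamped(Path file) throws IOException {
        return attach(file.getFileName().toString(), Files.readAllLines(file));
    }
}
```

Inside a DoFn, I would expect the loop in attach to emit each line with outputWithTimestamp rather than collecting a list. What I don't know is whether reading whole files this way behaves acceptably for large inputs, which is really my question.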
