+1 I find this useful as well.

On 04 Dec 2014, at 22:02, Robert Metzger <[email protected]> wrote:
> +1 for adding such a feature. It should be very easy to implement (basically
> extend the createInputSplits() method).
>
> On Tue, Dec 2, 2014 at 5:22 PM, Vasiliki Kalavri <[email protected]> wrote:
>
> Hi,
>
> thanks for replying!
>
> It would certainly be useful for my use case, but not absolutely necessary.
> If you think other people might find it useful too, I can open an issue.
> If not, I believe it would be nice to print a warning when a nested
> directory is given as the input path, since currently the files in the base
> directory are processed normally, but the nested ones are simply ignored.
>
> Cheers,
> V.
>
> On 2 December 2014 at 16:52, Stephan Ewen <[email protected]> wrote:
>
> Hi!
>
> Not right now. The input formats do not recursively enumerate files. In
> that, we followed the way Hadoop did it.
>
> If that is something that is interesting, it should not be too hard to add
> to the FileInputFormat an option to do a complete recursive traversal of
> the directory structure.
>
> Greetings,
> Stephan
>
>
> On Tue, Dec 2, 2014 at 4:32 PM, Vasiliki Kalavri <[email protected]> wrote:
>
> Hello all,
>
> I want to run a Flink log processing job and my input is stored locally in
> a nested directory structure, like the following:
>
> logs_dir/
> |-----/machine1/
> |-----------/january.log
> |-----------/february.log
> ...
> |-----/machine2/
> ...
>
> etc.
>
> When providing "logs_dir" as the argument to readTextFile(), nothing is
> read and no exception or error is returned. Copying the nested individual
> files machine1/january.log, machine1/february.log, ..., to the same
> directory works fine, but I was wondering whether there is a better way to
> do this?
>
> Thank you!
> V.
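Until FileInputFormat gains a recursive-traversal option as suggested above, one workaround is to enumerate the nested files up front and pass each path to the job yourself (e.g. one readTextFile() per file, unioned). The sketch below shows only the traversal step, using plain Java NIO rather than any Flink API; the class and method names are illustrative, not part of Flink.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveEnumeration {

    /**
     * Walks baseDir depth-first and collects every regular file,
     * including files in nested subdirectories.
     */
    public static List<Path> listFilesRecursively(Path baseDir) throws IOException {
        try (Stream<Path> walk = Files.walk(baseDir)) {
            return walk.filter(Files::isRegularFile)
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small tree mirroring the logs_dir/machine1 example above.
        Path base = Files.createTempDirectory("logs_dir");
        Path machine1 = Files.createDirectories(base.resolve("machine1"));
        Files.createFile(machine1.resolve("january.log"));
        Files.createFile(machine1.resolve("february.log"));

        List<Path> files = listFilesRecursively(base);
        // Both nested log files are found, even though they are not
        // direct children of the base directory.
        System.out.println(files.size());
    }
}
```

Each returned path could then be handed to the execution environment individually, which avoids copying the files into a flat directory.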
