Hi,

I have a data set backed by a directory of files in which file names are
meaningful.

folder1
   +-----file01
   +-----file02
   +-----file03
   +-----file04

I want to control how files are assigned to workers in my Flink application.
For example, when the parallelism is 2, worker 1 gets file01 and file02 to
read, and worker 2 gets file03 and file04. Each worker also needs both of its
files at once, because reading requires jumping back and forth between them.
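
To make the intended assignment concrete, here is a minimal sketch in plain
Java. It does not use any Flink API; the class FileAssignment and the method
assign are hypothetical names made up for illustration, and it assumes that
sorting the file names gives the order in which files should be grouped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class FileAssignment {

    // Hypothetical helper (not part of Flink): splits the sorted file names
    // into `parallelism` contiguous groups, so that files which must be read
    // together end up on the same worker.
    public static List<List<String>> assign(List<String> files, int parallelism) {
        List<String> sorted = new ArrayList<>(files);
        Collections.sort(sorted);

        // Ceiling division: number of files each worker receives at most.
        int perWorker = (sorted.size() + parallelism - 1) / parallelism;

        List<List<String>> groups = new ArrayList<>();
        for (int worker = 0; worker < parallelism; worker++) {
            int from = Math.min(worker * perWorker, sorted.size());
            int to = Math.min(from + perWorker, sorted.size());
            groups.add(new ArrayList<>(sorted.subList(from, to)));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("file01", "file02", "file03", "file04");
        // prints [[file01, file02], [file03, file04]]
        System.out.println(assign(files, 2));
    }
}
```

Each inner list is what a single worker would need to receive as one unit,
rather than as independent per-file splits.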

What's the best way to do this? It seems that FileInputFormat cannot be
extended to support this kind of assignment.

Best
Lu
