There are two ways you can do that:

1) Create multiple sources, one per file, and union them. This is easy, but probably a bit less efficient.
2) Override the FileInputFormat's createInputSplits method so that it takes the union of the paths and builds a list of all files and file splits that will be read.

Stephan

On Fri, Jun 26, 2015 at 12:12 PM, Michele Bertoni <michele1.bert...@mail.polimi.it> wrote:

> Hi everybody,
> is there a way to specify a list of URI (“hdfs://file1”,”hdfs://file2”,…)
> and open them as different files?
> I know i may open the entire directory, but i want to be able to select a
> subset of files in the directory
>
> thanks
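
For what it's worth, option 1 from the reply can be sketched roughly as a fold over the list of paths: open one source per path, then union them pairwise. The snippet below is only an illustration under assumptions. `UnionSources` and `unionAll` are hypothetical names that do not come from Flink, and plain Java lists stand in for the data sets so that the code runs without Flink on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical helper, not part of Flink: folds many sources into one.
final class UnionSources {

    // Opens one source per path with `open`, then combines them left to right with `union`.
    static <T> T unionAll(List<String> paths,
                          Function<String, T> open,
                          BinaryOperator<T> union) {
        if (paths.isEmpty()) {
            throw new IllegalArgumentException("at least one path is required");
        }
        T result = open.apply(paths.get(0));
        for (String path : paths.subList(1, paths.size())) {
            result = union.apply(result, open.apply(path));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> paths = List.of("hdfs://file1", "hdfs://file2");

        // Stand-in for a real source: each "data set" is a one-element list
        // naming the path it would have been read from.
        List<String> all = UnionSources.<List<String>>unionAll(
                paths,
                p -> List.of("records of " + p),
                (a, b) -> {
                    List<String> merged = new ArrayList<>(a);
                    merged.addAll(b);
                    return merged;
                });

        System.out.println(all);
    }
}
```

With Flink's DataSet API the same helper would presumably be called as `unionAll(paths, env::readTextFile, DataSet::union)` with `T` being `DataSet<String>`. That call is an untested assumption here. Option 2 avoids building one source per file, which is likely why the reply expects it to be somewhat more efficient.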