Hi Chan,

Flink's file sources accept a directory as the input path. If you pass a directory, the source reads every file in that directory. The way you do it leads to a very large plan, because the per-file part of the plan is replicated 1500 times; this could lead to the OutOfMemoryException.
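A minimal sketch of what a single directory source could look like with the Java DataSet API. The paths, the class name, and the map logic are placeholders and not taken from your job. It also assumes the per-file GroupReduce does not need to know which file a record came from; if it does, the records themselves would have to carry that information.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class DirectoryInputSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical path: it points at a directory, so one source
        // reads all files in it instead of one source per file.
        DataSet<String> lines = env.readTextFile("hdfs:///path/to/input-dir");

        // Placeholder for the Map -> GroupBy -> Sum part of the plan:
        // emit (line, 1) and sum the counts per distinct line.
        DataSet<Tuple2<String, Integer>> counts = lines
            .map(new MapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> map(String line) {
                    return new Tuple2<String, Integer>(line, 1);
                }
            })
            .groupBy(0)
            .sum(1);

        // Hypothetical output path.
        counts.writeAsText("hdfs:///path/to/output");

        env.execute("Directory input sketch");
    }
}
```

With this shape the plan contains one source and one chain of operators regardless of how many files are in the directory.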
Is there a specific reason why you create 1500 separate sources?

Regards,
Aljoscha

On Tue, 30 Jun 2015 at 17:17 chan fentes <chanfen...@gmail.com> wrote:

> Hello,
>
> how many data sources can I use in one Flink plan? Is there any limit? I
> get an
> java.lang.OutOfMemoryException: unable to create native thread
> when having approx. 1500 files. What I basically do is the following:
> DataSource -> Map -> Map -> GroupBy -> GroupReduce per file
> and then
> Union -> GroupBy -> Sum in a tree-like reduction.
>
> I have checked the workflow. It runs on a cluster without any problem, if
> I only use few files. Does Flink use a thread per operator? It seems as if
> I am limited in the amount of threads I can use. How can I avoid the
> exception mentioned above?
>
> Best regards
> Chan