Hi Chan,
Flink sources accept a directory as the input path. If you pass a
directory, the source reads every file in that directory. The way you do
it now leads to a very large plan, because the plan is replicated 1500
times, and this could lead to the OutOfMemoryError you are seeing.
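
As a minimal sketch of the directory approach, assuming the Java DataSet
API and a placeholder path (hdfs:///path/to/input-dir and the class name
are hypothetical, not taken from your job):

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class DirectoryInput {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical path: point the source at the directory itself,
        // not at each of the individual files inside it.
        DataSet<String> lines = env.readTextFile("hdfs:///path/to/input-dir");

        // count() triggers execution of the plan and returns the number of lines read.
        System.out.println(lines.count());
    }
}
```

This gives the plan a single source instead of one per file. Whether it
fits depends on whether the per-file grouping in your job really needs
each file as its own source.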

Is there a specific reason why you create 1500 separate sources?

Regards,
Aljoscha

On Tue, 30 Jun 2015 at 17:17 chan fentes <chanfen...@gmail.com> wrote:

> Hello,
>
> how many data sources can I use in one Flink plan? Is there any limit? I
> get an
> java.lang.OutOfMemoryException: unable to create native thread
> when having approx. 1500 files. What I basically do is the following:
> DataSource ->Map -> Map -> GroupBy -> GroupReduce per file
> and then
> Union -> GroupBy -> Sum in a tree-like reduction.
>
> I have checked the workflow. It runs on a cluster without any problem if
> I only use a few files. Does Flink use a thread per operator? It seems as
> if I am limited in the number of threads I can use. How can I avoid the
> exception mentioned above?
>
> Best regards
> Chan
>
