Re: [FlinkRunner] Exception "Insufficient number of network buffers" thrown from time to time.

Stephan Ewen Wed, 06 Jul 2016 11:53:05 -0700

Hi Amir!

When all happens on one node, you probably set the parallelism so low that
all can be put onto one node.
For example each node offers 8 slots (16 slots total), but your parallelism
is only 8.


You can think of it in similar terms as in YARN: A node can offer 8
containers, and when your job wants only 8 containers, it can happen that
all are on one node.


In order to spread computation out, you can do one of those two things:

(1) Increase the parallelism (in this example to 16) - that gives best
resource utilization

(2) If you want to keep a parallelism of 8 and be sure that everything is
on two nodes, you can decrease the slots per node to 4 so that there are 8
across two nodes.


Greetings,
Stephan


On Wed, Jul 6, 2016 at 7:18 PM, amir bahmanyari <[email protected]> wrote:

> Hi Colleagues,
> I get the same time-to-time issue with Beam FlinkRunner in a Flink cluster
> (2 nodes).
> Also, 98% of the time, all happens in only one node (no LB in the cluster).
> FYI.
> Thanks
> Amir-
>
>
> ------------------------------
> *From:* Stephan Ewen <[email protected]>
> *To:* [email protected]
> *Sent:* Wednesday, July 6, 2016 7:22 AM
> *Subject:* Re: [FlinkRunner] Exception "Insufficient number of network
> buffers" thrown from time to time.
>
> The number of network buffers is a parameter one sometimes need to
> configure in Flink.
>
> For Flink's own API, you can explicitly create a
> LocalStreamExecutionEnvironment from a configuration and use that one for
> the execution.
>
> For the FlinkRunner in Beam, it would make sense to be able to pass an
> ExecutionEnvironment object to use for the program execution.
> Then the fix would be to create a LocalStreamExecutionEnvironment from a
> config and pass that to the Flink Runner.
>
>
> On Wed, Jul 6, 2016 at 3:36 PM, Aljoscha Krettek <[email protected]>
> wrote:
>
> Could you maybe send me a minimal version of that such that I can
> reproduce it?
>
> On Wed, 6 Jul 2016 at 12:29 Pawel Szczur <[email protected]> wrote:
>
> Verified,  I've started getting it consistently this morning after few
> more PTransform.
>
> 2016-07-06 12:27 GMT+02:00 Aljoscha Krettek <[email protected]>:
>
> Strange, and you're saying you only sometimes get this exception? Not
> reproducibly?
>
> On Wed, 6 Jul 2016 at 12:02 Pawel Szczur <[email protected]> wrote:
>
> I have 8 cores.
> Just IDE.
>
> 2016-07-06 12:00 GMT+02:00 Aljoscha Krettek <[email protected]>:
>
> Hi,
> are you running this in an IDE or on an actual cluster?
>
> -
> Aljoscha
>
> On Wed, 6 Jul 2016 at 11:57 Jean-Baptiste Onofré <[email protected]> wrote:
>
> Hi Pawel,
>
> I'm pretty sure that our Flink experts will answer.
>
> I'm assuming you are using a single JVM for Flink, right ?
> I think it could be related to Flink StreamExecutionEnvironment and the
> number of core on your machine.
> More your machine has cores, more you should increase the numberOfBuffers.
>
> Regards
> JB
>
> On 07/06/2016 11:26 AM, Pawel Szczur wrote:
> > When running my simple pipeline from time to time I'm getting below
> > exception:
> >
> > Caused by: java.io.IOException: Insufficient number of network buffers:
> > required 1, but only 0 available. The total number of network buffers is
> > currently set to 2048. You can increase this number by setting the
> > configuration key 'taskmanager.network.numberOfBuffers'.
> > at
> > org.apache.flink.runtime.io
> .network.buffer.NetworkBufferPool.createBufferPool(NetworkBufferPool.java:196)
> > at
> > org.apache.flink.runtime.io
> .network.NetworkEnvironment.registerTask(NetworkEnvironment.java:298)
> > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:469)
> > at java.lang.Thread.run(Thread.java:745)
> >
> >
> > I'm using trunk of Beam with FlinkRunner. I guess it's well known Flink
> > problem? Idea how to prevent it?
> >
> > Pawel
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
>
>
>
>
>
>

Re: [FlinkRunner] Exception "Insufficient number of network buffers" thrown from time to time.

Reply via email to