Hi Bill,
no worry, questions are the purpose of this mailing list.
The number of network buffers is a parameter that needs to be scaled with your
setup. The reason for that is Flink's pipelined data transfer, which
requires a certain number of network buffers to be available at the same
time during
To clarify… it's 64 HT cores per node, 16 nodes each with 128GB. Well,
actually I have 48 nodes… but trying to limit it so we have a comparison
with Spark/MPI/MapReduce all at the same node count.
Thanks for the information.
--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.
On 6/19/
PS: I've read your last email as 64 HT cores per machine. If it was in total
over the 16 nodes, you have to adjust my response accordingly. ;)
On 19 Jun 2015, at 16:42, Fabian Hueske wrote:
> Hi Bill,
>
> no worry, questions are the purpose of this mailing list.
>
> The number of network buffers
Hey Bill!
On 19 Jun 2015, at 16:24, Bill Sparks wrote:
> Sorry for the post again. I guess I'm not understanding this…
Thanks for posting again, not sorry! ;-)
Regarding the broken link: where did you get this link? I think it should be
referring here:
http://ci.apache.org/projects/flink/f
Hi Bill,
You're right. Simply increasing the task manager slots doesn't do anything.
It is correct to set the parallelism to taskManagers*slots. Simply increase
the number of network buffers in the flink-conf.yaml, e.g. to 4096. In the
future, we will configure this setting dynamically.
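A minimal sketch of what that could look like in flink-conf.yaml, assuming 64 slots per task manager (one per HT core) on 16 nodes. Both values are illustrative rather than tuned, and the key names should be checked against the configuration docs for your Flink version:

```yaml
# flink-conf.yaml (sketch; values are assumptions for a 16-node, 64-slot setup)
taskmanager.numberOfTaskSlots: 64          # slots offered by each task manager
taskmanager.network.numberOfBuffers: 4096  # example value from this thread
```

With 16 task managers this gives 16 * 64 = 1024 slots, which is the parallelism to request for the job. If the job still fails for lack of network buffers, the value may need to go higher.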
Let us kn
Sorry for the post again. I guess I'm not understanding this…
The question is how to scale up/increase the execution of a problem. What I'm
trying to do is get the best out of the available processors for a given node
count and compare this against Spark, using KMeans.
For Spark, one method