Hi Chen,
Thanks for the reply. Yes, we have a very low interval, but our jobs usually
have 10-20 independent Kafka topics. When we perform scaling, we stop
reading from all of them. Do you think we can implement partial failover to
calculate Kafka partition assignments? Or rather than sending an
interrupted
@Greg: Can you describe at what points the JobManager struggled heavily? I
would guess that it happens at some point during deployment, and that
deployment takes longer than you expected?
On Wed, Oct 21, 2015 at 10:14 AM, Maximilian Michels wrote:
Hi Greg,
It would be very interesting to profile the job master to
see what it spends most of its time on. Did you run your experiments with
0.9.X or the 0.10-SNAPSHOT? It would be interesting to know whether there
is a regression.
Best,
Max
On Wed, Oct 21, 2015 at 10:08 AM, Till Rohrmann wrote:
Hi Greg,
there is no official guide for running Flink on large clusters. As far as I
know, the cluster we used for the matrix factorization was the largest
cluster we've run a serious job on. Thus, it would be highly interesting to
understand what made the JobManager slow down. At some point, t