Re: Approach to Auto Scaling Flink Job

2019-05-16 Thread Rong Rong
Hi Anil, A typical Yarn Resource Manager setup consists of 2 RM nodes [1] in an active/standby configuration. FYI: We've also shared some practical experience with the limitations of this setup, and potential redundant fail-safe mechanisms, in our latest talk[2] at this year's FlinkForward. Thanks, Rong
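For readers building an external watchdog against such an active/standby pair, here is a minimal Python sketch of how it might work out which RM is currently active. It is not taken from the talk or from AthenaX. It assumes each RM serves the YARN REST endpoint `/ws/v1/cluster/info` on its web address and reports `haState` there, and that a standby answers that endpoint itself rather than redirecting. The host names in the usage note and the helper names `fetch_cluster_info` and `find_active_rm` are hypothetical.

```python
import json
from typing import Callable, Iterable, Optional
from urllib.request import urlopen


def fetch_cluster_info(rm_address: str, timeout: float = 5.0) -> dict:
    """GET /ws/v1/cluster/info from one ResourceManager and parse the JSON body."""
    url = f"http://{rm_address}/ws/v1/cluster/info"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def find_active_rm(
    rm_addresses: Iterable[str],
    fetch: Callable[[str], dict] = fetch_cluster_info,
) -> Optional[str]:
    """Return the first address whose RM reports haState == "ACTIVE", else None.

    `fetch` is injectable so the lookup can be exercised without a live cluster.
    """
    for address in rm_addresses:
        try:
            info = fetch(address)
        except (OSError, ValueError):
            # Unreachable RM or unparseable response: try the next candidate.
            continue
        if info.get("clusterInfo", {}).get("haState") == "ACTIVE":
            return address
    return None
```

A watchdog could call `find_active_rm(["rm1.example.com:8088", "rm2.example.com:8088"])` before each polling cycle and treat a `None` result as a signal to alert, since no RM is then reachable in the active state.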

Re: Approach to Auto Scaling Flink Job

2019-05-16 Thread Anil
Thanks for the clarification, Rong! As per my understanding, the Docker containers monitor the Flink jobs that are running in the Yarn cluster. The Flink JMs have HA enabled, so there's a standby JM in case the JM fails, and in case of a TM failure, that TM will be re-deployed. All good. My concern is

Re: Approach to Auto Scaling Flink Job

2019-05-12 Thread Rong Rong
Hi Anil, The reason we are using Docker is that internally we support Dockerized containers for microservices. Ideally speaking, this can be any external service running on something other than the actual YARN cluster where your Flink application resides. Basically watchdog runs outside of the

Re: Approach to Auto Scaling Flink Job

2019-05-11 Thread Anil
Thanks, Rong. The FlinkForward talk was insightful. One more question: it's mentioned in the talk that the jobs are running on Yarn and are monitored by containers running on Docker. Can you explain why Docker is needed here? When we deploy a job to Yarn, one Yarn container is already dedicated for Job

Re: Approach to Auto Scaling Flink Job

2019-05-08 Thread Rong Rong
Hi Anil, We have a presentation[1] from FlinkForward 2018 that briefly discusses the approach (via watchdog) at a high level. We are also restructuring the approach of our open-source AthenaX: Right now our internal implementation has diverged from the open-source version for too long; it has been a

Re: Approach to Auto Scaling Flink Job

2019-05-08 Thread Anil
Thanks for the reply, Rong. Can you please let me know the design for the auto-scaling part, if possible? Or guide me in the right direction so that I could create this feature myself.

Re: Approach to Auto Scaling Flink Job

2019-05-08 Thread Rong Rong
Hi Anil, Thanks for reporting the issue. I went through the code and I believe the auto-scaling functionality is still in our internal branch and has not been merged to the open-source branch yet. I will change the documentation accordingly. Thanks, Rong On Mon, May 6, 2019 at 9:54 PM Anil

Approach to Auto Scaling Flink Job

2019-05-06 Thread Anil
I'm using Uber's open-source project AthenaX. As mentioned in its docs[1], it supports `Auto scaling for AthenaX jobs`. I went through the source code on GitHub but didn't find the auto-scaling part. Can someone familiar with this project please point me in the right direction here? I'm using Flink's