Hi All,

I am trying to write an article comparing minimal HA (highly available)
deployments of different stream processing systems.

Basically, the question is: if an organization has a limited workload, such
as 10k events per second, which might grow in the future, what is the
minimal setup it can use to run a highly available stream processor?

Could someone help answer the following questions?

   1. How many nodes does a minimal Apache Flink HA setup need? As I
   understand from [2], it is the ZooKeeper nodes + job managers (2 without
   YARN, 1 with YARN) + worker nodes. Is this correct?
   2. As per [1], ZooKeeper needs a minimum of 3 nodes to provide HA. Is
   there a way to run Apache Flink without HA?
   3. If someone runs Apache Flink without HA but uses state snapshots, how
   fast can it recover after a failure? (A ballpark figure is fine.)

Thanks
Srinath


   1. https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html#Deploying_ZooKeeper
   2. https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html#standalone-cluster-high-availability


-- 
============================
Srinath Perera, Ph.D.
   http://people.apache.org/~hemapani/
   http://srinathsview.blogspot.com/
