Hi all,

I am setting up a Samza cluster for the first time, and am now at the point of deploying on EC2. Hopefully this is the correct place to ask a few newbie questions. I'm impressed and excited by what I've seen so far, and eager to get going with a real deployment.

1. Does anyone have good or bad experiences to report from running Samza atop Ubuntu 14.04 LTS (versus 12.04)?

2. Any best practices to recommend for setup on EC2? E.g. instance types to use, EBS volumes versus non-EBS, and so on. I've found several threads with conflicting opinions on all of this. Our current plan is:

(a) Use EBS volumes, separating Zookeeper from Kafka.
(b) Start with three m3.large instances and upgrade later as needed, since our initial data volume will be low.
(c) Run Kafka + Zookeeper + Yarn Node Manager on two worker nodes, and Kafka + Zookeeper + Yarn Resource Manager on the third node.

Regards,
osh

Oshoma Momoh
http://pcglab.com
