Re: Hardware planning

2014-03-20 Thread Ray Rodriguez
I would say those specs are probably a bit much for ZooKeeper, particularly the memory and SAS disks, assuming your usage of ZooKeeper involves many more reads than writes, which is the typical ZooKeeper use case. The CPU and network interface seem about right but I would go with lowe

Re: Hardware planning

2014-03-19 Thread Otis Gospodnetic
Ray,

We are, for SPM. On c1.medium instances, I believe, we have:
* Jetty receiving tens of thousands of metrics per second (in batches, so the rate of HTTP requests is lower than that number)
* Kafka brokers
* ZK instances

So far we have not had issues with this. Knock

Re: Hardware planning

2014-03-17 Thread Carlile, Ken
OK, I understand. So for the ZooKeeper cluster, can I go with something like:

3 x Dell R320: single hex-core 2.5GHz Xeon, 32GB RAM, 4 x 10K 300GB SAS drives, 10GbE

and if I do, can I drop the CPU specs on the broker machines to, say, dual 6-core? Or are we looking at something that is core bound

Re: Hardware planning

2014-03-15 Thread Ray Rodriguez
Imagine a situation where one of your nodes running both a Kafka broker and a ZooKeeper node goes down. You now have to contend with two distributed systems recovering at once: leader election and consensus in the case of the ZooKeeper ensemble, and partition rebalancing/repair in the case of the Kafka cluster, so
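
To put a number on the ZooKeeper side of that scenario, here is a minimal sketch of the majority-quorum arithmetic that an ensemble relies on. The function names are hypothetical and chosen only for illustration; the only fact assumed is that ZooKeeper needs a strict majority of its servers to be up in order to keep serving.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble (illustrative helper)."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare a few ensemble sizes, including the 3-node case from this thread.
    for n in (3, 4, 5):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")
```

With three servers the ensemble survives the loss of one node, but on a co-located machine that same failure also removes a Kafka broker, so both recoveries happen at the same moment.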

Re: Hardware planning

2014-03-15 Thread Carlile, Ken
I'd rather not purchase dedicated hardware for ZK if I don't absolutely have to, unless I can use it for multiple clusters (i.e., Kafka, HBase, other things that rely on ZK). Would adding more cores help with ZK on the same machine? Or is that just a waste of cores, considering that it's Java under

Re: Hardware planning

2014-03-14 Thread Jun Rao
The spec looks reasonable. If you have other machines, it may be better to put ZK on its own machines.

Thanks,

Jun

On Fri, Mar 14, 2014 at 10:52 AM, Carlile, Ken wrote:
> Hi all,
>
> I'm looking at setting up a (small) Kafka cluster for streaming microscope
> data to Spark-Streaming.
>
> The p