I forgot to mention that it also depends on which Spark Kafka connector you use. If it is receiver-based, I recommend a dedicated ZooKeeper cluster, because the connector uses ZooKeeper to store its offsets. If it is receiver-less (the direct approach), ZooKeeper can be shared.
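
To make the distinction concrete, here is a rough sketch of the two connector styles. It assumes the spark-streaming-kafka-0-8 artifact is on the classpath; the host names, the consumer group and the topic name are placeholders I made up, so adjust them to your setup.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ConnectorStyles {
  def main(args: Array[String]): Unit = {
    // local[2]: the receiver occupies one core, so at least two are needed.
    val conf = new SparkConf().setAppName("connector-styles").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Receiver-based: the consumer connects through ZooKeeper and keeps its
    // offsets there, so every commit is a write to the ensemble.
    val receiverStream = KafkaUtils.createStream(
      ssc,
      "zk1:2181,zk2:2181,zk3:2181", // hypothetical ZooKeeper quorum
      "example-group",              // hypothetical consumer group
      Map("events" -> 1))           // hypothetical topic -> number of consumer threads

    // Receiver-less (direct): Spark talks to the brokers only and tracks
    // offsets itself, so it adds no offset traffic to ZooKeeper.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092") // hypothetical brokers
    val directStream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events"))

    // Both streams carry (key, value) pairs; count the values per batch.
    receiverStream.map(_._2).count().print()
    directStream.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

In the first call Spark is handed a ZooKeeper quorum, in the second only a broker list. That is why the receiver-based variant puts extra load on ZooKeeper while the direct one does not.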

2017-03-03 9:29 GMT+01:00 Jörn Franke <jornfra...@gmail.com>:

> I think this depends heavily on the risk that you are willing to be
> exposed to. If you run it on dedicated nodes, other processes have less
> influence on it.
>
> I have seen both: on Hadoop nodes and on dedicated nodes. On Hadoop I
> would not recommend putting it on data nodes or other heavily utilized
> nodes.
>
> ZooKeeper does not need many resources (if you do not abuse it), and you
> may think about putting it on a small dedicated infrastructure of several
> nodes.
>
> On 3 Mar 2017, at 09:15, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
> Hi,
>
> In DEV, Kafka and ZooKeeper services can be co-located on the same
> physical hosts.
>
> In Prod, moving forward, do we need to set up ZooKeeper on its own
> cluster, not sharing with the Hadoop cluster? Can these services be
> shared within the Hadoop cluster?
>
> How best to set up the ZooKeeper that Kafka needs for use with Spark
> Streaming?
>
> Thanks
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com