I forgot to mention that it also depends on the Spark Kafka connector you
use. If it is receiver-based, I recommend a dedicated ZooKeeper cluster,
because ZooKeeper is used to store the offsets. If it is receiver-less,
ZooKeeper can be shared.
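
If you do share an ensemble, one option is to give Kafka its own chroot path
in the ZooKeeper connection string so its znodes stay apart from those of
other services. A minimal sketch, not a definitive setup: the hostnames, the
/kafka path and the zk_connect helper are all hypothetical, and only the
host:port,host:port/chroot format is what Kafka's zookeeper.connect expects.

```python
def zk_connect(hosts, port=2181, chroot=""):
    """Build a ZooKeeper connection string, optionally scoped to a chroot path."""
    if chroot and not chroot.startswith("/"):
        raise ValueError("chroot must start with '/'")
    # Comma-separated host:port pairs, followed by a single chroot suffix.
    quorum = ",".join(f"{host}:{port}" for host in hosts)
    return quorum + chroot.rstrip("/")


# Hypothetical hostnames for a shared three-node ensemble.
shared_ensemble = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]

# Line to place in the Kafka broker's server.properties.
print("zookeeper.connect=" + zk_connect(shared_ensemble, chroot="/kafka"))
```

The chroot goes once at the end of the string, not after each host. A
dedicated ensemble would use the same format, just with its own hosts.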

2017-03-03 9:29 GMT+01:00 Jörn Franke <jornfra...@gmail.com>:

> I think this depends heavily on how much risk you are willing to accept.
> If you run it on dedicated nodes, other processes have less influence on it.
>
> I have seen both setups: on Hadoop nodes and on dedicated nodes. On Hadoop,
> I would not recommend putting it on data nodes or other heavily utilized
> nodes.
>
> ZooKeeper does not need many resources (if you do not abuse it), so you
> could consider putting it on a small dedicated infrastructure of several
> nodes.
>
> On 3 Mar 2017, at 09:15, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>
> Hi,
>
> In DEV, Kafka and ZooKeeper services can be co-located on the same
> physical hosts.
>
> In Prod, moving forward, do we need to set up ZooKeeper on its own cluster,
> not shared with the Hadoop cluster? Or can these services be shared within
> the Hadoop cluster?
>
> What is the best way to set up the ZooKeeper that Kafka needs for use with
> Spark Streaming?
>
> Thanks
>
> Dr Mich Talebzadeh
> http://talebzadehmich.wordpress.com
