Re: what's the relationship between Zookeeper and Kafka ?

2016-09-14 Thread Jaikiran Pai
In addition to what Michael noted, this question has been asked a few
times before. Here is one such previous discussion:
https://www.quora.com/What-is-the-actual-role-of-ZooKeeper-in-Kafka


-Jaikiran

On Wednesday 14 September 2016 03:50 AM, Michael Noll wrote:

> Eric,
>
> the latest versions of Kafka use ZooKeeper only on the side of the Kafka
> brokers, i.e. the servers in a Kafka cluster.
>
> Background:
> In older versions of Kafka, the Kafka consumer API required client
> applications (that would read data from Kafka) to also talk to ZK.  Why
> would they need to do that?  Because ZK was used, in the old Kafka consumer
> API, to track which data records they had already consumed, to rewind
> reading from Kafka in case of failures like client machine crashes, and so
> on.  In other words, consumption-related metadata was managed in ZK.
> However, no "actual" data was ever routed through ZK.
>
> The latest versions of Kafka have an improved consumer API that no longer
> needs to talk to ZK -- any information that was previously maintained in ZK
> (by these client apps) is now stored directly in Kafka.
>
> Going back to your Spark programs:  They are using one of these older
> versions of the Kafka consumer API that still require talking to ZooKeeper,
> hence the need to set things like "zoo1:2181".
>
> > Does the kafka data actually get routed out of zookeeper before delivering
> > the payload onto Spark ?
>
> This was never the case, with either the old API or the new API.  Otherwise
> this would have been a significant bottleneck. :-)  Data has always been
> served through the Kafka brokers only.
>
> Hope this helps,
> Michael
>
> On Sat, Sep 10, 2016 at 4:22 PM, Valerio Bruno  wrote:
>
> > AFAIK Kafka uses ZooKeeper to coordinate the Kafka cluster (the set of
> > brokers).
> >
> > Consumers usually connect to ZooKeeper to retrieve the list of brokers,
> > then connect to a broker.
> >
> > *Valerio*
> >
> > On 10 September 2016 at 22:11, Eric Ho  wrote:
> >
> > > I notice that some Spark programs would contact something like
> > > 'zoo1:2181' when trying to suck data out of Kafka.
> > >
> > > Does the kafka data actually get routed out of zookeeper before
> > > delivering the payload onto Spark ?
> > >
> > > --
> > >
> > > -eric ho









Re: what's the relationship between Zookeeper and Kafka ?

2016-09-13 Thread Michael Noll
Eric,

the latest versions of Kafka use ZooKeeper only on the side of the Kafka
brokers, i.e. the servers in a Kafka cluster.

Background:
In older versions of Kafka, the Kafka consumer API required client
applications (that would read data from Kafka) to also talk to ZK.  Why
would they need to do that?  Because ZK was used, in the old Kafka consumer
API, to track which data records they had already consumed, to rewind
reading from Kafka in case of failures like client machine crashes, and so
on.  In other words, consumption-related metadata was managed in ZK.
However, no "actual" data was ever routed through ZK.
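
To make the split between data and consumption metadata concrete, here is a
toy model in Python.  It is not Kafka code: Broker, OffsetStore and consume
are hypothetical names invented for this sketch, and the offset store only
stands in for whatever holds the consumer's position (ZK in the old API).

```python
class Broker:
    """Holds the actual records of one partition (toy stand-in for a Kafka broker)."""

    def __init__(self, records):
        self.records = list(records)

    def fetch(self, offset, max_records):
        # Records are always served by the broker, never by the offset store.
        return self.records[offset:offset + max_records]


class OffsetStore:
    """Holds only consumption metadata: the next offset to read, per group."""

    def __init__(self):
        self.committed = {}

    def commit(self, group, offset):
        self.committed[group] = offset

    def fetch_committed(self, group):
        # A group that has never committed starts at offset 0 in this sketch.
        return self.committed.get(group, 0)


def consume(broker, store, group, max_records):
    """Read one batch from the committed offset, then commit the new position."""
    start = store.fetch_committed(group)
    batch = broker.fetch(start, max_records)
    store.commit(group, start + len(batch))
    return batch


broker = Broker(["r0", "r1", "r2", "r3", "r4"])
store = OffsetStore()

first = consume(broker, store, "spark-app", 2)
# Pretend the client crashed here: it kept no local state, yet it resumes
# where it left off because the position lives in the offset store.
second = consume(broker, store, "spark-app", 2)
```

The offset store never sees a record, only integers, which is the sense in
which no "actual" data passes through it.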

The latest versions of Kafka have an improved consumer API that no longer
needs to talk to ZK -- any information that was previously maintained in ZK
(by these client apps) is now stored directly in Kafka.

Going back to your Spark programs:  They are using one of these older
versions of the Kafka consumer API that still require talking to ZooKeeper,
hence the need to set things like "zoo1:2181".
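
As a rough illustration of the difference on the client side, the sketch
below builds the two kinds of consumer settings as plain Python dicts.  The
keys "zookeeper.connect", "bootstrap.servers" and "group.id" are Kafka
consumer settings; the helper functions, the "broker1:9092" address and the
"spark-app" group name are assumptions made up for this example, and nothing
here connects to a real cluster.

```python
def old_consumer_config(zk_connect, group_id):
    # Old (ZooKeeper-based) consumer: the client is pointed at ZooKeeper.
    return {"zookeeper.connect": zk_connect, "group.id": group_id}


def new_consumer_config(bootstrap_servers, group_id):
    # New consumer: the client is pointed at the Kafka brokers only.
    return {"bootstrap.servers": bootstrap_servers, "group.id": group_id}


def talks_to_zookeeper(config):
    """True if this client configuration references ZooKeeper at all."""
    return "zookeeper.connect" in config


old = old_consumer_config("zoo1:2181", "spark-app")
new = new_consumer_config("broker1:9092", "spark-app")
```

A program that asks for something like "zoo1:2181" is most likely configured
along the lines of the first dict.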

> Does the kafka data actually get routed out of zookeeper before delivering
> the payload onto Spark ?

This was never the case, with either the old API or the new API.  Otherwise
this would have been a significant bottleneck. :-)  Data has always been
served through the Kafka brokers only.

Hope this helps,
Michael





On Sat, Sep 10, 2016 at 4:22 PM, Valerio Bruno  wrote:

> AFAIK Kafka uses ZooKeeper to coordinate the Kafka cluster (the set of
> brokers).
>
> Consumers usually connect to ZooKeeper to retrieve the list of brokers,
> then connect to a broker.
>
> *Valerio*
>
> On 10 September 2016 at 22:11, Eric Ho  wrote:
>
> > I notice that some Spark programs would contact something like
> > 'zoo1:2181' when trying to suck data out of Kafka.
> >
> > Does the kafka data actually get routed out of zookeeper before
> > delivering the payload onto Spark ?
> >
> >
> >
> > --
> >
> > -eric ho
> >


Re: what's the relationship between Zookeeper and Kafka ?

2016-09-10 Thread Valerio Bruno
AFAIK Kafka uses ZooKeeper to coordinate the Kafka cluster (the set of
brokers).

Consumers usually connect to ZooKeeper to retrieve the list of brokers,
then connect to a broker.
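
For the older ZooKeeper-based consumers, that lookup can be sketched roughly
as below.  Brokers register themselves under /brokers/ids/<id> in ZooKeeper
with a small JSON document.  The payloads here are illustrative samples, not
captured from a real cluster, the host names and ports are made up, and
broker_addresses is a hypothetical helper.

```python
import json

# Illustrative znode contents, keyed by path; hosts and ports are made up.
SAMPLE_ZNODES = {
    "/brokers/ids/0": '{"host": "broker1", "port": 9092, "version": 1}',
    "/brokers/ids/1": '{"host": "broker2", "port": 9092, "version": 1}',
}


def broker_addresses(znodes):
    """Turn broker registration znodes into a sorted list of host:port strings."""
    addresses = []
    for path in sorted(znodes):
        info = json.loads(znodes[path])
        addresses.append(f'{info["host"]}:{info["port"]}')
    return addresses


addresses = broker_addresses(SAMPLE_ZNODES)
```

Only the addresses come from ZooKeeper in this sketch; the records themselves
would still be fetched from the brokers at those addresses.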

*Valerio*

On 10 September 2016 at 22:11, Eric Ho  wrote:

> I notice that some Spark programs would contact something like 'zoo1:2181'
> when trying to suck data out of Kafka.
>
> Does the kafka data actually get routed out of zookeeper before delivering
> the payload onto Spark ?
>
>
>
> --
>
> -eric ho
>



-- 
*Valerio Bruno*





*+39 3383163406
+45 2991720...@valeriobruno.it
fax: +39 1782275656
skype: valerio_bruno
http://www.valeriobruno.it*


what's the relationship between Zookeeper and Kafka ?

2016-09-10 Thread Eric Ho
I notice that some Spark programs would contact something like 'zoo1:2181'
when trying to suck data out of Kafka.

Does the kafka data actually get routed out of zookeeper before delivering
the payload onto Spark ?



-- 

-eric ho