Re: Difference between KafkaSpout

Stig Rohde Døssing Sun, 20 May 2018 10:40:19 -0700

You need to pass the list of broker URLs to the KafkaConsumer when you
create the spout. For example, if you were to create your KafkaSpoutConfig
via this method
https://github.com/apache/storm/blob/fbeafdcbb1e4be1263b91be7ba75a15aa6e885a8/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpoutConfig.java#L195,
you could set the bootstrapServers parameter to
"my-kafka-url:9092,my-kafka-url:9093".


Internally we're passing that parameter to the "bootstrap.servers"
configuration parameter described in the Kafka documentation
https://kafka.apache.org/documentation/#newconsumerconfigs. As you might
note, the list you give is just a seed list, so you don't necessarily have
to give the full list of Kafka hosts. As long as one of the URLs points to a
live broker, the consumer will get the URLs of the rest of the cluster from
the first broker it contacts.
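
As a rough illustration of where the string ends up (a sketch using only
java.util.Properties, not the actual Storm internals): "bootstrap.servers"
and "group.id" are real Kafka consumer settings, but the class name, the
helper method and the "my-group" value are made up for this example.

```java
import java.util.Properties;

public class BootstrapServersSketch {

    // Hypothetical helper: builds the kind of Properties a KafkaConsumer is configured with.
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        // Seed list only; the consumer discovers the remaining brokers
        // from whichever seed broker it manages to contact.
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Placeholder consumer group id, assumed for the example.
        props.setProperty("group.id", groupId);
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("my-kafka-url:9092,my-kafka-url:9093", "my-group");
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```

With storm-kafka-client you would normally not build these properties
yourself, since the spout config takes the bootstrapServers string directly.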

So for the example bootstrap.servers config I gave
("my-kafka-url:9092,my-kafka-url:9093"), as long as either 9092 or 9093 has
a running Kafka broker, the consumer will discover the URLs for the rest of
the Kafka cluster.
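
To make the seed list idea concrete, here is a toy simulation. It is only a
sketch of the concept, not how the KafkaConsumer actually picks a broker (the
real client's connection logic is more involved). The class, the method names
and the "liveness" check are all hypothetical; the check simply pretends that
only the broker on 9093 is up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

public class SeedListSketch {

    // Splits a bootstrap.servers style string into individual host:port entries.
    static List<String> parseSeeds(String bootstrapServers) {
        return Arrays.asList(bootstrapServers.trim().split("\\s*,\\s*"));
    }

    // Returns the first seed for which isLive returns true, if there is one.
    static Optional<String> firstLiveSeed(List<String> seeds, Predicate<String> isLive) {
        return seeds.stream().filter(isLive).findFirst();
    }

    public static void main(String[] args) {
        List<String> seeds = parseSeeds("my-kafka-url:9092,my-kafka-url:9093");
        // Pretend the broker on 9092 is down and only the one on 9093 is running.
        Optional<String> contact = firstLiveSeed(seeds, s -> s.endsWith(":9093"));
        System.out.println(contact.orElse("no live seed broker"));
    }
}
```

In this toy version one live seed is enough to get started, which is the
point of listing more than one URL. If no seed is live, there is nothing to
bootstrap from.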

If you'd like to know more about how the KafkaConsumer handles broker
failures, I'll refer you to the kafka-users mailing list (
https://kafka.apache.org/contact), since they can likely give a better
explanation than I can.

2018-05-20 17:20 GMT+02:00 Pavel Sapozhnikov <[email protected]>:

> Thank you Stig for that explanation.
>
> How does the new Consumer handle failure of Kafka brokers? If I connect to
> one broker on 9092 and that broker dies, how will it know to connect to
> the other on 9093? Is it because it knows that consumer group belongs to
> two brokers?
>
> Thanks
> Pavel
>
> On Sun, May 20, 2018 at 10:55 AM Stig Rohde Døssing <[email protected]>
> wrote:
>
>> The storm-kafka-client module is intended as the replacement for the
>> storm-kafka module. The storm-kafka module uses a Kafka client that is now
>> deprecated
>> (https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/consumer/SimpleConsumer.scala)
>> under the hood. Storm-kafka-client uses
>> https://kafka.apache.org/0110/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html
>> instead.
>>
>> If you're writing new code, you should use storm-kafka-client. We're
>> planning to remove storm-kafka in Storm 2.0.0, since the underlying client
>> has been deprecated and will probably be removed by Kafka at some point.
>>
>> 2018-05-20 16:37 GMT+02:00 Pavel Sapozhnikov <[email protected]>:
>>
>>> I am sure this has probably been answered before somewhere but I can't
>>> find a correct answer.
>>>
>>> What is the difference between
>>>
>>> *org.apache.storm.kafka.spout.KafkaSpout*
>>>
>>> and
>>>
>>> *org.apache.storm.kafka.KafkaSpout*
>>>
>>> In my experience the first one uses *KafkaSpoutConfig* to connect to a
>>> broker directly, while the second one connects to Zookeeper and resolves
>>> to a leader broker.
>>>
>>> Correct me if I am wrong. Is one the old way of doing things and the
>>> other the new way?
>>>
>>> If I have a very simple setup of two brokers connected to one ZK, and a
>>> topic, say X, replicated between the two, then in the second way I only
>>> have to connect to ZK, but in the first way I have to connect to a broker
>>> directly.
>>>
>>> If I have two brokers as mentioned above, do I then need two Storm
>>> spouts in the case of storm.kafka.KafkaSpout? If so, would I need to set
>>> something up to *not* process duplicate messages from the two spouts?
>>>
>>> Again, if I got something wrong or am misunderstanding something, please
>>> correct me.
>>>
>>
>>
