Good analogy!

Sent from my iPhone

On Aug 14, 2014, at 7:36 PM, "Adaryl \"Bob\" Wakefield, MBA" <
[email protected]> wrote:

  Ah, so Storm is the hospital and Kafka is the waiting room where everybody
queues up to be seen in turn, yes?

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

 *From:* Justin Workman <[email protected]>
*Sent:* Thursday, August 14, 2014 7:47 PM
*To:* [email protected]
*Subject:* Re: Kafka + Storm

 If you are familiar with Weblogic or ActiveMQ, it is similar. Let's see if
I can explain, though I am definitely not a subject matter expert on this.

Within Kafka you can create "queues" (Kafka calls them topics), e.g. a
webclicks queue. Your web servers can then send click events to this queue
in Kafka. The web servers, or whatever agent writes the events to this
queue, are referred to as the "producer". Each event, or message, in Kafka
is assigned an id.

On the other side there are "consumers". In Storm's case this would be the
Storm Kafka spout, which can subscribe to this webclicks queue to consume
the messages that are in the queue. The consumer can consume a single
message from the queue, or a batch of messages, as Storm does. The consumer
keeps track of the latest offset (the Kafka message id) that it has consumed.
This way, the next time the consumer checks to see if there are more
messages to consume, it will ask for messages with a message id greater than
its last offset.
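
To make the offset bookkeeping above concrete, here is a rough sketch in plain Python. It is a toy in-memory model and not the real Kafka or Storm API: the names ToyTopic, ToyConsumer, send, fetch, and poll are made up for illustration, and the batch size of 2 is arbitrary.

```python
class ToyTopic:
    """In-memory stand-in for one Kafka queue (hypothetical, not the Kafka API)."""

    def __init__(self):
        self._messages = []

    def send(self, message):
        """Producer side: append a message and return the id (offset) it was assigned."""
        self._messages.append(message)
        return len(self._messages) - 1

    def fetch(self, after_offset, max_messages):
        """Return up to max_messages (offset, message) pairs with offset > after_offset."""
        start = after_offset + 1
        batch = self._messages[start:start + max_messages]
        return list(enumerate(batch, start=start))


class ToyConsumer:
    """Consumer that remembers the last offset it has consumed."""

    def __init__(self, topic):
        self.topic = topic
        self.last_offset = -1  # nothing consumed yet

    def poll(self, max_messages=2):
        """Ask only for messages newer than last_offset, then advance it."""
        batch = self.topic.fetch(self.last_offset, max_messages)
        if batch:
            self.last_offset = batch[-1][0]
        return [message for _, message in batch]


# The web servers (producers) write click events to the webclicks queue.
webclicks = ToyTopic()
for page in ["/home", "/cart", "/checkout"]:
    webclicks.send({"page": page})

# The spout (consumer) reads in batches and never sees a message twice.
spout = ToyConsumer(webclicks)
first_batch = spout.poll()   # the two oldest events
second_batch = spout.poll()  # the remaining event
third_batch = spout.poll()   # nothing new yet, so an empty list
```

The point of the sketch is only that the consumer, not the queue, remembers how far it has read, so reading does not remove anything from the queue.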

This helps with the reliability of the event stream and helps guarantee
that your events/messages make it from start to finish through your stream,
assuming the events get to Kafka ;)

Hope this helps and makes some sort of sense. Again, sent from my iPhone ;)

Justin

Sent from my iPhone

On Aug 14, 2014, at 6:28 PM, "Adaryl \"Bob\" Wakefield, MBA" <
[email protected]> wrote:

  I get your reasoning at a high level. I should have specified that I
wasn’t sure what Kafka does. I don’t have a hard software engineering
background. I know that Kafka is “a message queuing” system, but I don’t
really know what that means.

(I can’t believe you wrote all that from your iPhone....)
B.


 *From:* Justin Workman <[email protected]>
*Sent:* Thursday, August 14, 2014 7:22 PM
*To:* [email protected]
*Subject:* Re: Kafka + Storm

 Personally, we looked at several options, including writing our own Storm
source. There are few Storm sources with community support out there.
For us, it boiled down to the following:

1) community support and what appeared to be a standard method. Storm now
includes the Kafka source as a bundled component. This made the
implementation much faster, because the code was done.
2) the durability (replication and clustering) of Kafka. We have a three
hour retention period on our queues, so if we need to do maintenance on
Storm or deploy an updated topology, we don't need to stop or replay any
sources.
3) the ability to have other tools attach to the Kafka queues to consume
the same events for other purposes.
4) to complement point #1, it's easy to write to Kafka. So it was little
effort to start sending our desired data to Kafka.
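
Points 2 and 3 can be pictured with another small toy model in plain Python. This is only a sketch under stated assumptions, not how Kafka is implemented or configured: ToyRetainedQueue, send, expire, and read_after are hypothetical names, timestamps are plain seconds passed in by hand, and the three hour retention is the figure from point 2.

```python
class ToyRetainedQueue:
    """Toy queue that keeps messages for a fixed retention period (hypothetical)."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._entries = []      # list of (offset, timestamp, message)
        self._next_offset = 0

    def send(self, message, now):
        """Store a message with the time it arrived and the next offset."""
        self._entries.append((self._next_offset, now, message))
        self._next_offset += 1

    def expire(self, now):
        """Drop messages older than the retention period."""
        cutoff = now - self.retention_seconds
        self._entries = [e for e in self._entries if e[1] >= cutoff]

    def read_after(self, last_offset):
        """Return (offset, message) pairs newer than the reader's own offset."""
        return [(off, msg) for off, _, msg in self._entries if off > last_offset]


THREE_HOURS = 3 * 60 * 60
queue = ToyRetainedQueue(retention_seconds=THREE_HOURS)

# Events keep arriving while Storm is down for maintenance.
queue.send("click-1", now=0)
queue.send("click-2", now=3600)

# Two hours in, Storm comes back and catches up from its own offset.
queue.expire(now=7200)
storm_reads = queue.read_after(last_offset=-1)

# A second tool reads the same events independently, with its own offset.
other_tool_reads = queue.read_after(last_offset=-1)

# A reader that shows up too late only finds what is still retained.
queue.expire(now=12000)
late_reads = queue.read_after(last_offset=-1)
```

Because each reader tracks its own position, one reader catching up does not take messages away from another, and anything older than the retention period is simply gone.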

These are our main reasons (I'm sure there were more). Each use case is
going to be different, and Kafka might not be the best choice for everyone.
For us it made sense.

Justin

Sent from my iPhone

On Aug 14, 2014, at 6:08 PM, "Adaryl \"Bob\" Wakefield, MBA" <
[email protected]> wrote:

  Can someone tell me why people put Kafka in front of Storm? Can’t Storm
ingest messages without having Kafka in the middle?

B.
