Hi Chris,

Thanks a lot for the information.

I am comparing Storm + Kafka0.8  vs  Samza.

As part of the initial use case I am using Kafka API to publish messages to the 
Kafka Broker say 40k, 50k...
Then using the Storm spout I am consuming these messages.
On the Samza side I am using a similar Kafka Consumer that consumes messages 
from Kafka.

Is this use case fine for comparing Storm + Kafka0.8  vs  Samza ? or do you 
think any other things that I need to consider?

Also going forward I am planning to find some use cases where I can compare 
Samza's  features like State Management, Partitioning and Parallelism, etc.
Any pointers towards these specific use cases?

Thanks,
-Nirmal

-----Original Message-----
From: Chris Riccomini [mailto:[email protected]]
Sent: Thursday, October 17, 2013 10:42 PM
To: [email protected]
Subject: Re: Writing a simple KafkaProducer in Samza

Hey Nirmal,

Glad to hear that the hello-samza project is working for you.

If I understand you correctly, I believe that you're saying you want to send 
messages to a Kafka topic, right? As you've pointed out, you can send messages 
to Kafka through Samza. You can also send messages to Kafka directly using the 
Kafka producer API. Which one you use depends on what you're code is doing.

With Samza, typically a task is sending messages to Kafka as a reaction to some 
other event (which triggers the process method). For example, in the Wiki 
example, we send a message to Kafka whenever an update happens on Wikipedia 
(via the IRC channel). In this example, we had to write a Wikipedia consumer 
for Samza, which implements the SystemConsumer API in Samza. This 
implementation reads messages from the Wikimedia IRC channel.
You can see this implementation here:


https://github.com/linkedin/hello-samza/blob/master/samza-wikipedia/src/mai
n/java/samza/examples/wikipedia/system/WikipediaConsumer.java

If you use Samza, you MUST have at least one task.input defined, which feeds 
messages to your StreamTask. Out of the box, Samza comes with a 
KafkaSystemConsumer implementation. The hello-samza project comes with the 
WikipediaSystemConsumer implementation. If you want to react to messages from 
another system, or feed, you'd have to implement this interface (and hopefully 
contribute it back :).

The alternative approach would be to just send messages directly to Kafka using 
the Kafka API. This approach is more appropriate in cases that don't fit well 
with Samza's processing model (e.g. you can't easily implement the 
SystemConsumer API, you need to guarantee deployment on a specific host all the 
time, etc). For example, if you wanted to read syslog messages on a specific 
host, and send them to Kafka, it probably makes more sense to just write a 
simple Java main() method that creates a Kafka producer, polls syslog 
periodically, and calls producer.send() whenever a new message appears in the 
syslog.

If you can be more specific about what you're doing, I can probably provide 
better advice.

Cheers,
Chris

On 10/17/13 6:39 AM, "Nirmal Kumar" <[email protected]> wrote:

>Hi All,
>
>I was referring the hello-samza project as was able to run it
>successfully.
>I was able to run all the jobs and also wrote a consumer task to listen
>to kafka.wikipedia-stats topic.
>
>I now want to write a Samza job that act as a KafkaProducer to
>continuously publishes simple string messages to a topic.
>Just like the WikipediaFeedStreamTask that reads Wikipedia events and
>publishes them to a topic.
>I am not sure of the any value of task.inputs in the config properties
>file?
>The way I think is like a java program publishing string messages to a
>kafka topic.
>How can I write such a Samza job?
>
>Any pointers would be of great help.
>
>Later on I want can read the same messages from a consumer like
>WikipediaParserStreamTask does.
>Referring the hello-samza project I was able to write a Consumer task
>that reads messages from the topic(kafka.wikipedia-stats) by simply
>task.class=samza.examples.wikipedia.task.TestConsumer
>task.inputs=kafka.wikipedia-stats
>
>
>Thanks,
>-Nirmal
>
>________________________________
>
>
>
>
>
>
>NOTE: This message may contain information that is confidential,
>proprietary, privileged or otherwise protected by law. The message is
>intended solely for the named addressee. If received in error, please
>destroy and notify the sender. Any use of this email is prohibited when
>received in error. Impetus does not represent, warrant and/or
>guarantee, that the integrity of this communication has been maintained
>nor that the communication is free of errors, virus, interception or 
>interference.


________________________________






NOTE: This message may contain information that is confidential, proprietary, 
privileged or otherwise protected by law. The message is intended solely for 
the named addressee. If received in error, please destroy and notify the 
sender. Any use of this email is prohibited when received in error. Impetus 
does not represent, warrant and/or guarantee, that the integrity of this 
communication has been maintained nor that the communication is free of errors, 
virus, interception or interference.

Reply via email to