We don't use Kafka as a broker between Storm components; it's an input from
another service. That said, you can easily get over 100k operations per
second from Kafka, and I'm sure that if you scaled your topic to a higher
number of partitions you could eclipse that. In the application I work on,
Kafka is not the bottleneck, and we haven't really had to tune it much.
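On the partition point: parallel consumption in Kafka is bounded by the
partition count, since each partition of a topic is read by at most one
consumer in a consumer group. A toy sketch of that rule (plain Python, just
a model; this is not real Kafka client code, and the round-robin scheme is a
simplification of Kafka's actual assignment strategies):

```python
# Toy model of Kafka's rule that each partition is consumed by at most one
# consumer in a group: consumers beyond the partition count sit idle, so
# adding partitions is what raises the parallelism ceiling.
def assign_partitions(num_partitions, consumers):
    """Round-robin partitions across consumers (simplified model)."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

# 4 partitions, 2 consumers: each consumer reads 2 partitions.
print(assign_partitions(4, ["c1", "c2"]))

# 2 partitions, 4 consumers: two consumers get nothing to read.
print(assign_partitions(2, ["c1", "c2", "c3", "c4"]))
```

So to push past a given throughput, you scale partitions first and then add
consumers (or spout tasks) up to that count.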

On Sun, Oct 19, 2014 at 9:34 AM, Klausen Schaefersinho <
[email protected]> wrote:

> Hi,
>
> I am not a big fan of large topologies. The simple reason is that they
> increase complexity and make it difficult to decouple components. Also,
> you cannot simply deploy or undeploy an aspect (sub-topology). So the
> approach of using an external broker seems more appealing.
>
> However, does anybody have experience with a broker-based setup? What is
> the performance penalty?
>
>
> Cheers,
>
> Klaus
>
> On Sun, Oct 19, 2014 at 3:27 PM, Nathan Leung <[email protected]> wrote:
>
>> I agree with Jungtaek, the preferred approach is to either merge the
>> topologies or use a broker such as Kafka.
>> On Oct 19, 2014 12:12 AM, "임정택" <[email protected]> wrote:
>>
>>> How about merging the topologies into one? Though the tuple timeout would
>>> have to be set to the maximum processing time across all of the
>>> topologies, it's the only way to make this work without adding other
>>> components.
>>>
>>> Btw, ideally, supporting pub-sub between topologies would be great, but
>>> AFAIK there are many hurdles to realizing it:
>>> 1. The subscribing spout should replay tuples when a failure occurs
>>> (per Storm's guaranteed message processing), but the publishing bolt
>>> can't help it do so.
>>> 2. The spout would need a way to receive data from a bolt (over TCP),
>>> which doesn't exist yet.
>>> 3. A spout retrieves data from its source when nextTuple() is called,
>>> which may not apply to a pub-sub situation.
>>> 4. Pub-sub spouts/bolts should allow dynamic task registration
>>> (maybe this already exists).
>>>
>>> So I also recommend adding a message queue (Kafka, RabbitMQ, etc.) between
>>> the topologies.
>>>
>>> Please correct me if I'm wrong.
>>>
>>> Regards.
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>> On Friday, October 17, 2014, Klausen Schaefersinho <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> In my Storm setup, data arrives in the form of files that I have to read
>>>> and emit in my spout. Also, my topology is very dynamic: some topologies
>>>> run quite long, whereas others are turned on and off frequently. In order
>>>> to avoid having n spouts reading from the files, I was wondering if I
>>>> could have just one topology in the cluster which reads from the files
>>>> and just emits tuples? All other topologies would then register with and
>>>> "listen" to that topology.
>>>>
>>>> Cheers,
>>>>
>>>> Klaus
>>>>
>>>
>>>
>>> --
>>> Name : 임 정택
>>> Blog : http://www.heartsavior.net / http://dev.heartsavior.net
>>> Twitter : http://twitter.com/heartsavior
>>> LinkedIn : http://www.linkedin.com/in/heartsavior
>>>
>>>
>