Re: How to handle dynamic topologies

saiprasad mishra Thu, 05 Nov 2015 08:40:36 -0800

You could create a new Groovy script for each new transformation, push it
to a predefined location on all worker nodes, and have a bolt execute it
based on the value of some tuple field. Spring can also help here, since it
can expose a Groovy script as a bean.
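
As a rough sketch of the dispatch part only (plain Java, no Storm or Groovy
dependency): the class and method names below are hypothetical, and a plain
Function stands in for a compiled script. Whatever watches the script
location would call register() after loading a new or changed script, and
the bolt would call apply() with the value of the chosen tuple field.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper, not part of Storm: maps a tuple field value to a
// transformation that can be replaced while the topology keeps running.
final class TransformRegistry {
    private final Map<String, Function<Object, Object>> transforms =
            new ConcurrentHashMap<>();

    // Install or replace the transformation for a key. Assumption: in a real
    // bolt this would run after (re)compiling a script pushed to the worker.
    void register(String key, Function<Object, Object> transform) {
        transforms.put(key, transform);
    }

    // Apply the transformation selected by the tuple's key field.
    // Values with no registered transformation pass through unchanged.
    Object apply(String key, Object value) {
        return transforms.getOrDefault(key, Function.identity()).apply(value);
    }
}
```

Because the lookup happens per tuple, a replaced transformation takes effect
on the next tuple without redeploying the topology.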


Regards
Sai

On Thu, Nov 5, 2015 at 7:05 AM, Santosh Pingale <[email protected]>
wrote:

> Apache Samza fits the requirements, so you could give it a shot. The issues
> Nathan described can still take a big toll on overall performance, though
> that depends on your DAG.
> On Nov 5, 2015 6:25 PM, "Nathan Leung" <[email protected]> wrote:
>
>> Generally yes, the best case for the output collector is passing a reference
>> through some queues. However, it's harder to reason about the performance
>> of a larger topology, and (assuming you use reliable messaging) your entire
>> topology can be held up by one poorly performing bolt.
>> On Nov 5, 2015 7:31 AM, "Crina Arsenie" <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> I'm also interested in this topic. Thank you for your answer.
>>> I didn't know about Flux; maybe it could do the job in my case. I'll
>>> take a look at it.
>>> I also have a question about performance: I assume that passing tuples
>>> through the output collector is faster than going through Kafka. What do
>>> you think?
>>>
>>> Thank you,
>>>
>>> Crina
>>>
>>> 2015-11-05 12:36 GMT+01:00 Nathan Leung <[email protected]>:
>>>
>>>> It's not possible to combine several topologies into one, but it should
>>>> be possible to write different tuple sinks such that you can configure each
>>>> bolt to write to either the output collector or Kafka. Then it's just a
>>>> matter of wiring and configuring your bolts differently.
>>>>
>>>> You can use something like Flux (
>>>> http://storm.apache.org/documentation/flux.html) to change how your
>>>> bolts are wired without having to rebuild your jar file every time.
>>>> On Nov 5, 2015 4:17 AM, "Irina Alles" <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>>
>>>>>
>>>>> We are currently studying whether Storm would be appropriate as part of
>>>>> our monitoring system.
>>>>>
>>>>> We are measuring sensor data, and we need to apply different
>>>>> transformation steps and then store it somewhere or send it on.
>>>>>
>>>>> This seems to be a basic use case for Storm so far; let’s call this
>>>>> setup topology A.
>>>>>
>>>>>
>>>>> We would like to be able to add measured sensors and their
>>>>> transformation steps (topology B) dynamically without requiring any system
>>>>> downtime. These dynamic additions could happen frequently. Some bolts of
>>>>> topology B will require the output of certain bolts in A.
>>>>>
>>>>> Is there a best practice to manage this in Storm?
>>>>>
>>>>>
>>>>>
>>>>> After the system has been running for a while we will have to manage
>>>>> several topologies, and we need to ensure communication between them.
>>>>>
>>>>> We thought about using Kafka for the communication between topologies,
>>>>> but with the growing number of topologies this might not be the best
>>>>> approach I suppose (‘best’ means in this case: easy to handle, avoiding
>>>>> message overhead).
>>>>>
>>>>> Maybe it would be better to group some topologies to create a larger
>>>>> one? How would one do this in Storm (I’ve read about the swap feature, but
>>>>> it doesn’t seem to be available yet)?
>>>>>
>>>>> Is there a better approach?
>>>>>
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Irina
>>>>>
>>>>
>>>
>>>
