Re: How to handle dynamic topologies

Crina Arsenie Fri, 06 Nov 2015 01:34:05 -0800

Thank you all for your input. @santosh I think we will continue using
Storm for the moment. As far as I can see, Samza does almost the same thing
in a different way, and their site even says that "Samza is pretty
immature." @saiprasad yes, maybe we can try using Groovy scripts!


Thanks again,
Crina
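
The script-per-transformation idea quoted below boils down to looking up a transformation by the value of a tuple field and being able to swap it at runtime. A minimal sketch of that dispatch, under loud assumptions: `TransformRegistry` and its methods are hypothetical names, plain Java lambdas stand in for compiled Groovy scripts, and nothing here uses the Storm or Groovy APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.DoubleUnaryOperator;

/**
 * Hypothetical registry a bolt could consult for each tuple.
 * Assumption: each transformation maps one numeric reading to another;
 * in a real setup the operators would come from loaded scripts.
 */
final class TransformRegistry {
    // Concurrent map so a reload thread can swap entries while tuples are processed.
    private final Map<String, DoubleUnaryOperator> byName = new ConcurrentHashMap<>();

    /** Adds a transformation, or replaces the one already stored under this name. */
    void register(String name, DoubleUnaryOperator transformation) {
        byName.put(name, transformation);
    }

    /** Applies the transformation selected by a tuple field value. */
    double apply(String name, double value) {
        DoubleUnaryOperator t = byName.get(name);
        if (t == null) {
            throw new IllegalArgumentException("no transformation registered for: " + name);
        }
        return t.applyAsDouble(value);
    }
}
```

A bolt would call `apply(tuple field, reading)` for every tuple, while a separate watcher calls `register` whenever a new script shows up in the predefined location. Whether that is cheaper to operate than redeploying a topology depends on how the scripts are distributed and validated.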

2015-11-05 17:39 GMT+01:00 saiprasad mishra <[email protected]>:

> You can explore creating a new groovy script for transformation every time
> and push it to a predefined location in all worker nodes and use it in a
> bolt to execute it based on some tuple value field. Spring can help execute
> the groovy script as a bean too.
>
> Regards
> Sai
>
> On Thu, Nov 5, 2015 at 7:05 AM, Santosh Pingale <[email protected]>
> wrote:
>
>> Apache Samza fits the requirements. You can give it a shot. But what
>> Nathan described can still take a big toll on overall performance,
>> depending on your DAG.
>> On Nov 5, 2015 6:25 PM, "Nathan Leung" <[email protected]> wrote:
>>
>>> Generally yes, the best case for output collector is passing a reference
>>> through some queues.  However, it's harder to reason about the performance
>>> of a larger topology, and (assuming you use reliable messaging) your entire
>>> topology can be held up by one poorly performing bolt.
>>> On Nov 5, 2015 7:31 AM, "Crina Arsenie" <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm interested in this topic too. Thank you for your answer.
>>>> I didn't know about Flux; maybe it could do the job in my case. I'll
>>>> take a look at it.
>>>> I also have a question about performance: I assume that passing tuples
>>>> through the output collector is faster than Kafka. What do you think?
>>>>
>>>> Thank you,
>>>>
>>>> Crina
>>>>
>>>> 2015-11-05 12:36 GMT+01:00 Nathan Leung <[email protected]>:
>>>>
>>>>> It's not possible to combine several topologies into one, but it
>>>>> should be possible to write different tuple sinks such that you can
>>>>> configure each bolt to write to either the output collector or Kafka. Then
>>>>> it's just a matter of wiring and configuring your bolts differently.
>>>>>
>>>>> You can use something like flux (
>>>>> http://storm.apache.org/documentation/flux.html) to change how your
>>>>> bolts are wired without having to rebuild your jar file every time.
>>>>> On Nov 5, 2015 4:17 AM, "Irina Alles" <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>>
>>>>>>
>>>>>> We are currently studying if storm would be appropriate as a part of
>>>>>> our monitoring system.
>>>>>>
>>>>>> We are measuring sensor data, we need to apply different
>>>>>> transformation steps and store it somewhere or send it further.
>>>>>>
>>>>>> This seems to be a basic use case for storm so far, let’s call this
>>>>>> setup topology A.
>>>>>>
>>>>>>
>>>>>> We would like to be able to add measured sensors and their
>>>>>> transformation steps (topology B) dynamically without requiring any
>>>>>> system downtime. These dynamic additions could happen frequently. It
>>>>>> will happen that bolts of topology B will require the output of
>>>>>> certain bolts in A.
>>>>>>
>>>>>> Is there a best practice to manage this in storm?
>>>>>>
>>>>>>
>>>>>>
>>>>>> After a certain runtime of the system we will have to manage several
>>>>>> topologies and we need to assure the communication between them.
>>>>>>
>>>>>> We thought about using Kafka for the communication between
>>>>>> topologies, but with the growing number of topologies this might not
>>>>>> be the best approach I suppose (‘best’ means in this case: easy to
>>>>>> handle, avoiding message overhead).
>>>>>>
>>>>>> Maybe it would be better to group some topologies to create a greater
>>>>>> one? How would one do this in storm (I’ve read about the swap feature,
>>>>>> but it doesn’t seem to be available yet)?
>>>>>>
>>>>>> Is there a better approach?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Irina
>>>>>>
>>>>>
>>>>
>>>>
>
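
Nathan's suggestion in the thread above (write different tuple sinks so that each bolt can be configured to write either to the output collector or to Kafka) can be sketched roughly as follows. This is only an illustration under stated assumptions: `TupleSink`, `CollectorSink`, `KafkaSink`, `TupleSinks`, `ScalingStep` and the `sink.type` / `sink.topic` keys are all hypothetical names, and the two sinks are in-memory stand-ins rather than wrappers around the real Storm or Kafka client APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical abstraction: where a processing step sends its results. */
interface TupleSink {
    void write(String key, String value);
}

/** Stand-in for emitting through the output collector (hand-off inside one topology). */
final class CollectorSink implements TupleSink {
    final List<String> emitted = new ArrayList<>();

    @Override
    public void write(String key, String value) {
        emitted.add(key + "=" + value);
    }
}

/** Stand-in for publishing to a Kafka topic; a real sink would wrap a producer. */
final class KafkaSink implements TupleSink {
    final String topic;
    final List<String> published = new ArrayList<>();

    KafkaSink(String topic) {
        this.topic = topic;
    }

    @Override
    public void write(String key, String value) {
        published.add(topic + ":" + key + "=" + value);
    }
}

final class TupleSinks {
    private TupleSinks() {}

    /** Chooses a sink from configuration; the keys are made up for this sketch. */
    static TupleSink fromConfig(Map<String, String> conf) {
        String type = conf.getOrDefault("sink.type", "collector");
        switch (type) {
            case "collector":
                return new CollectorSink();
            case "kafka":
                return new KafkaSink(conf.getOrDefault("sink.topic", "sensor-data"));
            default:
                throw new IllegalArgumentException("unknown sink.type: " + type);
        }
    }
}

/** The transformation logic stays the same whichever sink is wired in. */
final class ScalingStep {
    private final TupleSink sink;
    private final double factor;

    ScalingStep(TupleSink sink, double factor) {
        this.sink = sink;
        this.factor = factor;
    }

    void process(String sensorId, double reading) {
        sink.write(sensorId, Double.toString(reading * factor));
    }
}
```

With this shape, moving a step from in-topology hand-off to cross-topology hand-off is a configuration change rather than a code change, which is what makes it a natural fit for wiring through something like Flux.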
