One option to explore: write a new Groovy script for each transformation, push it to a predefined location on all worker nodes, and have a bolt pick the script to execute based on the value of some field in the tuple. Spring can also help here, since it can expose a Groovy script as a bean.
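
To make the dispatch part concrete, here is a rough sketch in plain Java. It is only an illustration under assumptions: `ScriptDispatchSketch`, `Transformation`, `ScriptLoader` and the script name `scaleBy10` are all hypothetical names, not Storm or Groovy APIs. The loader stands in for whatever reads and compiles the Groovy script from the shared location, and `transform` is roughly what the bolt's execute method would do with the tuple field.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch only: every name in this class is hypothetical, not a Storm or Groovy API.
public class ScriptDispatchSketch {

    /** Stand-in for a compiled transformation script. */
    public interface Transformation extends Function<Double, Double> {}

    /** Stand-in for the step that reads and compiles a script from the shared location. */
    public interface ScriptLoader {
        Transformation load(String scriptName);
    }

    private final ScriptLoader loader;
    private final Map<String, Transformation> cache = new ConcurrentHashMap<>();

    public ScriptDispatchSketch(ScriptLoader loader) {
        this.loader = loader;
    }

    /** Picks the script named by a tuple field, loading it once, and applies it to the value. */
    public double transform(String scriptName, double value) {
        Transformation t = cache.computeIfAbsent(scriptName, loader::load);
        return t.apply(value);
    }

    /** Drops a cached script so the next tuple picks up a newly pushed version. */
    public void invalidate(String scriptName) {
        cache.remove(scriptName);
    }

    public static void main(String[] args) {
        // Fake loader: a real one would compile a script file instead.
        ScriptLoader fakeLoader = name -> {
            if (name.equals("scaleBy10")) {
                return v -> v * 10;
            }
            throw new IllegalArgumentException("No script named " + name);
        };
        ScriptDispatchSketch bolt = new ScriptDispatchSketch(fakeLoader);
        System.out.println(bolt.transform("scaleBy10", 1.5));
    }
}
```

Caching the loaded script per name keeps the per-tuple cost low, and calling `invalidate` after a new script version is pushed is one possible way to pick up changes without restarting the topology.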

Regards
Sai

On Thu, Nov 5, 2015 at 7:05 AM, Santosh Pingale <[email protected]> wrote:

> Apache Samza fits the requirements. You can give it a shot. But what
> Nathan described can still take a big toll on overall performance, though
> that depends on your DAG.
> On Nov 5, 2015 6:25 PM, "Nathan Leung" <[email protected]> wrote:
>
>> Generally yes, the best case for output collector is passing a reference
>> through some queues. However, it's harder to reason about the performance
>> of a larger topology, and (assuming you use reliable messaging) your entire
>> topology can be held up by one poorly performing bolt.
>> On Nov 5, 2015 7:31 AM, "Crina Arsenie" <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> I'm interested in this topic also. Thank you for your answer.
>>> I didn't know about Flux, maybe it could do the job for my case, I'll
>>> take a look at it.
>>> I also have a question about performance. I assume that passing through
>>> the output collector is faster than Kafka, what do you think?
>>>
>>> Thank you,
>>>
>>> Crina
>>>
>>> 2015-11-05 12:36 GMT+01:00 Nathan Leung <[email protected]>:
>>>
>>>> It's not possible to combine several topologies into one, but it should
>>>> be possible to write different tuple sinks such that you can configure each
>>>> bolt to write to either the output collector or Kafka. Then it's just a
>>>> matter of wiring and configuring your bolts differently.
>>>>
>>>> You can use something like flux (
>>>> http://storm.apache.org/documentation/flux.html) to change how your
>>>> bolts are wired without having to rebuild your jar file every time.
>>>> On Nov 5, 2015 4:17 AM, "Irina Alles" <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are currently studying if storm would be appropriate as a part of
>>>>> our monitoring system.
>>>>>
>>>>> We are measuring sensor data, we need to apply different
>>>>> transformation steps and store it somewhere or send it further.
>>>>>
>>>>> This seems to be a basic use case for storm so far, let’s call this
>>>>> setup topology A.
>>>>>
>>>>> We would like to be able to add measured sensors and their
>>>>> transformation steps (topology B) dynamically without requiring any system
>>>>> downtime. These dynamic additions could happen frequently. It will happen
>>>>> that bolts of topology B will require the output of certain bolts in A.
>>>>>
>>>>> Is there a best practice to manage this in storm?
>>>>>
>>>>> After a certain runtime of the system we will have to manage several
>>>>> topologies and we need to assure the communication between them.
>>>>>
>>>>> We thought about using Kafka for the communication between topologies,
>>>>> but with the growing number of topologies this might not be the best
>>>>> approach I suppose (‘best’ means in this case: easy to handle, avoiding
>>>>> message overhead).
>>>>>
>>>>> Maybe it would be better to group some topologies to create a greater
>>>>> one? How would one do this in storm (I’ve read about the swap feature, but
>>>>> it doesn’t seem to be available yet)?
>>>>>
>>>>> Is there a better approach?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Irina
>>>>>
>>>>
>>>
