Hello Storm development community. I'm new to this list.
I'd like to start a discussion, or join one if it's already underway, about 
defining a Storm (or Trident) topology declaratively instead of in custom 
Java code.
I've worked with several different Storm topologies here at Yahoo. They've all 
been moderately complex, with multiple custom spouts/bolts, each requiring lots 
of configuration. Early on I realized that one of the more painful parts of 
managing a topology is the topology builder. Yes, if you have a relatively 
fixed topology, writing a bunch of Java to set everything up isn't too bad. But 
we routinely need to handle variations of our topology (smoke test, staging, 
production), and we often want to add or remove components for various 
ad-hoc tests. Tweaking the topology builder for each case was painful.
I decided it would be really helpful to specify a topology in a declarative 
form that could be easily edited. Hence the not very creatively named 
"TopoLoader." We use it for a number of Storm topologies at Yahoo. With 
TopoLoader, you describe your topology in a YAML file that lists all of its 
spouts and bolts, the configuration parameters each one needs, and how they 
all connect together. TopoLoader provides the "main" in your jar. When you 
submit the jar to Storm, TopoLoader reads the YAML, builds the topology, and 
submits it to Storm.
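
To make the read-then-build step concrete, here is a minimal sketch in plain 
Java. It is not TopoLoader's code, and the YAML layout in the comment is 
invented for illustration; this message does not show the real format. The 
class name TopoSketch, the keys "name", "class", "config" and "inputs", and the 
com.example classes are all assumptions. The sketch starts from the list of 
maps a YAML parser would typically return and only validates and describes the 
wiring; a real loader would call Storm's TopologyBuilder at that point.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical YAML this sketch assumes (not TopoLoader's real format):
//
//   - name: events
//     class: com.example.EventSpout
//   - name: counter
//     class: com.example.CountBolt
//     config: {windowSeconds: 30}
//     inputs: [events]
public class TopoSketch {

    /** One spout or bolt as declared in the hypothetical YAML file. */
    static final class Component {
        final String name;
        final String className;
        final Map<String, Object> config;
        final List<String> inputs; // upstream component names; empty for a spout

        Component(String name, String className,
                  Map<String, Object> config, List<String> inputs) {
            this.name = name;
            this.className = className;
            this.config = config;
            this.inputs = inputs;
        }

        boolean isSpout() {
            return inputs.isEmpty();
        }
    }

    /** Converts parsed YAML (a list of maps) into validated declarations. */
    @SuppressWarnings("unchecked")
    static Map<String, Component> load(List<Map<String, Object>> parsed) {
        Map<String, Component> byName = new LinkedHashMap<>();
        for (Map<String, Object> entry : parsed) {
            String name = (String) entry.get("name");
            String className = (String) entry.get("class");
            if (name == null || className == null) {
                throw new IllegalArgumentException("component needs 'name' and 'class'");
            }
            Map<String, Object> config = (Map<String, Object>)
                    entry.getOrDefault("config", Collections.emptyMap());
            List<String> inputs = (List<String>)
                    entry.getOrDefault("inputs", Collections.emptyList());
            if (byName.put(name, new Component(name, className, config, inputs)) != null) {
                throw new IllegalArgumentException("duplicate component: " + name);
            }
        }
        // Every declared input must refer to a component that exists.
        for (Component c : byName.values()) {
            for (String input : c.inputs) {
                if (!byName.containsKey(input)) {
                    throw new IllegalArgumentException(c.name + " reads from unknown " + input);
                }
            }
        }
        return byName;
    }

    /** Lists the wiring; a real loader would call TopologyBuilder here instead. */
    static List<String> describe(Map<String, Component> components) {
        List<String> lines = new ArrayList<>();
        for (Component c : components.values()) {
            lines.add(c.isSpout()
                    ? "spout " + c.name
                    : "bolt " + c.name + " <- " + String.join(",", c.inputs));
        }
        return lines;
    }
}
```

With a layout along these lines, switching between a smoke-test and a 
production variant means editing or swapping the YAML file, with no Java 
change.
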
Yes, you need to provide constructors or builders that know how to configure 
each of your modules from the YAML content, but that's relatively painless.
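
One way such a constructor could look, as a sketch only: the convention below 
(a public constructor taking the component's parsed config map, found by 
reflection) is an assumption, not necessarily how TopoLoader does it. 
ConfiguredSketch, CountBolt and windowSeconds are hypothetical names, and 
CountBolt is a plain class standing in for a real Storm bolt.

```java
import java.util.Map;

public class ConfiguredSketch {

    /** Hypothetical module that configures itself from its YAML config block. */
    public static final class CountBolt {
        public final int windowSeconds;

        public CountBolt(Map<String, Object> config) {
            // Fall back to a default when the YAML omits the parameter.
            Object value = config.getOrDefault("windowSeconds", 60);
            this.windowSeconds = ((Number) value).intValue();
        }
    }

    /** Instantiates a module by class name, handing it the parsed config map. */
    static Object instantiate(String className, Map<String, Object> config) {
        try {
            return Class.forName(className)
                    .getConstructor(Map.class)
                    .newInstance(config);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot construct " + className, e);
        }
    }
}
```
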
A colleague of mine has also extended TopoLoader to handle Trident 
topologies.

I'd be interested in contributing this to the Storm project. Being new, I'm 
totally ignorant of the process. If there's interest in this, I'd like to get 
things rolling. Pointers on where to start would be appreciated.
I'm looking forward to a discussion.
(Maybe a first step would be to upload the document about how all of this 
works.)

David Willcox, Yahoo
