Hello Storm development community,

I'm new to this list. I'd like to start a discussion, or join one if it's already under way, on defining a Storm (or Trident) topology with a declarative definition instead of custom Java code.

I've worked with several different Storm topologies here at Yahoo. They've all been moderately complex, with multiple custom spouts and bolts, each requiring lots of configuration. Early on I realized that one of the more painful parts of managing a topology is the topology builder. Yes, if you have a relatively fixed topology, writing a bunch of Java to set everything up isn't too bad. But we routinely need to handle variations of our topology (smoke test, staging, production), and we also often want to add or remove components for various ad-hoc tests. Tweaking the topology builder for each case was painful.

I decided it would be really helpful to specify a topology in a declarative form that could be easily edited. Hence the not very creatively named "TopoLoader," which we use for a number of Storm topologies at Yahoo.

With TopoLoader, you write a YAML file that defines all of the spouts and bolts of your topology, any configuration parameters needed for each, and how they all connect together. TopoLoader provides the "main" in your jar. When you submit to Storm, TopoLoader reads the YAML, builds the topology, and submits it. Yes, you need to provide constructors or builders that know how to configure each of your modules from the YAML content, but that's relatively painless. A colleague of mine has also extended TopoLoader to deal with Trident topologies.
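
To make the idea concrete, here is a minimal sketch in Java of turning a declarative definition into wiring steps. It is an illustration only, not TopoLoader's code or its YAML format: the class TopoSketch, the Component fields, and the com.example class names are all hypothetical, and an in-memory map stands in for the parsed YAML so the sketch has no Storm or YAML-library dependency. A real loader would call Storm's TopologyBuilder (setSpout, setBolt, and a grouping) where this one only records a step.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from TopoLoader or Storm.
class TopoSketch {

    // One declared component: a spout (no inputs) or a bolt (one or more inputs).
    static final class Component {
        final String id;
        final String className;   // class a real loader would instantiate
        final int parallelism;
        final List<String> inputs;

        Component(String id, String className, int parallelism, List<String> inputs) {
            this.id = id;
            this.className = className;
            this.parallelism = parallelism;
            this.inputs = inputs;
        }
    }

    // Stand-in for the parsed YAML file: component id -> its settings.
    static Map<String, Component> exampleDefinition() {
        Map<String, Component> def = new LinkedHashMap<>();
        def.put("events", new Component("events", "com.example.EventSpout", 2, List.of()));
        def.put("parse", new Component("parse", "com.example.ParseBolt", 4, List.of("events")));
        def.put("store", new Component("store", "com.example.StoreBolt", 1, List.of("parse")));
        return def;
    }

    // Walk the definition in declaration order and list the wiring steps.
    // Rejects a bolt whose input is not defined anywhere in the definition.
    static List<String> plan(Map<String, Component> def) {
        List<String> steps = new ArrayList<>();
        for (Component c : def.values()) {
            if (c.inputs.isEmpty()) {
                steps.add("spout " + c.id + " x" + c.parallelism);
                continue;
            }
            for (String in : c.inputs) {
                if (!def.containsKey(in)) {
                    throw new IllegalArgumentException(
                        c.id + " reads from undefined component " + in);
                }
            }
            steps.add("bolt " + c.id + " x" + c.parallelism
                + " <- " + String.join(",", c.inputs));
        }
        return steps;
    }

    public static void main(String[] args) {
        plan(exampleDefinition()).forEach(System.out::println);
    }
}
```

In this sketch, dropping a component for an ad-hoc test means deleting one entry from the definition rather than editing builder code, and the input check catches a bolt left pointing at a component that was removed.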

I'd be interested in contributing this to the Storm project. Being new, I'm totally ignorant of the process. If there's interest in this, I'd like to get things rolling, and pointers on where to start would be appreciated. (Maybe a first step would be to upload the document about how all of this works.)

I'm looking forward to a discussion.

David Willcox, Yahoo
