If you prefer not to use a separate main class for each topology, you can try something like Flux: https://github.com/ptgoetz/flux. Note that it is a work in progress. You will still need a YAML document for each topology.
With regard to code hierarchy, I find it good to have common libraries in a separate project or subproject. Related topologies that are part of the same high-level project can go in the same jar; this also lets you more easily reuse components like bolts with the declarative Storm topology. Topologies that are less related or unrelated can go in a separate jar.

On Apr 28, 2015 8:50 AM, "Sandon Jacobs" <[email protected]> wrote:

> My company is using storm for various stream-processing solutions, mostly
> ingesting data from Kafka topics. We have chosen to implement our
> topologies in Scala, using APIs like Tormenta and Summingbird in the mix as
> well. We have about 9-10 topologies running in production as we speak.
>
> I find tons of useful information about Storm in general, but VERY little
> about how folks are managing the deployment, git repos, etc.
>
> Currently we have all of these topologies in the same GIT repo, with a
> main-class for each topology, allowing us to run them locally or remotely.
> Some of this code shares common components - we try to reuse some bolts we
> have written, and other dependencies cross topologies as well.
>
> So in our CI environment, we build an assembly jar using SBT containing
> all topologies and use storm jar command to deploy that jar N-times (N =
> number of topologies). We have functional tests that are run by Jenkins
> after each topology deployment to exercise the functionality of said
> topology. Given the number of topologies in our catalog, this is starting
> to become cumbersome in the current state, with the feedback loop from git
> push thru deployment-test getting longer and more unwieldy. The whole thing
> is starting to remind me too much of my Java EE container days with
> multiple EAR files or WAR files deployed in a cluster of WebSphere boxes
> (UGH!!!).
>
> I say all of that to frame the question of how folks are managing a
> similar situations/deployments. There has been some thought around breaking
> up the git repo into multiple repos. Or maybe a git repo with a parent SBT
> project, with subproject(s) for common components and 1 subproject per
> topology.
>
> I am interested to hear any thoughts or be pointed to any resources that
> have been helpful to others.
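
To make the layout I described a bit more concrete, here is a rough build.sbt sketch with one subproject for common components and separate subprojects for groups of topologies. Every project name, directory, and the organization below is made up for illustration, the Scala version is a placeholder, and Storm and Kafka dependencies are left out; treat it as a starting point rather than a recommended build.

```scala
// build.sbt: a sketch only. All names and paths here are hypothetical.

// Settings shared by every subproject.
lazy val commonSettings = Seq(
  organization := "com.example",  // placeholder
  scalaVersion := "2.11.12"       // placeholder; use whatever your topologies target
)

// Shared bolts and helpers, reused by the topology jars.
lazy val common = (project in file("common"))
  .settings(commonSettings: _*)

// Related topologies from the same high-level project share one jar.
lazy val ingestTopologies = (project in file("ingest-topologies"))
  .dependsOn(common)
  .settings(commonSettings: _*)

// A less related topology gets its own subproject and its own jar.
lazy val reportingTopology = (project in file("reporting-topology"))
  .dependsOn(common)
  .settings(commonSettings: _*)

// Aggregating root, so running a task at the top level runs it everywhere.
lazy val root = (project in file("."))
  .aggregate(common, ingestTopologies, reportingTopology)
```

Assuming an assembly task is available (for example from the sbt-assembly plugin), something like `sbt ingestTopologies/assembly` would then build only that group's jar, so CI could deploy and test just the topologies affected by a push instead of all of them.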
