With the 1.1.0 release nearing completion, I’d like to turn our attention to 2.0 and develop a plan for which features and other changes to include.

The following three items are what I consider the minimum for a 2.0 release. They could likely be resolved relatively quickly:

* Performance: I haven’t benchmarked the master branch against 1.0.x or 1.1.x in a while, but I feel it is important to make sure there are no performance regressions, and I would hope that we actually see a performance improvement over previous versions. To that end (e.g. if there is in fact a performance regression), the proposals that Roshan Naik put together for revising the threading and execution model (STORM-2307) and replacing Disruptor with JCTools (STORM-2306) warrant review and consideration. See also STORM-2284, which is the parent JIRA.
* Finish porting Storm UI to Java (STORM-1311)
* Finish porting log viewer to Java (STORM-1280)

The following items are nice to have in 2.0, but I don’t feel they are absolutely necessary for an initial 2.0 release:

* Beam Runner (I wouldn’t tie this to 2.0; I’m mentioning it because it’s relevant): Initially there seemed to be a lot of interest in this, but that seems to have trailed off. I spoke with some Beam developers, and there seems to be interest from that community as well. Do we want to move that effort to the Beam community, or keep it here? Moving it to the Beam community might lead to better collaboration between the projects.
* Bounded Spouts (needed for the Beam Runner implementation): Currently spouts are unbounded; there is no end to the stream. Beam has the concept of bounded sources (roughly analogous to batch processing). To support that, we would need to implement a similar concept in Storm. One benefit of such a feature would be the ability to handle both bounded and unbounded workflows in Storm.
* Storm-SQL: Jungtaek/Xin, you have been the primary drivers behind this effort. What improvements do you envision for 2.0?
* Metrics V2 (STORM-2153: Coda Hale Metrics): I’ve been targeting this for 1.2.0, but it’s designed to be easily portable to master/2.0.
* JStorm Migration: The original outline can be found here [1]. Note that many of the associated JIRAs below are assigned, but there hasn’t been any recent activity or pull requests, so we should probably consider them unassigned and up for grabs:
  * Worker Classloader Isolation (STORM-1338): Lack of this has been the bane of a lot of Storm users almost since day one. We have largely worked around it by shading/relocating dependencies. It would be great to see this addressed once and for all.
  * JStorm back pressure implementation (STORM-1324): The current back pressure implementation leaves a bit to be desired, and the JStorm approach looks promising, though it also depends on the JStorm concept of “topology master” (STORM-1323), which may have some implications regarding security.
  * Dynamic Topology Updates (STORM-1335): This would provide a command to update topology jars and configuration without stopping the topology, and is well suited to leverage the blobstore. The restart command (which can also update the topology configuration) also looks compelling (STORM-1334).
  * Additional Scheduler Implementations (STORM-1320)
  * Additional Grouping Implementations (STORM-1328)

As always, I’m open to any opinions and suggestions.

-Taylor

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109