Hi Taylor,

I am very glad to see the interest in pushing the Beam Storm runner forward.
However, I cannot convince myself of the benefits of having one runner to support all. Beam has three types of users: pipeline writers, library writers, and runner implementers. I see the pros and cons as follows:

Pros:
1. For pipeline writers and library writers: I don't see any benefit, because they use the Beam API directly.
2. For runner implementers: (I am not that familiar with the current similarities and differences between Storm and JStorm; maybe you can help me fill this in.)

Cons:
For pipeline writers and library writers:
1. It delays delivery. We already have a working prototype, and many JStorm users eagerly want a JStorm API.
2. "One runner to support all" may increase complexity and compromise the quality of the runner.

From my point of view, the cons clearly outweigh the pros, unless I am missing something. Let me know what you think.

Thanks
--
Pei

On Tue, Apr 11, 2017 at 1:47 AM, P. Taylor Goetz <ptgo...@apache.org> wrote:

> Note: cross-posting to dev@beam and dev@storm
>
> I’ve seen at least two threads on the dev@ list discussing the JStorm
> runner and my hope is we can expand on that discussion and cross-pollinate
> with the Storm/JStorm/Beam communities as well.
>
> A while back I created a very preliminary proof of concept of getting a
> Storm Beam runner working [1]. That was mainly an exercise for me to
> familiarize myself with the Beam API and discover what it would take to
> develop a Beam runner on top of Storm. That code is way out of date (I was
> targeting Beam’s HEAD before the 0.2.0 release, and a lot of changes have
> since taken place) and didn’t really work as Jian Liu pointed out. It was a
> start, that perhaps could be further built upon, or parts harvested, etc. I
> don’t have any particular attachment to that code and wouldn’t be upset if
> it were completely discarded in favor of a better or more extensible
> implementation.
>
> What I would like to see, and I think this is a great opportunity to do
> so, is a closer collaboration between the Apache Storm and JStorm
> communities. For those who aren’t familiar with those projects’
> relationship, I’ll start with a little history…
>
> JStorm began at Alibaba as a fork of Storm (pre-Apache?) with Storm’s
> Clojure code reimplemented in Java. The rationale behind that move was that
> Alibaba had a large number of Java developers but very few who were
> proficient with Clojure. Moving to pure Java made sense as it would expand
> the base of potential contributors.
>
> In late 2015 Alibaba donated the JStorm codebase to the Apache Storm
> project, and the Apache Storm PMC committed to converting its Clojure code
> to Java in order to incorporate the code donation. At the time there was
> one catch: Apache Storm had implemented comprehensive security features
> such as Kerberos authentication/authorization and multi-tenancy in its
> Clojure code, which greatly complicated the move to Java and incorporation
> of the JStorm code. JStorm did not have the same security features. A
> number of JStorm developers have also become Storm PMC members.
>
> Fast forward to today. The Storm community has completed the bulk of the
> move to Java and the next major release (presumably 2.0, which is currently
> under discussion) will be largely Java-based. We are now in a much better
> position to begin incorporating JStorm’s features, as well as implementing
> new features necessary to support the Beam API (such as support for bounded
> pipelines, among other features).
>
> Having separate Apache Storm and JStorm beam runner implementations
> doesn’t feel appropriate in my personal opinion, especially since both
> projects have expressed an ongoing commitment to bringing JStorm’s
> additional features, and just as important, community, to Apache Storm.
>
> One final note, when the Storm community initially discussed developing a
> Beam runner, the general consensus was to do so within the Storm repository.
> My current thinking is that such an effort should take place within the
> Beam community, not only since that is the development pattern followed by
> other runner implementations (Flink, Apex, etc.), but also because it would
> serve to increase collaboration between Apache projects (always a good
> thing!).
>
> I would love to hear opinions from others in the Storm/JStorm/Beam
> communities.
>
> -Taylor