Hi, all: Could having a separate repo for each type of spout also be one of the options? The reasons it is worth considering are:
1. Allow each spout to evolve and release at its own pace, because each is technically driven by external source software. For example, the community may need different versions of the Kafka Spout to be compatible with the Kafka cluster they have deployed in production.
2. Allow each spout project to use the de facto build tool that best suits the external SDK. This will help minimize the learning curve for contributors who specialize in different source software stacks.
3. Simplify the maintenance of the build and CI.

I’m not familiar with the capabilities of Bazel, so I’m certainly not against it. If it can help achieve some of the above, I guess a single repo would also work.

SiMing

On Wed, Jan 16, 2019 at 5:34 PM Ning Wang <[email protected]> wrote:

> +Siming
>
> On Tue, Jan 15, 2019 at 11:35 PM Ning Wang <[email protected]> wrote:
>
>> Hi, all,
>>
>> A few of us (Spencer, Saikat, Siming, Karthik, Josh, Sree) discussed
>> today in our general slack channel that we should have spouts code
>> somewhere so that people can reuse them (spouts are highly reusable in
>> general) and contribute improvements. This is just a recap of the idea and
>> some updates.
>>
>> We have two options:
>> 1. add a spouts/ dir in heron project.
>> 2. create a new project in github.
>>
>> For option 1, it is easy to start. But the iteration and release will be
>> coupled with the Heron project itself. It is likely there will be quite a
>> lot of activity around spouts from time to time when new spouts are added.
>> Also, Heron itself is basically the engine plus APIs and tooling, while
>> there could be quite a few spouts in future with many new dependencies like
>> Kafka, pubsub, neo4j and neptune, etc. It is debatable whether spout
>> implementations belong in the Heron project, and these extra dependencies
>> could add some unnecessary complexity.
>>
>> For option 2, there will be some work up front, but it will be much
>> easier to manage and evolve. And there will be fewer concerns about new
>> spouts (in different languages) and dependencies because spouts are
>> relatively independent of each other and we may generate artifacts per
>> spout.
>>
>> Overall most people prefer option 2 for its cleanness.
>>
>> I talked with the Twitter OSS team. They are happy to support the initiative
>> and suggest that we check with the Apache team and see what the best process
>> is. The first question is: should this new side project be under Apache or
>> not? This might be a question for the mentors. What do you think/suggest?
>>
>> Another topic being discussed is the build tool in case we decide to
>> create a new side project. Maven is more mature for sure, but we will
>> likely need multi-language support, so currently Bazel seems to be the
>> winner (I personally vote for Bazel 1.0 because the backward compatibility
>> has been bad so far).
>>
>> Any ideas or suggestions, please feel free to reply.
>>
>> Regards,
>> --ning
>>
>

--
Sent from Gmail Mobile
