This is an option. I have a few concerns about it:
- There would be a lot of repos, which would be messy to manage and might make it harder for users to find the spout they need. I expect more than ten (different services times different languages).
- There would be some duplicated code, such as build/release configs, scripts, etc.

I think we should be able to achieve the first reason with a single repo. Different spouts would likely live in different folders, and they can evolve separately. The second reason is valid, but duplicated code is a side effect. The third reason depends on the build tool, I feel. Bazel is powerful, but it keeps changing from time to time. :(

Just my two cents.

On Wed, Jan 16, 2019 at 8:09 PM Simon Weng <siming.w...@gmail.com> wrote:

> Hi, all:
>
> Can it also be one of the options to even have a separate repo for each type
> of spout? The reasons it is worth considering are:
>
> 1. Allow each spout to evolve and release at a different pace, because each
> is technically driven by external source software. For example, the
> community may need different versions of the Kafka Spout to be compatible
> with their deployed Kafka cluster in production.
> 2. Allow each spout project to use the de facto build tool that suits the
> external SDK best. This will help to minimize the learning curve for
> contributors who specialize in different source software stacks.
> 3. Simplify the maintenance of the build and CI.
>
> I’m not familiar with the capabilities of Bazel, so I’m certainly not
> against it. If it can help to achieve some of the above, I guess one single
> repo will also work then.
>
> SiMing
>
> On Wed, Jan 16, 2019 at 5:34 PM Ning Wang <wangnin...@gmail.com> wrote:
>
>> +Siming
>>
>> On Tue, Jan 15, 2019 at 11:35 PM Ning Wang <wangnin...@gmail.com> wrote:
>>
>>> Hi, all,
>>>
>>> A few of us (Spencer, Saikat, Siming, Karthik, Josh, Sree) discussed
>>> today in our general Slack channel that we should have spout code
>>> somewhere so that people can reuse it (spouts are highly reusable in
>>> general) and contribute improvements. This is just a recap of the idea and
>>> some updates.
>>>
>>> We have two options:
>>> 1. Add a spouts/ dir in the Heron project.
>>> 2. Create a new project on GitHub.
>>>
>>> Option 1 is easy to start. But the iteration and release will be
>>> coupled with the Heron project itself. It is likely there will be quite a
>>> lot of activity around spouts from time to time as new spouts are added.
>>> Also, Heron itself is basically the engine plus APIs and tooling, while
>>> there could be quite a few spouts in the future with many new dependencies
>>> such as Kafka, pubsub, neo4j, neptune, etc. It is debatable whether spout
>>> implementations belong in the Heron project, and these extra dependencies
>>> could add some unnecessary complexity.
>>>
>>> Option 2 requires some work up front, but it will be much
>>> easier to manage and evolve. And there will be fewer concerns about new
>>> spouts (in different languages) and dependencies, because spouts are
>>> relatively independent of each other and we may generate artifacts per
>>> spout.
>>>
>>> Overall, most people prefer option 2 for its cleanness.
>>>
>>> I talked with the Twitter OSS team. They are happy to support the
>>> initiative and suggested that we check with the Apache team to see what
>>> the best process is. The first question is: should this new side project
>>> be under Apache or not? This might be a question for the mentors. What do
>>> you think/suggest?
>>>
>>> Another topic being discussed is the build tool, in case we decide to
>>> create a new side project. Maven is more mature for sure, but we will
>>> likely need multi-language support, so currently Bazel seems to be the
>>> winner (I personally vote for Bazel 1.0 because the backward compatibility
>>> has been bad so far).
>>>
>>> Any ideas or suggestions, please feel free to reply.
>>>
>>> Regards,
>>> --ning
>>>
>> --
> Sent from Gmail Mobile