Re: Heron Spouts Code

Ning Wang Wed, 16 Jan 2019 23:27:01 -0800

This is an option. I have a few concerns about it:
- There will be a lot of repos, which will be messy to manage and might
make it harder for users to find a given spout. I am expecting more than ten
(different services times different languages).
- There will be some duplicated code, such as build/release configs,
scripts, etc.


I think we should be able to achieve the first goal with a single repo.
Different spouts would likely live in different folders, and they can evolve
separately.
The second reason is valid, but duplicated code is a side effect.
The third reason depends on the build tool, I feel. Bazel is powerful, but it
keeps changing from time to time. :(

Just my two cents.

On Wed, Jan 16, 2019 at 8:09 PM Simon Weng <[email protected]> wrote:

> Hi, all:
>
> Could one of the options also be to have a separate repo for each type
> of spout? The reasons it is worth considering are:
>
> 1. Allow each spout to evolve and release at its own pace, because each
> is technically driven by external source software. For example, the
> community may need different versions of the Kafka Spout to be compatible
> with the Kafka clusters they have deployed in production.
> 2. Allow each spout project to use the de facto build tool that suits the
> external SDK best. This will help to minimize the learning curve for
> contributors who specialize in different source software stacks.
> 3. Simplify the maintenance of the build and CI.
>
> I’m not familiar with the capability of Bazel, so certainly I’m not
> against it. If it can help to achieve some of the above, I guess one single
> repo will also work then.
>
> SiMing
>
> On Wed, Jan 16, 2019 at 5:34 PM Ning Wang <[email protected]> wrote:
>
>> +Siming
>>
>> On Tue, Jan 15, 2019 at 11:35 PM Ning Wang <[email protected]> wrote:
>>
>>> Hi, all,
>>>
>>> A few of us (Spencer, Saikat, Siming, Karthik, Josh, Sree) discussed
>>> today in our general slack channel that we should have spouts code
>>> somewhere so that people can reuse them (spouts are highly reusable in
>>> general) and contribute improvements. This is just a recap of the idea and
>>> some updates.
>>>
>>> We have two options:
>>> 1. Add a spouts/ dir in the Heron project.
>>> 2. Create a new project on GitHub.
>>>
>>> For option 1, it is easy to start. But the iteration and release will be
>>> coupled with the Heron project itself. There will likely be quite a lot of
>>> activity around spouts from time to time as new spouts are added. Also,
>>> Heron is basically the engine plus APIs and tooling, while there could be
>>> quite a few spouts in the future with many new dependencies like
>>> Kafka, pubsub, neo4j and neptune, etc. It is debatable whether spout
>>> implementations belong in the Heron project, and these extra dependencies
>>> could add some unnecessary complexity.
>>>
>>> For option 2, there will be some work up front, but it will be much
>>> easier to manage and evolve. There will also be fewer concerns about new
>>> spouts (in different languages) and dependencies, because spouts are
>>> relatively independent of each other and we may generate artifacts per
>>> spout.
>>>
>>> Overall, most people prefer option 2 because it is cleaner.
>>>
>>> I talked with the Twitter OSS team. They are happy to support the initiative
>>> and suggested that we check with the Apache team to see what the best
>>> process is. The first question is whether this new side project should be
>>> under Apache or not. This might be a question for the mentors. What do you
>>> think/suggest?
>>>
>>> Another topic being discussed is the build tool, in case we decide to
>>> create a new side project. Maven is more mature for sure, but we will
>>> likely need multi-language support, so currently Bazel seems to be the
>>> winner (I personally vote for Bazel 1.0 because the backward compatibility
>>> has been bad so far).
>>>
>>> Any ideas or suggestions, please feel free to reply.
>>>
>>> Regards,
>>> --ning
>>>
