Thanks for all your input. In order to wrap the discussion up I'd like to
summarize the mentioned points:

The problem of increasing build times and complexity of the project has
been acknowledged. Ideally we would have everything in one repository using
an incremental build tool. Since Maven does not properly support this we
would have to switch our build tool to something like Gradle, for example.

Another option is introducing build profiles for different sets of modules
as well as separating integration and unit tests. The third alternative
would be creating sub-projects with their own repositories. I actually
think that these two proposals are not necessarily exclusive and it would
also make sense to have a separation between unit and integration tests if
we split the repository.

The overall consensus seems to be that we don't want to split the community
and want to keep everything under the same umbrella. I think this is the
right way to go, because otherwise some parts of the project could become
second class citizens. Given that, and assuming we continue using Maven, I
still think that creating sub-projects for the libraries, for example, could be
beneficial. A split could reduce the project's complexity and make it
potentially easier for libraries to get actively developed. The main
concern is setting up the build infrastructure to aggregate docs from
multiple repositories and making them publicly available.

Since I started this thread and I would really like to see Flink's ML
library being revived again, I'd volunteer to first investigate whether it
is feasible to establish a proper incremental build for Flink. If that turns
out not to be possible, I will look into splitting the repository, first only for
the libraries. I'll share my results with the community once I'm done with
the investigation.

Cheers,
Till

On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <rmetz...@apache.org> wrote:

> @Jin Mingjian: You cannot use the paid Travis version for open source
> projects. It only works for private repositories (at least that was the
> case when we asked them about it).
>
> @Stephan: I don't think that incremental builds will be available with
> Maven anytime soon.
>
> I agree that we need to fix the build time issue on Travis. I've recently
> pushed a commit that now uses three test groups instead of two.
> But I don't think that this is a feasible long-term solution.
>
> If this discussion is only about reducing the build and test time,
> introducing build profiles for different components as Aljoscha suggested
> would solve the problem Till mentioned.
> Also, if we decide that Travis is not a good tool anymore for testing,
> I guess we can find a different solution. There are now competitors to
> Travis that might be willing to offer a paid plan for an open source
> project, or we set up our own infra on a server sponsored by one of the
> contributing companies.
> If we want to solve "community issues" with the change as well, then I
> think it's worth the effort of splitting up Flink into different
> repositories.
>
> Splitting up repositories is not a trivial task in my opinion. As others
> have mentioned before, we need to consider the following things:
> - How are we going to build the documentation? Ideally every repo should
> contain its docs, so we would need to pull them together when building the
> main docs.
> - How do we organize the dependencies? If library repositories depend on
> snapshot Flink versions, we need to make sure that the snapshot deployment
> always works. This also means that people working on a library repository
> will pull from snapshots OR need to build locally first.
> - We need to update the release scripts
>
> If we commit to do these changes, we need to assign at least one committer
> (yes, in this case we need somebody who can commit, for example for
> updating the buildbot stuff) who volunteers to do the change.
> I've done a lot of infrastructure work in the past, but I'm currently
> pretty booked with many other things, so I don't realistically see myself
> doing that. Max, who used to work on these things, is taking some time off.
> I think we need, best case, 3 days for the change and, worst case, 5 days. The
> problem is that there are no "unit tests" for the infra stuff, so many
> things are "trial and error" (like Apache's buildbot, our release scripts,
> the doc scripts, maven stuff, nightly builds).
>
>
>
> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <se...@apache.org> wrote:
>
> > If we can get incremental builds to work, that would actually be the
> > preferred solution in my opinion.
> >
> > Many companies have invested heavily in making a "single repository" code
> > base work, because it has the advantage of not having to update/publish
> > several repositories first.
> > However, the strong prerequisite for that is an incremental build system
> > that builds only (fine grained) what it has to build. I am not sure how we
> > could make that work with Maven and Travis...
> >
> > On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <c...@greghogan.com> wrote:
> >
> > > An additional option for reducing time to build and test is parallel
> > > execution. This would help users more than on TravisCI since we're
> > > generally running on multi-core machines rather than VM slices.
> > >
> > > Is the idea that each user would only check out the modules that he or she
> > > is developing with? For example, if a developer is not working on
> > > flink-mesos or flink-yarn then the "flink-deploy" module would not be
> > > cloned to their filesystem?
> > >
> > > We can run a TravisCI nightly build on each repo to validate against API
> > > changes.
> > >
> > > Greg
> > >
> > > On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> > >
> > > > Hi everybody,
> > > >
> > > > I think this should be a discussion about the benefits and drawbacks of
> > > > separating the code into distinct repositories from a development point of
> > > > view.
> > > > So I agree with Stephan that we should not divide the community by creating
> > > > separate groups of committers.
> > > > Also, the discussion about independent releases is not strictly related
> > > > to the decision, IMO.
> > > >
> > > > I see a few pros and cons for splitting the code base into separate
> > > > repositories which (I think) haven't been mentioned before:
> > > > pros:
> > > > - IDE setup will be leaner. It is not necessary to compile the whole code
> > > > base to run a test after switching a branch.
> > > > cons:
> > > > - developing library features that require changes in the core / APIs
> > > > becomes more time consuming due to back-and-forth between code bases.
> > > > However, I think this is not very often the case.
> > > >
> > > > Aljoscha has good points as well. Many of the build issues could be solved
> > > > by different build profiles and configurations.
> > > >
> > > > Best, Fabian
> > > >
> > > > 2017-02-22 14:59 GMT+01:00 Gábor Hermann <m...@gaborhermann.com>:
> > > >
> > > > > @Stephan:
> > > > >
> > > > > Although I tried to raise some issues about splitting committers, I'm
> > > > > still strongly in favor of some kind of restructuring. We just have to be
> > > > > conscious about the disadvantages.
> > > > >
> > > > > Not splitting the committers could leave the libraries in the same
> > > > > stalled state described by Till. Of course, dedicating current
> > > > > committers as shepherds of the libraries could easily resolve the issue.
> > > > > But that requires time from current committers. It seems like a trade-off
> > > > > between code quality, speed of development, and committer efforts.
> > > > >
> > > > > From what I see in the discussion about ML, there are many people willing
> > > > > to contribute as well as production use-cases. This means we could and
> > > > > should move forward. However, the development speed is significantly slowed
> > > > > down by stalled PRs. The proposal for contributors helping the review
> > > > > process has not really worked out so far. In my opinion, either code quality
> > > > > (by more easily accepting new committers) or some committer time
> > > > > (reviewing/merging) should be sacrificed to move forward. As Till has
> > > > > indicated, it would be a shame if we let this contribution effort die.
> > > > >
> > > > > Cheers,
> > > > > Gabor
> > > > >
> > > > >
> > > >
> > >
> >
>