In my suggestion, there is no hacking around with source sets at all. There
is no significant logic in the gradle configuration, which is a goal IMO.

On Sun, Sep 8, 2019 at 3:15 AM Maximilian Michels <[email protected]> wrote:

> One minor addition, the "*1.5_to_1.7_utils" directory you listed, is
> already present as the "1.5" directory. Admittedly, it could be renamed to
> make that clearer.
>
> *
> On 08.09.19 10:06, Maximilian Michels wrote:
> > Thanks for your thoughts, David!
> >
> > Thanks Luke, copybara looks interesting.
> >
> > > Thomas and Max expressed needs for addressing the issue with the
> > > current release.
> >
> > I think there is a misunderstanding. Thomas and I expressed the need to
> > review the PR properly before merging it to avoid introducing technical
> > debt. The expectation to open a feature PR a couple days before the
> > release cut and have it, a) reviewed, b) tested, and c) merged, is a lot
> > to ask for the maintainers. We are currently in stage (a) after the
> > first round of reviews.
> >
> > 1.5/1.6 and possibly 1.7 will be removed. This was announced earlier on
> > the mailing list and is tracked here:
> > https://jira.apache.org/jira/browse/BEAM-7962
> >
> > The amount of duplicated files code is very little, two very simple
> > files. After removing 1.5/1.6/1.7 there should be none left. In the PR
> > it looks like there were unnecessary duplications of files. That's why I
> > commented.
> >
> > Due to the above reasons, I don't see the need to change the current way
> > of cross-compiling for Flink versions. Out of David proposals, I still
> > think the current approach is the easiest and least error-prone.
> >
> > Kenn, I appreciate your thoughts. How is your approach different from
> > the current, expect introducing an additional factory method? The source
> > layout is effectively identical to the current. We still have to retain
> > different source directories for each Flink version where Flink internal
> > interfaces change.
> >
> > > Now the code indicates exactly what is going on and all builds are
> > > straightforward. All dependencies are actual build dependencies, not
> > > src dir hacks.
> >
> > Source directories are build dependencies. I can see a minor improvement
> > though in code readability (and that's always great), but still the
> > developer needs to know from which source directory the factory is going
> > to be loaded from, which is very similar to the current approach.
> >
> > Thanks,
> > Max
> >
> > On 07.09.19 20:37, Kenneth Knowles wrote:
> > > What about...
> > >
> > > *5) Architect the code for sharing using straightforward builds and
> > > dependencies*
> > >
> > > Each versioned directory has an implementation of FlinkRunner
> > > constructed from the common source, with version-specific pieces
> > > injected (not with a complex framework, but by just passing as a
> > > parameter).
> > >
> > > I just read over the actual amount of stuff you would have to inject.
> > > It is very small at least through 1.8. Basically:
> > >
> > >      FlinkRunnerBuilder(
> > >        Function<Coder, TypeSerializer> coderTypeSerializerFactory,
> > >        Function<ExecutionConfig, TypeSerializer>
> > > encodedValueTypeSerializerFactory).build() // returns a FlinkRunner
> > >
> > > Now the code indicates exactly what is going on and all builds are
> > > straightforward. All dependencies are actual build dependencies, not
> > > src dir hacks.
> > >
> > > *flink/*
> > > + *common/ *# move src/ to common/src/ for hygiene; compiles against
> > > all supported Flink versions
> > >     - build.gradle
> > >     + *src/*
> > >        - org/apache/beam/runners/flink/common*/FlinkRunnerBuilder.java*
> > > **- ... # the main runner implementation here
> > > + *1.5_to_1.7_utils/*/ # common pluggable parts that work for 1.5
> > > through 1.7/
> > >     - build.gradle # maybe makes sense to cross-compile and unit test
> > > against 1.5 through 1.7, though dirs below give coverage too
> > > + *src/*
> > >
> - org/apache/beam/runners/flink/v1_5_to_1_7/*CoderTypeSerializerFactory.java*
> > >
> > >
> - 
> org/apache/beam/runners/flink/v1_5_to_1_7/*EncodedValueTypeSerializerFactory.java*
> > >
> > > + *1.5/*
> > >     - build.gradle
> > >     + *src/ */# implementation of FlinkRunner compatible with 1.5,
> > > could have some of its own logic plugged in to FlinkRunnerBuilder, and
> > > some 1.5_to_1.7 utils/*
> > > *
> > >        - org/apache/beam/runners/flink/*FlinkRunner.java */#
> > > FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new
> > > /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/
> > > + *1.6/*
> > >     - build.gradle
> > >     + *src/ */# implementation of FlinkRunner compatible with 1.6,
> > > actually has none of its own logic but it could/*
> > > *
> > >        - org/apache/beam/runners/flink/*FlinkRunner.java */#
> > > FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new
> > > /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/
> > > + *1.7/*
> > >     + *src/ */# implementation of FlinkRunner compatible with 1.7,
> > > actually has none of its own logic but it could/*
> > > *
> > >        - org/apache/beam/runners/flink/*FlinkRunner.java */#
> > > FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new
> > > /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/
> > > + *1.8/*
> > >     - build.gradle
> > > + *src/*
> > >        - org/apache/beam/runners/flink/*FlinkRunner.java */#
> > > FlinkRunnerBuilder(new v1_8_CoderTypeSerializerFactory(), new
> > > EncodedValueTypeSerializerFactory())/
> > > - org/apache/beam/runners/flink/*v1_8/CoderTypeSerializerFactory.java*
> > >
> - org/apache/beam/runners/flink/*v1_8/EncodedValueTypeSerializerFactory.java*
> > >
> > >
> > > Kenn
> > >
> > > On Sat, Sep 7, 2019 at 9:00 AM Lukasz Cwik <[email protected]
> > > <mailto:[email protected]>> wrote:
> > >
> > >     When we import the Beam code into Google, we also run into issues
> > >     where sometimes we need to transform parts of the code. During
> > >     import we use copybara[1] to do these transformations to the source
> > >     which are more then just copy file X from some other path since
> most
> > >     of the time we want to change only a few lines and this really
> helps
> > >     reduce the maintenance pain. Unfortunately I don't see a Gradle
> > >     plugin for copybara but I do imagine there is a plugin that allows
> > >     one to run SED like expressions or other transformations instead of
> > >     just maintaining duplicate copies of files.
> > >
> > >     1: https://github.com/google/copybara
> > >
> > >     On Sat, Sep 7, 2019 at 3:37 AM David Morávek <[email protected]
> > >     <mailto:[email protected]>> wrote:
> > >
> > >         Hello,
> > >
> > >         we currently have an opened PR for Flink 1.9
> > >         <https://github.com/apache/beam/pull/9296>, which greatly
> > >         improves the runner for batch use-case. In case the PR gets
> > >         merged, we would be supporting 5 latest major versions of
> Flink,
> > >         which obviously come with high maintenance price and makes
> > >         future development harder (there are already a sub-optimal
> parts
> > >         due to compatibility with previous versions). Thomas and Max
> > >         expressed needs for addressing the issue with the current
> > > release.
> > >
> > >         Let's break down possible solution for the problem.
> > >
> > >         *1) Current solution*
> > >         *
> > >         *
> > >         Currently we maintain separate build for each version. The
> > >         project structure looks as follows:
> > >
> > >         *flink/*
> > >         + *1.5/
> > >         *
> > >             + *src/*/# implementation of classes that differ between
> > >         versions/*
> > >         *
> > >             - build.gradle
> > >         + *1.6/*
> > >             + build.gradle #/the version is backward compatible, so it
> > >         can reuse "overrides" from 1.5/
> > >         + *1.7/*
> > >             + build.gradle #/the version is backward compatible, so it
> > >         can reuse "overrides" from 1.5/
> > >         + *1.8/*
> > >         + *src/ */# implementation of classes that differ between
> > > versions/
> > >             - build.gradle
> > >         + *1.9/*
> > >         + *src///*/# implementation of classes that differ between
> > > versions/
> > >             - build.gradle
> > >         + *src/*/# common source, shared among runner versions
> > >         /
> > >         - flink_runner.gradle/# included by  each
> <version>/build.gradle
> > >         /
> > >
> > >         The problem with this structure is, that we always need to copy
> > >         all of the version specific classes between backward
> > >         incompatible versions, which results in *duplicate files* (we
> > >         can not simply override a single file, because it wouldn't
> > >         compile due to duplicate classes).
> > >         *
> > >         *
> > >         *2) Symlink duplicates*
> > >         *
> > >         *
> > >         Maybe we can simply symlink duplicates between versions and
> only
> > >         override the files that need to be changed?*
> > >         *
> > >
> > >         *3) Adjusting the gradle build*
> > >         *
> > >         *
> > >         Currently a version build looks something like this (this one
> is
> > >         for 1.7.x version):
> > >         *
> > >         *
> > >         project.ext {
> > >            // Set the version of all Flink-related dependencies here.
> > >            flink_version = '1.7.2'
> > >            // Main source directory and Flink version specific code.
> > >            main_source_dirs = ["$basePath/src/main/java",
> > >         "../1.5/src/main/java"]
> > >            test_source_dirs = ["$basePath/src/test/java",
> > >         "../1.5/src/test/java"]
> > >            main_resources_dirs = ["$basePath/src/main/resources"]
> > >            test_resources_dirs = ["$basePath/src/test/resources"]
> > >            archives_base_name = 'beam-runners-flink-1.7'
> > >         }
> > >
> > >         // Load the main build script which contains all build logic.
> > >         apply from: "$basePath/flink_runner.gradle"
> > >
> > >         It basically says, take the common source and append version
> > >         specific implementations from 1.5 version. Let's say we want to
> > >         override a single file for 1.8. We need to copy everything from
> > >         1.5/src and the build file would look as follows:
> > >
> > >         /* All properties required for loading the Flink build script
> */
> > >         project.ext {
> > >            // Set the version of all Flink-related dependencies here.
> > >            flink_version = '1.8.0'
> > >            // Main source directory and Flink version specific code.
> > >            main_source_dirs = ["$basePath/src/main/java",
> > > "./src/main/java"]
> > >            test_source_dirs = ["$basePath/src/test/java",
> > > "./src/test/java"]
> > >            main_resources_dirs = ["$basePath/src/main/resources"]
> > >            test_resources_dirs = ["$basePath/src/test/resources"]
> > >            archives_base_name = 'beam-runners-flink-1.8'
> > >         }
> > >
> > >         // Load the main build script which contains all build logic.
> > >         apply from: "$basePath/flink_runner.gradle"
> > >
> > >         For simplicity, let's only focus on *main_source_dirs*. What we
> > >         really want to do is to tell the build, to use everything from
> > >         1.5 and override a single class (e.g. CoderTypeSerializer).
> > >
> > >         def copyOverrides = tasks.register('copyOverrides', Copy) {
> > >            it.from '../1.5/src/', './src'
> > >            it.into "${project.buildDir}/flink-overrides/src"
> > >            it.duplicatesStrategy DuplicatesStrategy.INCLUDE // The last
> > >         duplicate file 'wins'.
> > >         }
> > >
> > >         compileJava.dependsOn copyOverrides
> > >
> > >         projext.ext {
> > >            main_source_dirs = ["$basePath/src/main/java",
> > >         "${project.buildDir}/flink-overrides/src/main/java"]
> > >         }
> > >
> > >         This would copy all overrides into build directory, and it case
> > >         of duplicate it picks the latest one. Than the build would
> > >         simple compile classes from the newly created java files in
> > >         build directory.
> > >
> > >         *4) Maintaining last 3 major versions only*
> > >
> > >         I recall that Flink community only supports 3 latest major
> > >         versions <https://flink.apache.org/downloads.html> (please
> > >         correct me if I'm mistaken). I suggest the the*Beam would do
> the
> > >         same*. There is already an opened BEAM-7962
> > >         <https://jira.apache.org/jira/browse/BEAM-7962> that suggest
> > >         dropping 1.5 & 1.6 versions. Maybe this would allow us to keep
> > >         the current structure with bearable amount of technical debt?
> > >
> > >         Personally I'm in favor of *4)* combined with *3)*.
> > >
> > >         What do you think? Do you have any other suggestions how to
> > >         solve this?
> > >
> > >         Thanks,
> > >         D.
> > >
>
>

Reply via email to