In my suggestion, there is no hacking around with source sets at all. There is no significant logic in the gradle configuration, which is a goal IMO.
On Sun, Sep 8, 2019 at 3:15 AM Maximilian Michels <[email protected]> wrote: > One minor addition, the "*1.5_to_1.7_utils" directory you listed, is > already present as the "1.5" directory. Admittedly, it could be renamed to > make that clearer. > > * > On 08.09.19 10:06, Maximilian Michels wrote: > > Thanks for your thoughts, David! > > > > Thanks Luke, copybara looks interesting. > > > > > Thomas and Max expressed needs for addressing the issue with the > > > current release. > > > > I think there is a misunderstanding. Thomas and I expressed the need to > > review the PR properly before merging it to avoid introducing technical > > debt. The expectation to open a feature PR a couple days before the > > release cut and have it, a) reviewed, b) tested, and c) merged, is a lot > > to ask for the maintainers. We are currently in stage (a) after the > > first round of reviews. > > > > 1.5/1.6 and possibly 1.7 will be removed. This was announced earlier on > > the mailing list and is tracked here: > > https://jira.apache.org/jira/browse/BEAM-7962 > > > > The amount of duplicated files code is very little, two very simple > > files. After removing 1.5/1.6/1.7 there should be none left. In the PR > > it looks like there were unnecessary duplications of files. That's why I > > commented. > > > > Due to the above reasons, I don't see the need to change the current way > > of cross-compiling for Flink versions. Out of David proposals, I still > > think the current approach is the easiest and least error-prone. > > > > Kenn, I appreciate your thoughts. How is your approach different from > > the current, expect introducing an additional factory method? The source > > layout is effectively identical to the current. We still have to retain > > different source directories for each Flink version where Flink internal > > interfaces change. > > > > > Now the code indicates exactly what is going on and all builds are > > > straightforward. All dependencies are actual build dependencies, not > > > src dir hacks. > > > > Source directories are build dependencies. I can see a minor improvement > > though in code readability (and that's always great), but still the > > developer needs to know from which source directory the factory is going > > to be loaded from, which is very similar to the current approach. > > > > Thanks, > > Max > > > > On 07.09.19 20:37, Kenneth Knowles wrote: > > > What about... > > > > > > *5) Architect the code for sharing using straightforward builds and > > > dependencies* > > > > > > Each versioned directory has an implementation of FlinkRunner > > > constructed from the common source, with version-specific pieces > > > injected (not with a complex framework, but by just passing as a > > > parameter). > > > > > > I just read over the actual amount of stuff you would have to inject. > > > It is very small at least through 1.8. Basically: > > > > > > FlinkRunnerBuilder( > > > Function<Coder, TypeSerializer> coderTypeSerializerFactory, > > > Function<ExecutionConfig, TypeSerializer> > > > encodedValueTypeSerializerFactory).build() // returns a FlinkRunner > > > > > > Now the code indicates exactly what is going on and all builds are > > > straightforward. All dependencies are actual build dependencies, not > > > src dir hacks. > > > > > > *flink/* > > > + *common/ *# move src/ to common/src/ for hygiene; compiles against > > > all supported Flink versions > > > - build.gradle > > > + *src/* > > > - org/apache/beam/runners/flink/common*/FlinkRunnerBuilder.java* > > > **- ... # the main runner implementation here > > > + *1.5_to_1.7_utils/*/ # common pluggable parts that work for 1.5 > > > through 1.7/ > > > - build.gradle # maybe makes sense to cross-compile and unit test > > > against 1.5 through 1.7, though dirs below give coverage too > > > + *src/* > > > > - org/apache/beam/runners/flink/v1_5_to_1_7/*CoderTypeSerializerFactory.java* > > > > > > > - > org/apache/beam/runners/flink/v1_5_to_1_7/*EncodedValueTypeSerializerFactory.java* > > > > > > + *1.5/* > > > - build.gradle > > > + *src/ */# implementation of FlinkRunner compatible with 1.5, > > > could have some of its own logic plugged in to FlinkRunnerBuilder, and > > > some 1.5_to_1.7 utils/* > > > * > > > - org/apache/beam/runners/flink/*FlinkRunner.java */# > > > FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new > > > /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/ > > > + *1.6/* > > > - build.gradle > > > + *src/ */# implementation of FlinkRunner compatible with 1.6, > > > actually has none of its own logic but it could/* > > > * > > > - org/apache/beam/runners/flink/*FlinkRunner.java */# > > > FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new > > > /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/ > > > + *1.7/* > > > + *src/ */# implementation of FlinkRunner compatible with 1.7, > > > actually has none of its own logic but it could/* > > > * > > > - org/apache/beam/runners/flink/*FlinkRunner.java */# > > > FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new > > > /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/ > > > + *1.8/* > > > - build.gradle > > > + *src/* > > > - org/apache/beam/runners/flink/*FlinkRunner.java */# > > > FlinkRunnerBuilder(new v1_8_CoderTypeSerializerFactory(), new > > > EncodedValueTypeSerializerFactory())/ > > > - org/apache/beam/runners/flink/*v1_8/CoderTypeSerializerFactory.java* > > > > - org/apache/beam/runners/flink/*v1_8/EncodedValueTypeSerializerFactory.java* > > > > > > > > > Kenn > > > > > > On Sat, Sep 7, 2019 at 9:00 AM Lukasz Cwik <[email protected] > > > <mailto:[email protected]>> wrote: > > > > > > When we import the Beam code into Google, we also run into issues > > > where sometimes we need to transform parts of the code. During > > > import we use copybara[1] to do these transformations to the source > > > which are more then just copy file X from some other path since > most > > > of the time we want to change only a few lines and this really > helps > > > reduce the maintenance pain. Unfortunately I don't see a Gradle > > > plugin for copybara but I do imagine there is a plugin that allows > > > one to run SED like expressions or other transformations instead of > > > just maintaining duplicate copies of files. > > > > > > 1: https://github.com/google/copybara > > > > > > On Sat, Sep 7, 2019 at 3:37 AM David Morávek <[email protected] > > > <mailto:[email protected]>> wrote: > > > > > > Hello, > > > > > > we currently have an opened PR for Flink 1.9 > > > <https://github.com/apache/beam/pull/9296>, which greatly > > > improves the runner for batch use-case. In case the PR gets > > > merged, we would be supporting 5 latest major versions of > Flink, > > > which obviously come with high maintenance price and makes > > > future development harder (there are already a sub-optimal > parts > > > due to compatibility with previous versions). Thomas and Max > > > expressed needs for addressing the issue with the current > > > release. > > > > > > Let's break down possible solution for the problem. > > > > > > *1) Current solution* > > > * > > > * > > > Currently we maintain separate build for each version. The > > > project structure looks as follows: > > > > > > *flink/* > > > + *1.5/ > > > * > > > + *src/*/# implementation of classes that differ between > > > versions/* > > > * > > > - build.gradle > > > + *1.6/* > > > + build.gradle #/the version is backward compatible, so it > > > can reuse "overrides" from 1.5/ > > > + *1.7/* > > > + build.gradle #/the version is backward compatible, so it > > > can reuse "overrides" from 1.5/ > > > + *1.8/* > > > + *src/ */# implementation of classes that differ between > > > versions/ > > > - build.gradle > > > + *1.9/* > > > + *src///*/# implementation of classes that differ between > > > versions/ > > > - build.gradle > > > + *src/*/# common source, shared among runner versions > > > / > > > - flink_runner.gradle/# included by each > <version>/build.gradle > > > / > > > > > > The problem with this structure is, that we always need to copy > > > all of the version specific classes between backward > > > incompatible versions, which results in *duplicate files* (we > > > can not simply override a single file, because it wouldn't > > > compile due to duplicate classes). > > > * > > > * > > > *2) Symlink duplicates* > > > * > > > * > > > Maybe we can simply symlink duplicates between versions and > only > > > override the files that need to be changed?* > > > * > > > > > > *3) Adjusting the gradle build* > > > * > > > * > > > Currently a version build looks something like this (this one > is > > > for 1.7.x version): > > > * > > > * > > > project.ext { > > > // Set the version of all Flink-related dependencies here. > > > flink_version = '1.7.2' > > > // Main source directory and Flink version specific code. > > > main_source_dirs = ["$basePath/src/main/java", > > > "../1.5/src/main/java"] > > > test_source_dirs = ["$basePath/src/test/java", > > > "../1.5/src/test/java"] > > > main_resources_dirs = ["$basePath/src/main/resources"] > > > test_resources_dirs = ["$basePath/src/test/resources"] > > > archives_base_name = 'beam-runners-flink-1.7' > > > } > > > > > > // Load the main build script which contains all build logic. > > > apply from: "$basePath/flink_runner.gradle" > > > > > > It basically says, take the common source and append version > > > specific implementations from 1.5 version. Let's say we want to > > > override a single file for 1.8. We need to copy everything from > > > 1.5/src and the build file would look as follows: > > > > > > /* All properties required for loading the Flink build script > */ > > > project.ext { > > > // Set the version of all Flink-related dependencies here. > > > flink_version = '1.8.0' > > > // Main source directory and Flink version specific code. > > > main_source_dirs = ["$basePath/src/main/java", > > > "./src/main/java"] > > > test_source_dirs = ["$basePath/src/test/java", > > > "./src/test/java"] > > > main_resources_dirs = ["$basePath/src/main/resources"] > > > test_resources_dirs = ["$basePath/src/test/resources"] > > > archives_base_name = 'beam-runners-flink-1.8' > > > } > > > > > > // Load the main build script which contains all build logic. > > > apply from: "$basePath/flink_runner.gradle" > > > > > > For simplicity, let's only focus on *main_source_dirs*. What we > > > really want to do is to tell the build, to use everything from > > > 1.5 and override a single class (e.g. CoderTypeSerializer). > > > > > > def copyOverrides = tasks.register('copyOverrides', Copy) { > > > it.from '../1.5/src/', './src' > > > it.into "${project.buildDir}/flink-overrides/src" > > > it.duplicatesStrategy DuplicatesStrategy.INCLUDE // The last > > > duplicate file 'wins'. > > > } > > > > > > compileJava.dependsOn copyOverrides > > > > > > projext.ext { > > > main_source_dirs = ["$basePath/src/main/java", > > > "${project.buildDir}/flink-overrides/src/main/java"] > > > } > > > > > > This would copy all overrides into build directory, and it case > > > of duplicate it picks the latest one. Than the build would > > > simple compile classes from the newly created java files in > > > build directory. > > > > > > *4) Maintaining last 3 major versions only* > > > > > > I recall that Flink community only supports 3 latest major > > > versions <https://flink.apache.org/downloads.html> (please > > > correct me if I'm mistaken). I suggest the the*Beam would do > the > > > same*. There is already an opened BEAM-7962 > > > <https://jira.apache.org/jira/browse/BEAM-7962> that suggest > > > dropping 1.5 & 1.6 versions. Maybe this would allow us to keep > > > the current structure with bearable amount of technical debt? > > > > > > Personally I'm in favor of *4)* combined with *3)*. > > > > > > What do you think? Do you have any other suggestions how to > > > solve this? > > > > > > Thanks, > > > D. > > > > >
