+1 for separate artifacts I would request that we explicitly discuss and agree which dependencies we vendor though.
Not everything listed in the JIRA subtasks is currently relocated. Thomas On Tue, Oct 23, 2018 at 8:04 AM David Morávek <[email protected]> wrote: > +1 This should improve build times a lot. It would be great if vendored > deps could stay in the main repository. > > D. > > On Tue, Oct 23, 2018 at 12:21 PM Maximilian Michels <[email protected]> > wrote: > >> Looks great, Kenn! >> >> > Max: what is the story behind having a separate flink-shaded repo? Did >> that make it easier to manage in some way? >> >> Better separation of concerns, but I don't think releasing the shaded >> artifacts from the main repo is a problem. I'd even prefer not to split >> up the repo because it makes updates to the vendored dependencies >> slightly easier. >> >> On 23.10.18 03:25, Kenneth Knowles wrote: >> > OK, I've filed https://issues.apache.org/jira/browse/BEAM-5819 to >> > collect sub-tasks. This has enough upsides throughout lots of areas of >> > the project that even though it is not glamorous it seems pretty >> > valuable to start on immediately. And I want to find out if there's a >> > pitfall lurking. >> > >> > Max: what is the story behind having a separate flink-shaded repo? Did >> > that make it easier to manage in some way? >> > >> > Kenn >> > >> > On Mon, Oct 22, 2018 at 2:55 AM Maximilian Michels <[email protected] >> > <mailto:[email protected]>> wrote: >> > >> > +1 for publishing vendored Jars independently. It will improve build >> > time and ease IntelliJ integration. >> > >> > Flink also publishes shaded dependencies separately: >> > >> > - https://github.com/apache/flink-shaded >> > - https://issues.apache.org/jira/browse/FLINK-6529 >> > >> > AFAIK their main motivation was to get rid of duplicate shaded >> classes >> > on the classpath. We don't appear to have that problem because we >> > already have a separate "vendor" project. >> > >> > > - With shading, it is hard (impossible?) to step into dependency >> > code in IntelliJ's debugger, because the actual symbol at runtime >> > does not match what is in the external jars >> > >> > This would be solved by releasing the sources of the shaded jars. >> > From a >> > legal perspective, this could be problematic as alluded to here: >> > https://github.com/apache/flink-shaded/issues/25 >> > >> > -Max >> > >> > On 20.10.18 01:11, Lukasz Cwik wrote: >> > > I have tried several times to improve the build system and >> intellij >> > > integration and each attempt ended with little progress when >> dealing >> > > with vendored code. My latest attempt has been the most promising >> > where >> > > I take the vendored classes/jars and decompile them generating >> the >> > > source that Intellij can then use. I have a branch[1] that >> > demonstrates >> > > the idea. It works pretty well (and up until a change where we >> > started >> > > vendoring gRPC, was impractical to do. Instructions to try it out >> > are: >> > > >> > > // Clean up any remnants of prior builds/intellij projects >> > > git clean -fdx >> > > // Generated the source for vendored/shaded modules >> > > ./gradlew decompile >> > > >> > > // Remove the "generated" Java sources for protos so they don't >> > conflict with the decompiled sources. >> > > rm -rf model/pipeline/build/generated/source/proto >> > > rm -rf model/job-management/build/generated/source/proto >> > > rm -rf model/fn-execution/build/generated/source/proto >> > > // Import the project into Intellij, most code completion now >> > works still some issues with a few classes. >> > > // Note that the Java decompiler doesn't generate valid source so >> > still need to delegate to Gradle for build/run/test actions >> > > // Other decompilers may do a better/worse job but haven't tried >> > them. >> > > >> > > >> > > The problems that I face are that the generated Java source from >> the >> > > protos and the decompiled source from the compiled version of >> that >> > > source post shading are both being imported as content roots and >> > then >> > > conflict. Also, the CFR decompiler isn't producing valid source, >> if >> > > people could try others and report their mileage, we may find one >> > that >> > > works and then we would be able to use intellij to build/run our >> > code >> > > and not need to delegate all our build/run/test actions to >> Gradle. >> > > >> > > After all these attempts I have done, vendoring the dependencies >> > outside >> > > of the project seems like a sane approach and unless someone >> > wants to >> > > take a stab at the best progress I have made above, I would go >> > with what >> > > Kenn is suggesting even though it will mean that we will need to >> > perform >> > > releases every time we want to change the version of one of our >> > vendored >> > > dependencies. >> > > >> > > 1: https://github.com/lukecwik/incubator-beam/tree/intellij >> > > >> > > >> > > On Fri, Oct 19, 2018 at 10:43 AM Kenneth Knowles < >> [email protected] >> > <mailto:[email protected]> >> > > <mailto:[email protected] <mailto:[email protected]>>> wrote: >> > > >> > > Another reason to push on this is to get build times down. >> > Once only >> > > generated proto classes use the shadow plugin we'll cut the >> build >> > > time in ~half? And there is no reason to constantly >> re-vendor. >> > > >> > > Kenn >> > > >> > > On Fri, Oct 19, 2018 at 10:39 AM Kenneth Knowles >> > <[email protected] <mailto:[email protected]> >> > > <mailto:[email protected] <mailto:[email protected]>>> wrote: >> > > >> > > Hi all, >> > > >> > > A while ago we had pretty good consensus that we should >> > vendor >> > > ("pre-shade") specific dependencies, and there's start >> on it >> > > here: https://github.com/apache/beam/tree/master/vendor >> > > >> > > IntelliJ notes: >> > > >> > > - With shading, it is hard (impossible?) to step into >> > > dependency code in IntelliJ's debugger, because the >> actual >> > > symbol at runtime does not match what is in the external >> jars >> > > >> > > >> > > Intellij can step through the classes if they were published >> > outside the >> > > project since it can decompile them. The source won't be legible. >> > > Decompiling the source as in my example effectively shows the >> > same issue. >> > > >> > > - With vendoring, if the vendored dependencies are part >> > of the >> > > project then IntelliJ gets confused because it operates >> on >> > > source, not the produced jars >> > > >> > > >> > > Yes, I tried several ways to get intellij to ignore the source >> > and use >> > > the output jars but no luck. >> > > >> > > The second one has a quick fix for most cases*: don't >> > make the >> > > vendored dep a subproject, but just separately build and >> > publish >> > > it. Since a vendored dependency should change much more >> > > infrequently (or if we bake the version into the name, >> > ~never) >> > > this means we publish once and save headache and build >> > time forever. >> > > >> > > WDYT? Have I overlooked something? How about we set up >> > vendored >> > > versions of guava, protobuf, grpc, and publish them. We >> don't >> > > have to actually start using them yet, and can do it >> > incrementally. >> > > >> > > >> > > Currently we are relocating code depending on the version string. >> > If the >> > > major version is >= 1, we use only the major version within the >> > package >> > > string and rely on semantic versioning provided by the dependency >> > to not >> > > break people. If the major version is 0, we assume the >> dependency is >> > > unstable and use the full version as part of the package string >> > during >> > > relocation. >> > > >> > > The downside of using the full version string for relocated >> packages: >> > > 1) Users will end up with multiple copies of dependencies that >> > differ >> > > only by the minor or patch version increasing the size of their >> > application. >> > > 2) Bumping up the version of a dependency now requires the import >> > > statement in all java files to be updated (not too difficult with >> > some >> > > sed/grep skills) >> > > >> > > The upside of using the full version string in the relocated >> package: >> > > 1) We don't have to worry about whether a dependency maintains >> > semantic >> > > versioning which means our users won't have to worry about that >> > either. >> > > 2) This increases the odds that a user will load multiple >> slightly >> > > different versions of the same dependency which is known to be >> > > incompatible in certain situations (e.g. Netty 4.1.25 can't be on >> > the >> > > classpath with Netty 4.1.28 even though they are both shaded due >> to >> > > issues of how JNI with tcnative works). >> > > >> > > >> > > (side note: what do other projects like Flink do?) >> > > >> > > Kenn >> > > >> > > *for generated proto classes, they need to be altered >> after >> > > being generated so shading happens there, but actually >> only >> > > relocation and the shared vendored dep should work >> > elsewhere in >> > > the project >> > > >> > > >> > > We could publish the protos and treat them as "external" >> > dependencies >> > > within the Java projects which would also remove this pain point. >> > >> >
