+1 This should improve build times a lot. It would be great if vendored
deps could stay in the main repository.

D.

On Tue, Oct 23, 2018 at 12:21 PM Maximilian Michels <[email protected]> wrote:

> Looks great, Kenn!
>
> > Max: what is the story behind having a separate flink-shaded repo? Did
> that make it easier to manage in some way?
>
> Better separation of concerns, but I don't think releasing the shaded
> artifacts from the main repo is a problem. I'd even prefer not to split
> up the repo because it makes updates to the vendored dependencies
> slightly easier.
>
> On 23.10.18 03:25, Kenneth Knowles wrote:
> > OK, I've filed https://issues.apache.org/jira/browse/BEAM-5819 to
> > collect sub-tasks. This has enough upsides throughout lots of areas of
> > the project that even though it is not glamorous it seems pretty
> > valuable to start on immediately. And I want to find out if there's a
> > pitfall lurking.
> >
> > Max: what is the story behind having a separate flink-shaded repo? Did
> > that make it easier to manage in some way?
> >
> > Kenn
> >
> > On Mon, Oct 22, 2018 at 2:55 AM Maximilian Michels <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> >     +1 for publishing vendored Jars independently. It will improve build
> >     time and ease IntelliJ integration.
> >
> >     Flink also publishes shaded dependencies separately:
> >
> >     - https://github.com/apache/flink-shaded
> >     - https://issues.apache.org/jira/browse/FLINK-6529
> >
> >     AFAIK their main motivation was to get rid of duplicate shaded
> classes
> >     on the classpath. We don't appear to have that problem because we
> >     already have a separate "vendor" project.
> >
> >      >  - With shading, it is hard (impossible?) to step into dependency
> >     code in IntelliJ's debugger, because the actual symbol at runtime
> >     does not match what is in the external jars
> >
> >     This would be solved by releasing the sources of the shaded jars.
> >      From a
> >     legal perspective, this could be problematic as alluded to here:
> >     https://github.com/apache/flink-shaded/issues/25
> >
> >     -Max
> >
> >     On 20.10.18 01:11, Lukasz Cwik wrote:
> >      > I have tried several times to improve the build system and
> intellij
> >      > integration and each attempt ended with little progress when
> dealing
> >      > with vendored code. My latest attempt has been the most promising
> >     where
> >      > I take the vendored classes/jars and decompile them generating the
> >      > source that Intellij can then use. I have a branch[1] that
> >     demonstrates
> >      > the idea. It works pretty well (and up until a change where we
> >     started
> >      > vendoring gRPC, was impractical to do. Instructions to try it out
> >     are:
> >      >
> >      > // Clean up any remnants of prior builds/intellij projects
> >      > git clean -fdx
> >      > // Generated the source for vendored/shaded modules
> >      > ./gradlew decompile
> >      >
> >      > // Remove the "generated" Java sources for protos so they don't
> >     conflict with the decompiled sources.
> >      > rm -rf model/pipeline/build/generated/source/proto
> >      > rm -rf model/job-management/build/generated/source/proto
> >      > rm -rf model/fn-execution/build/generated/source/proto
> >      > // Import the project into Intellij, most code completion now
> >     works still some issues with a few classes.
> >      > // Note that the Java decompiler doesn't generate valid source so
> >     still need to delegate to Gradle for build/run/test actions
> >      > // Other decompilers may do a better/worse job but haven't tried
> >     them.
> >      >
> >      >
> >      > The problems that I face are that the generated Java source from
> the
> >      > protos and the decompiled source from the compiled version of that
> >      > source post shading are both being imported as content roots and
> >     then
> >      > conflict. Also, the CFR decompiler isn't producing valid source,
> if
> >      > people could try others and report their mileage, we may find one
> >     that
> >      > works and then we would be able to use intellij to build/run our
> >     code
> >      > and not need to delegate all our build/run/test actions to Gradle.
> >      >
> >      > After all these attempts I have done, vendoring the dependencies
> >     outside
> >      > of the project seems like a sane approach and unless someone
> >     wants to
> >      > take a stab at the best progress I have made above, I would go
> >     with what
> >      > Kenn is suggesting even though it will mean that we will need to
> >     perform
> >      > releases every time we want to change the version of one of our
> >     vendored
> >      > dependencies.
> >      >
> >      > 1: https://github.com/lukecwik/incubator-beam/tree/intellij
> >      >
> >      >
> >      > On Fri, Oct 19, 2018 at 10:43 AM Kenneth Knowles <[email protected]
> >     <mailto:[email protected]>
> >      > <mailto:[email protected] <mailto:[email protected]>>> wrote:
> >      >
> >      >     Another reason to push on this is to get build times down.
> >     Once only
> >      >     generated proto classes use the shadow plugin we'll cut the
> build
> >      >     time in ~half? And there is no reason to constantly re-vendor.
> >      >
> >      >     Kenn
> >      >
> >      >     On Fri, Oct 19, 2018 at 10:39 AM Kenneth Knowles
> >     <[email protected] <mailto:[email protected]>
> >      >     <mailto:[email protected] <mailto:[email protected]>>> wrote:
> >      >
> >      >         Hi all,
> >      >
> >      >         A while ago we had pretty good consensus that we should
> >     vendor
> >      >         ("pre-shade") specific dependencies, and there's start on
> it
> >      >         here: https://github.com/apache/beam/tree/master/vendor
> >      >
> >      >         IntelliJ notes:
> >      >
> >      >           - With shading, it is hard (impossible?) to step into
> >      >         dependency code in IntelliJ's debugger, because the actual
> >      >         symbol at runtime does not match what is in the external
> jars
> >      >
> >      >
> >      > Intellij can step through the classes if they were published
> >     outside the
> >      > project since it can decompile them. The source won't be legible.
> >      > Decompiling the source as in my example effectively shows the
> >     same issue.
> >      >
> >      >           - With vendoring, if the vendored dependencies are part
> >     of the
> >      >         project then IntelliJ gets confused because it operates on
> >      >         source, not the produced jars
> >      >
> >      >
> >      > Yes, I tried several ways to get intellij to ignore the source
> >     and use
> >      > the output jars but no luck.
> >      >
> >      >         The second one has a quick fix for most cases*: don't
> >     make the
> >      >         vendored dep a subproject, but just separately build and
> >     publish
> >      >         it. Since a vendored dependency should change much more
> >      >         infrequently (or if we bake the version into the name,
> >     ~never)
> >      >         this means we publish once and save headache and build
> >     time forever.
> >      >
> >      >         WDYT? Have I overlooked something? How about we set up
> >     vendored
> >      >         versions of guava, protobuf, grpc, and publish them. We
> don't
> >      >         have to actually start using them yet, and can do it
> >     incrementally.
> >      >
> >      >
> >      > Currently we are relocating code depending on the version string.
> >     If the
> >      > major version is >= 1, we use only the major version within the
> >     package
> >      > string and rely on semantic versioning provided by the dependency
> >     to not
> >      > break people. If the major version is 0, we assume the dependency
> is
> >      > unstable and use the full version as part of the package string
> >     during
> >      > relocation.
> >      >
> >      > The downside of using the full version string for relocated
> packages:
> >      > 1) Users will end up with multiple copies of dependencies that
> >     differ
> >      > only by the minor or patch version increasing the size of their
> >     application.
> >      > 2) Bumping up the version of a dependency now requires the import
> >      > statement in all java files to be updated (not too difficult with
> >     some
> >      > sed/grep skills)
> >      >
> >      > The upside of using the full version string in the relocated
> package:
> >      > 1) We don't have to worry about whether a dependency maintains
> >     semantic
> >      > versioning which means our users won't have to worry about that
> >     either.
> >      > 2) This increases the odds that a user will load multiple slightly
> >      > different versions of the same dependency which is known to be
> >      > incompatible in certain situations (e.g. Netty 4.1.25 can't be on
> >     the
> >      > classpath with Netty 4.1.28 even though they are both shaded due
> to
> >      > issues of how JNI with tcnative works).
> >      >
> >      >
> >      >         (side note: what do other projects like Flink do?)
> >      >
> >      >         Kenn
> >      >
> >      >         *for generated proto classes, they need to be altered
> after
> >      >         being generated so shading happens there, but actually
> only
> >      >         relocation and the shared vendored dep should work
> >     elsewhere in
> >      >         the project
> >      >
> >      >
> >      > We could publish the protos and treat them as "external"
> >     dependencies
> >      > within the Java projects which would also remove this pain point.
> >
>

Reply via email to