+1 for separate artifacts

I would request that we explicitly discuss and agree which dependencies we
vendor though.

Not everything listed in the JIRA subtasks is currently relocated.

Thomas


On Tue, Oct 23, 2018 at 8:04 AM David Morávek <[email protected]>
wrote:

> +1 This should improve build times a lot. It would be great if vendored
> deps could stay in the main repository.
>
> D.
>
> On Tue, Oct 23, 2018 at 12:21 PM Maximilian Michels <[email protected]>
> wrote:
>
>> Looks great, Kenn!
>>
>> > Max: what is the story behind having a separate flink-shaded repo? Did
>> that make it easier to manage in some way?
>>
>> Better separation of concerns, but I don't think releasing the shaded
>> artifacts from the main repo is a problem. I'd even prefer not to split
>> up the repo because it makes updates to the vendored dependencies
>> slightly easier.
>>
>> On 23.10.18 03:25, Kenneth Knowles wrote:
>> > OK, I've filed https://issues.apache.org/jira/browse/BEAM-5819 to
>> > collect sub-tasks. This has enough upsides throughout lots of areas of
>> > the project that even though it is not glamorous it seems pretty
>> > valuable to start on immediately. And I want to find out if there's a
>> > pitfall lurking.
>> >
>> > Max: what is the story behind having a separate flink-shaded repo? Did
>> > that make it easier to manage in some way?
>> >
>> > Kenn
>> >
>> > On Mon, Oct 22, 2018 at 2:55 AM Maximilian Michels <[email protected]
>> > <mailto:[email protected]>> wrote:
>> >
>> >     +1 for publishing vendored Jars independently. It will improve build
>> >     time and ease IntelliJ integration.
>> >
>> >     Flink also publishes shaded dependencies separately:
>> >
>> >     - https://github.com/apache/flink-shaded
>> >     - https://issues.apache.org/jira/browse/FLINK-6529
>> >
>> >     AFAIK their main motivation was to get rid of duplicate shaded
>> classes
>> >     on the classpath. We don't appear to have that problem because we
>> >     already have a separate "vendor" project.
>> >
>> >      >  - With shading, it is hard (impossible?) to step into dependency
>> >     code in IntelliJ's debugger, because the actual symbol at runtime
>> >     does not match what is in the external jars
>> >
>> >     This would be solved by releasing the sources of the shaded jars.
>> >      From a
>> >     legal perspective, this could be problematic as alluded to here:
>> >     https://github.com/apache/flink-shaded/issues/25
>> >
>> >     -Max
>> >
>> >     On 20.10.18 01:11, Lukasz Cwik wrote:
>> >      > I have tried several times to improve the build system and
>> intellij
>> >      > integration and each attempt ended with little progress when
>> dealing
>> >      > with vendored code. My latest attempt has been the most promising
>> >     where
>> >      > I take the vendored classes/jars and decompile them generating
>> the
>> >      > source that Intellij can then use. I have a branch[1] that
>> >     demonstrates
>> >      > the idea. It works pretty well (and up until a change where we
>> >     started
>> >      > vendoring gRPC, was impractical to do. Instructions to try it out
>> >     are:
>> >      >
>> >      > // Clean up any remnants of prior builds/intellij projects
>> >      > git clean -fdx
>> >      > // Generated the source for vendored/shaded modules
>> >      > ./gradlew decompile
>> >      >
>> >      > // Remove the "generated" Java sources for protos so they don't
>> >     conflict with the decompiled sources.
>> >      > rm -rf model/pipeline/build/generated/source/proto
>> >      > rm -rf model/job-management/build/generated/source/proto
>> >      > rm -rf model/fn-execution/build/generated/source/proto
>> >      > // Import the project into Intellij, most code completion now
>> >     works still some issues with a few classes.
>> >      > // Note that the Java decompiler doesn't generate valid source so
>> >     still need to delegate to Gradle for build/run/test actions
>> >      > // Other decompilers may do a better/worse job but haven't tried
>> >     them.
>> >      >
>> >      >
>> >      > The problems that I face are that the generated Java source from
>> the
>> >      > protos and the decompiled source from the compiled version of
>> that
>> >      > source post shading are both being imported as content roots and
>> >     then
>> >      > conflict. Also, the CFR decompiler isn't producing valid source,
>> if
>> >      > people could try others and report their mileage, we may find one
>> >     that
>> >      > works and then we would be able to use intellij to build/run our
>> >     code
>> >      > and not need to delegate all our build/run/test actions to
>> Gradle.
>> >      >
>> >      > After all these attempts I have done, vendoring the dependencies
>> >     outside
>> >      > of the project seems like a sane approach and unless someone
>> >     wants to
>> >      > take a stab at the best progress I have made above, I would go
>> >     with what
>> >      > Kenn is suggesting even though it will mean that we will need to
>> >     perform
>> >      > releases every time we want to change the version of one of our
>> >     vendored
>> >      > dependencies.
>> >      >
>> >      > 1: https://github.com/lukecwik/incubator-beam/tree/intellij
>> >      >
>> >      >
>> >      > On Fri, Oct 19, 2018 at 10:43 AM Kenneth Knowles <
>> [email protected]
>> >     <mailto:[email protected]>
>> >      > <mailto:[email protected] <mailto:[email protected]>>> wrote:
>> >      >
>> >      >     Another reason to push on this is to get build times down.
>> >     Once only
>> >      >     generated proto classes use the shadow plugin we'll cut the
>> build
>> >      >     time in ~half? And there is no reason to constantly
>> re-vendor.
>> >      >
>> >      >     Kenn
>> >      >
>> >      >     On Fri, Oct 19, 2018 at 10:39 AM Kenneth Knowles
>> >     <[email protected] <mailto:[email protected]>
>> >      >     <mailto:[email protected] <mailto:[email protected]>>> wrote:
>> >      >
>> >      >         Hi all,
>> >      >
>> >      >         A while ago we had pretty good consensus that we should
>> >     vendor
>> >      >         ("pre-shade") specific dependencies, and there's start
>> on it
>> >      >         here: https://github.com/apache/beam/tree/master/vendor
>> >      >
>> >      >         IntelliJ notes:
>> >      >
>> >      >           - With shading, it is hard (impossible?) to step into
>> >      >         dependency code in IntelliJ's debugger, because the
>> actual
>> >      >         symbol at runtime does not match what is in the external
>> jars
>> >      >
>> >      >
>> >      > Intellij can step through the classes if they were published
>> >     outside the
>> >      > project since it can decompile them. The source won't be legible.
>> >      > Decompiling the source as in my example effectively shows the
>> >     same issue.
>> >      >
>> >      >           - With vendoring, if the vendored dependencies are part
>> >     of the
>> >      >         project then IntelliJ gets confused because it operates
>> on
>> >      >         source, not the produced jars
>> >      >
>> >      >
>> >      > Yes, I tried several ways to get intellij to ignore the source
>> >     and use
>> >      > the output jars but no luck.
>> >      >
>> >      >         The second one has a quick fix for most cases*: don't
>> >     make the
>> >      >         vendored dep a subproject, but just separately build and
>> >     publish
>> >      >         it. Since a vendored dependency should change much more
>> >      >         infrequently (or if we bake the version into the name,
>> >     ~never)
>> >      >         this means we publish once and save headache and build
>> >     time forever.
>> >      >
>> >      >         WDYT? Have I overlooked something? How about we set up
>> >     vendored
>> >      >         versions of guava, protobuf, grpc, and publish them. We
>> don't
>> >      >         have to actually start using them yet, and can do it
>> >     incrementally.
>> >      >
>> >      >
>> >      > Currently we are relocating code depending on the version string.
>> >     If the
>> >      > major version is >= 1, we use only the major version within the
>> >     package
>> >      > string and rely on semantic versioning provided by the dependency
>> >     to not
>> >      > break people. If the major version is 0, we assume the
>> dependency is
>> >      > unstable and use the full version as part of the package string
>> >     during
>> >      > relocation.
>> >      >
>> >      > The downside of using the full version string for relocated
>> packages:
>> >      > 1) Users will end up with multiple copies of dependencies that
>> >     differ
>> >      > only by the minor or patch version increasing the size of their
>> >     application.
>> >      > 2) Bumping up the version of a dependency now requires the import
>> >      > statement in all java files to be updated (not too difficult with
>> >     some
>> >      > sed/grep skills)
>> >      >
>> >      > The upside of using the full version string in the relocated
>> package:
>> >      > 1) We don't have to worry about whether a dependency maintains
>> >     semantic
>> >      > versioning which means our users won't have to worry about that
>> >     either.
>> >      > 2) This increases the odds that a user will load multiple
>> slightly
>> >      > different versions of the same dependency which is known to be
>> >      > incompatible in certain situations (e.g. Netty 4.1.25 can't be on
>> >     the
>> >      > classpath with Netty 4.1.28 even though they are both shaded due
>> to
>> >      > issues of how JNI with tcnative works).
>> >      >
>> >      >
>> >      >         (side note: what do other projects like Flink do?)
>> >      >
>> >      >         Kenn
>> >      >
>> >      >         *for generated proto classes, they need to be altered
>> after
>> >      >         being generated so shading happens there, but actually
>> only
>> >      >         relocation and the shared vendored dep should work
>> >     elsewhere in
>> >      >         the project
>> >      >
>> >      >
>> >      > We could publish the protos and treat them as "external"
>> >     dependencies
>> >      > within the Java projects which would also remove this pain point.
>> >
>>
>

Reply via email to