Re: [DISCUSS] [Java] Private shaded dependency uber jars

Reuven Lax Fri, 02 Feb 2018 08:50:11 -0800

TypeToken is not trivial. I've written code to do what TypeToken does
before (figuring out generic ancestor types). It's actually somewhat
tricky, and the code I wrote had subtle bugs in it; eventually we removed
this code in favor of Guava's implementation :)


On Fri, Feb 2, 2018 at 7:47 AM, Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:

> Yep, note I never said to reinvent the wheel, we can copy it from guava,
> openwebbeans or any other impl. Point was more to avoid to depend on
> something we don't own for that which is after all not that much code. I
> also think we can limit it a lot to align it on what is supported by beam
> (I'm thinking to coders here) but this can be another topic.
>
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> <https://rmannibucau.metawerx.net/> | Old Blog
> <http://rmannibucau.wordpress.com> | Github
> <https://github.com/rmannibucau> | LinkedIn
> <https://www.linkedin.com/in/rmannibucau> | Book
> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>
> 2018-02-02 16:33 GMT+01:00 Kenneth Knowles <k...@google.com>:
>
>> On Fri, Feb 2, 2018 at 7:18 AM, Romain Manni-Bucau <rmannibu...@gmail.com
>> > wrote:
>>
>>> Don't forget beam doesn't support much behind it (mainly only a few
>>> ParameterizedType due the usage code path) so it is mainly only about
>>> handling parameterized types and typevariables recursively. Not a lot of
>>> work. Main concern being it is in the API so using a shade as an API is
>>> never a good idea, in particular cause the shade can be broken and requires
>>> to setup clirr or things like that and when it breaks you are done and need
>>> to fork it anyway. Limiting the dependencies for an API - as beam is - is
>>> always saner even if it requires a small fork of code.
>>>
>>
>> The main thing that TypeToken is used for is capturing generics that are
>> lost by Java reflection. It is a bit tricky, actually.
>>
>> Kenn
>>
>>
>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-02-02 15:49 GMT+01:00 Kenneth Knowles <k...@google.com>:
>>>
>>>> On Fri, Feb 2, 2018 at 6:41 AM, Romain Manni-Bucau <
>>>> rmannibu...@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> 2018-02-02 15:37 GMT+01:00 Kenneth Knowles <k...@google.com>:
>>>>>
>>>>>> Another couple:
>>>>>>
>>>>>>  - User-facing TypeDescriptor is a very thin wrapper on Guava's
>>>>>> TypeToken
>>>>>>
>>>>>
>>>>> Technically reflect Type is enough
>>>>>
>>>>
>>>> If you want to try to remove TypeToken from underneath TypeDescriptor,
>>>> I have no objections as long as you expand the test suite to cover all the
>>>> functionality where we trust TypeToken's tests. Good luck :-)
>>>>
>>>> Kenn
>>>>
>>>>
>>>>>
>>>>>
>>>>>>  - ImmutableList and friends and their builders are very widely used
>>>>>> and IMO still add a lot for readability and preventing someone coming 
>>>>>> along
>>>>>> and adding mistakes to a codebase
>>>>>>
>>>>>
>>>>> Sugar but not required. When you compare the cost of a shade versus of
>>>>> duplicating the parts we need there is no real match IMHO.
>>>>>
>>>>>
>>>>>>
>>>>>> So considering it all, I would keep a vendored Guava (but also move
>>>>>> off where we can, and also have our own improvements). I hope it will be 
>>>>>> a
>>>>>> near-empty build file to generate it so not a maintenance burden.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Thu, Feb 1, 2018 at 8:44 PM, Kenneth Knowles <k...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Nice. This sounds like a great idea (or two?) and goes along with
>>>>>>> what I just started for futures.
>>>>>>>
>>>>>>> Guava: filed https://issues.apache.org/jira/browse/BEAM-3606 and
>>>>>>> assigned to Ismaël :-) and converted my futures thing to a subtask.
>>>>>>>
>>>>>>> Specific things for our micro guava:
>>>>>>>
>>>>>>>  - checkArgumentNotNull can throw the right exception
>>>>>>>  - our own Optional because Java's is not Serializable
>>>>>>>  - futures combinators since many are missing, especially Java's
>>>>>>> don't do exceptions right
>>>>>>>
>>>>>>> Protobuf: didn't file an issue because I'm not sure
>>>>>>>
>>>>>>> I was wondering if pre-shading works. We really need to get rid of
>>>>>>> it from public APIs in a 100% reliable way. It is also a problem for
>>>>>>> Dataflow. I was wondering if one approach is to pre-shade
>>>>>>> gcpio-protobuf-java, gcpio-protobuf-java-util, gcpio-grpc-java, etc.
>>>>>>> Anything that needs to take a Message object. (and do the same for
>>>>>>> beam-model-protobuf-java since the model bits have to depend on each 
>>>>>>> other
>>>>>>> but nothing else).
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Thu, Feb 1, 2018 at 1:56 AM, Ismaël Mejía <ieme...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Huge +1 to get rid of Guava!
>>>>>>>>
>>>>>>>> This solves annoying dependency issues for some IOs and allow us to
>>>>>>>> get rid of the shading that makes current jars bigger than they
>>>>>>>> should
>>>>>>>> be.
>>>>>>>>
>>>>>>>> We can create our own 'micro guava' package with some classes for
>>>>>>>> things that are hard to migrate, or that we  prefer to still have
>>>>>>>> like
>>>>>>>> the check* methods for example. Given the size of the task we should
>>>>>>>> probably divide it into subtasks, more important is to get rid of it
>>>>>>>> for 'sdks/java/core'. We can then attack other areas afterwards.
>>>>>>>>
>>>>>>>> Other important idea would be to get rid of Protobuf in public APIs
>>>>>>>> like GCPIO and to better shade it from leaking into the runners. An
>>>>>>>> unexpected side effect of this is a leak of netty via gRPC/protobuf
>>>>>>>> that is byting us for the Spark runner, but well that's worth a
>>>>>>>> different discussion.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Feb 1, 2018 at 10:08 AM, Romain Manni-Bucau
>>>>>>>> <rmannibu...@gmail.com> wrote:
>>>>>>>> > a map of list is fine and not a challenge we'll face long I hope
>>>>>>>> ;)
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Romain Manni-Bucau
>>>>>>>> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn
>>>>>>>> >
>>>>>>>> > 2018-02-01 9:40 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>>> >>
>>>>>>>> >> Not sure we'll be able to replace them all. Things like guava
>>>>>>>> Table and
>>>>>>>> >> Multimap don't have great replacements in Java8.
>>>>>>>> >>
>>>>>>>> >> On Wed, Jan 31, 2018 at 10:11 PM, Jean-Baptiste Onofré <
>>>>>>>> j...@nanthrax.net>
>>>>>>>> >> wrote:
>>>>>>>> >>>
>>>>>>>> >>> +1, it was on my TODO for a while waiting the Java8 update.
>>>>>>>> >>>
>>>>>>>> >>> Regards
>>>>>>>> >>> JB
>>>>>>>> >>>
>>>>>>>> >>> On 02/01/2018 06:56 AM, Romain Manni-Bucau wrote:
>>>>>>>> >>> > Why not dropping guava for all beam codebase? With java 8 it
>>>>>>>> is quite
>>>>>>>> >>> > easy to do
>>>>>>>> >>> > it and avoid a bunch of conflicts. Did it in 2 projects with
>>>>>>>> quite a
>>>>>>>> >>> > good result.
>>>>>>>> >>> >
>>>>>>>> >>> > Le 1 févr. 2018 06:50, "Lukasz Cwik" <lc...@google.com
>>>>>>>> >>> > <mailto:lc...@google.com>> a écrit :
>>>>>>>> >>> >
>>>>>>>> >>> >     Make sure to include the guava version in the artifact
>>>>>>>> name so that
>>>>>>>> >>> > we can
>>>>>>>> >>> >     have multiple vendored versions.
>>>>>>>> >>> >
>>>>>>>> >>> >     On Wed, Jan 31, 2018 at 9:16 PM, Kenneth Knowles <
>>>>>>>> k...@google.com
>>>>>>>> >>> >     <mailto:k...@google.com>> wrote:
>>>>>>>> >>> >
>>>>>>>> >>> >         I didn't have time for this, but it just bit me. We
>>>>>>>> definitely
>>>>>>>> >>> > have
>>>>>>>> >>> >         Guava on the API surface of runner support code in
>>>>>>>> ways that
>>>>>>>> >>> > get
>>>>>>>> >>> >         incompatibly shaded. I will probably start "1a" by
>>>>>>>> making a
>>>>>>>> >>> > shaded
>>>>>>>> >>> >         library org.apache.beam:vendored-guava and starting
>>>>>>>> to use it.
>>>>>>>> >>> > It sounds
>>>>>>>> >>> >         like there is generally unanimous support for that
>>>>>>>> much,
>>>>>>>> >>> > anyhow.
>>>>>>>> >>> >
>>>>>>>> >>> >         Kenn
>>>>>>>> >>> >
>>>>>>>> >>> >         On Wed, Dec 13, 2017 at 8:31 AM, Aljoscha Krettek
>>>>>>>> >>> > <aljos...@apache.org
>>>>>>>> >>> >         <mailto:aljos...@apache.org>> wrote:
>>>>>>>> >>> >
>>>>>>>> >>> >             Thanks Ismaël for bringing up this discussion
>>>>>>>> again!
>>>>>>>> >>> >
>>>>>>>> >>> >             I would be in favour of 1) and more specifically
>>>>>>>> of 1a)
>>>>>>>> >>> >
>>>>>>>> >>> >             Aljoscha
>>>>>>>> >>> >
>>>>>>>> >>> >
>>>>>>>> >>> >>             On 12. Dec 2017, at 18:56, Lukasz Cwik <
>>>>>>>> lc...@google.com
>>>>>>>> >>> >>             <mailto:lc...@google.com>> wrote:
>>>>>>>> >>> >>
>>>>>>>> >>> >>             You can always run tests on post shaded
>>>>>>>> artifacts instead
>>>>>>>> >>> >> of the
>>>>>>>> >>> >>             compiled classes, it just requires us to change
>>>>>>>> our maven
>>>>>>>> >>> >> surefire
>>>>>>>> >>> >>             / gradle test configurations but it is true that
>>>>>>>> most IDEs
>>>>>>>> >>> >> would
>>>>>>>> >>> >>             behave better with a dependency jar unless you
>>>>>>>> delegate
>>>>>>>> >>> >> all the
>>>>>>>> >>> >>             build/test actions to the build system and then
>>>>>>>> it won't
>>>>>>>> >>> >> matter.
>>>>>>>> >>> >>
>>>>>>>> >>> >>             On Mon, Dec 11, 2017 at 9:05 PM, Kenneth Knowles
>>>>>>>> >>> >> <k...@google.com
>>>>>>>> >>> >>             <mailto:k...@google.com>> wrote:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                 There's also, with additional overhead,
>>>>>>>> >>> >>
>>>>>>>> >>> >>                 1a) A relocated and shipped package for each
>>>>>>>> thing we
>>>>>>>> >>> >> want to
>>>>>>>> >>> >>                 relocate. I think this has also been tried
>>>>>>>> outside
>>>>>>>> >>> >> Beam...
>>>>>>>> >>> >>
>>>>>>>> >>> >>                 Pros:
>>>>>>>> >>> >>                 * all the pros of 1) plus no bloat beyond
>>>>>>>> what is
>>>>>>>> >>> >> necessary
>>>>>>>> >>> >>                 Cons:
>>>>>>>> >>> >>                 * abandons whitelist approach for public
>>>>>>>> deps,
>>>>>>>> >>> >> reverting to
>>>>>>>> >>> >>                 blacklist approach for trouble things like
>>>>>>>> guava, so a
>>>>>>>> >>> >> bit
>>>>>>>> >>> >>                 less principled
>>>>>>>> >>> >>
>>>>>>>> >>> >>                 For both 1) and 1a) I would add:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                 Pros:
>>>>>>>> >>> >>                 * clearly readable dependency since code
>>>>>>>> will `import
>>>>>>>> >>> >>                 org.apache.beam.private.guava21` and IDEs
>>>>>>>> will
>>>>>>>> >>> >> understand it
>>>>>>>> >>> >>                 is a distinct lilbrary
>>>>>>>> >>> >>                 * can run tests on unpackaged classes, as
>>>>>>>> long as deps
>>>>>>>> >>> >> are
>>>>>>>> >>> >>                 shaded or provided as jars
>>>>>>>> >>> >>                 * no mysterious action at a distance from
>>>>>>>> inherited
>>>>>>>> >>> >> configuration
>>>>>>>> >>> >>                 Cons:
>>>>>>>> >>> >>                 * need to adjust imports
>>>>>>>> >>> >>
>>>>>>>> >>> >>                 Kenn
>>>>>>>> >>> >>
>>>>>>>> >>> >>                 On Mon, Dec 11, 2017 at 9:57 AM, Lukasz Cwik
>>>>>>>> >>> >> <lc...@google.com
>>>>>>>> >>> >>                 <mailto:lc...@google.com>> wrote:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                     I would suggest that either we use:
>>>>>>>> >>> >>                     1) A common deps package containing
>>>>>>>> shaded
>>>>>>>> >>> >> dependencies
>>>>>>>> >>> >>                     allows for
>>>>>>>> >>> >>                     Pros
>>>>>>>> >>> >>                     * doesn't require the user to build an
>>>>>>>> uber jar
>>>>>>>> >>> >>                     Risks
>>>>>>>> >>> >>                     * dependencies package will keep growing
>>>>>>>> even if
>>>>>>>> >>> >> something
>>>>>>>> >>> >>                     is or isn't needed by all of Apache Beam
>>>>>>>> leading
>>>>>>>> >>> >> to a
>>>>>>>> >>> >>                     large jar anyways negating any space
>>>>>>>> savings
>>>>>>>> >>> >>
>>>>>>>> >>> >>                     2) Shade within each module to a common
>>>>>>>> location
>>>>>>>> >>> >> like
>>>>>>>> >>> >>                     org.apache.beam.relocated.guava....
>>>>>>>> >>> >>                     Pros
>>>>>>>> >>> >>                     * you only get the shaded dependencies
>>>>>>>> of the
>>>>>>>> >>> >> things that
>>>>>>>> >>> >>                     are required
>>>>>>>> >>> >>                     * its one less dependency for users to
>>>>>>>> manage
>>>>>>>> >>> >>                     Risks
>>>>>>>> >>> >>                     * requires an uber jar to be built to
>>>>>>>> get the
>>>>>>>> >>> >> space
>>>>>>>> >>> >>                     savings (either by a user or a
>>>>>>>> distribution of
>>>>>>>> >>> >> Apache
>>>>>>>> >>> >>                     Beam) otherwise we negate any space
>>>>>>>> savings.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                     If we either use a common relocation
>>>>>>>> scheme or a
>>>>>>>> >>> >>                     dependencies jar then each relocation
>>>>>>>> should
>>>>>>>> >>> >> specifically
>>>>>>>> >>> >>                     contain the version number of the
>>>>>>>> package because
>>>>>>>> >>> >> we would
>>>>>>>> >>> >>                     like to allow for us to be using two
>>>>>>>> different
>>>>>>>> >>> >> versions of
>>>>>>>> >>> >>                     the same library.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                     For the common deps package approach,
>>>>>>>> should we
>>>>>>>> >>> >> check in
>>>>>>>> >>> >>                     code where the imports contain the
>>>>>>>> relocated
>>>>>>>> >>> >> location
>>>>>>>> >>> >>                     (e.g. import
>>>>>>>> org.apache.beam.guava.20.0.com
>>>>>>>> >>> >>
>>>>>>>> >>> >> <http://org.apache.beam.guava.20.0.com/>.google.common.colle
>>>>>>>> ct.ImmutableList)?
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >>                     On Mon, Dec 11, 2017 at 8:47 AM,
>>>>>>>> Jean-Baptiste
>>>>>>>> >>> >> Onofré
>>>>>>>> >>> >>                     <j...@nanthrax.net 
>>>>>>>> >>> >> <mailto:j...@nanthrax.net>>
>>>>>>>> wrote:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                         Thanks for bringing that back.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                         Indeed guava is shaded in different
>>>>>>>> uber-jar.
>>>>>>>> >>> >> Maybe we
>>>>>>>> >>> >>                         can have a common deps module that
>>>>>>>> we include
>>>>>>>> >>> >> once
>>>>>>>> >>> >>                         (but the user will have to
>>>>>>>> explicitly define
>>>>>>>> >>> >> the dep) ?
>>>>>>>> >>> >>
>>>>>>>> >>> >>                         Basically, what do you propose for
>>>>>>>> protobuf
>>>>>>>> >>> >>                         (unfortunately, I don't see an
>>>>>>>> obvious) ?
>>>>>>>> >>> >>
>>>>>>>> >>> >>                         Regards
>>>>>>>> >>> >>                         JB
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >>                         On 12/11/2017 05:35 PM, Ismaël Mejía
>>>>>>>> wrote:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             Hello, I wanted to bring back
>>>>>>>> this subject
>>>>>>>> >>> >> because
>>>>>>>> >>> >>                             I think we should
>>>>>>>> >>> >>                             take action on this and at least
>>>>>>>> first
>>>>>>>> >>> >> have a
>>>>>>>> >>> >>                             shaded version of guava.
>>>>>>>> >>> >>                             I was playing with a toy project
>>>>>>>> and I did
>>>>>>>> >>> >> the
>>>>>>>> >>> >>                             procedure we use to
>>>>>>>> >>> >>                             submit jars to a Hadoop cluster
>>>>>>>> via
>>>>>>>> >>> >> Flink/Spark
>>>>>>>> >>> >>                             which involves
>>>>>>>> >>> >>                             creating an uber jar and I
>>>>>>>> realized that
>>>>>>>> >>> >> the size
>>>>>>>> >>> >>                             of the jar was way
>>>>>>>> >>> >>                             bigger than I expected, and the
>>>>>>>> fact that
>>>>>>>> >>> >> we shade
>>>>>>>> >>> >>                             guava in every
>>>>>>>> >>> >>                             module contributes to this. I
>>>>>>>> found guava
>>>>>>>> >>> >> shaded on:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             sdks/java/core
>>>>>>>> >>> >>                             runners/core-construction-java
>>>>>>>> >>> >>                             runners/core-java
>>>>>>>> >>> >>                             model/job-management
>>>>>>>> >>> >>                             runners/spark
>>>>>>>> >>> >>                             sdks/java/io/hadoop-file-system
>>>>>>>> >>> >>                             sdks/java/io/kafka
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             This means at least 6 times more
>>>>>>>> of the
>>>>>>>> >>> >> size it
>>>>>>>> >>> >>                             should which counts in
>>>>>>>> >>> >>                             around 15MB more (2.4MB*6 deps)
>>>>>>>> of extra
>>>>>>>> >>> >> weight
>>>>>>>> >>> >>                             that we can simply
>>>>>>>> >>> >>                             reduce by using a shaded version
>>>>>>>> of the
>>>>>>>> >>> >> library.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             I add this point to the previous
>>>>>>>> ones
>>>>>>>> >>> >> mentioned
>>>>>>>> >>> >>                             during the discussion
>>>>>>>> >>> >>                             because this goes against the
>>>>>>>> end user
>>>>>>>> >>> >> experience
>>>>>>>> >>> >>                             and affects us all
>>>>>>>> >>> >>                             (devs/users).
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             Another question is if we should
>>>>>>>> shade
>>>>>>>> >>> >> (and how)
>>>>>>>> >>> >>                             protocol buffers
>>>>>>>> >>> >>                             because now with the portability
>>>>>>>> work we
>>>>>>>> >>> >> are
>>>>>>>> >>> >>                             exposing it closer to the
>>>>>>>> >>> >>                             end users. I say this because I
>>>>>>>> also found
>>>>>>>> >>> >> an
>>>>>>>> >>> >>                             issue while running a
>>>>>>>> >>> >>                             job on YARN with the spark
>>>>>>>> runner because
>>>>>>>> >>> >>                             hadoop-common includes
>>>>>>>> >>> >>                             protobuf-java 2 and I had to
>>>>>>>> explicitly
>>>>>>>> >>> >> provide
>>>>>>>> >>> >>                             protocol-buffers 3 as
>>>>>>>> >>> >>                             a dependency to be able to use
>>>>>>>> triggers
>>>>>>>> >>> >> (note the
>>>>>>>> >>> >>                             Spark runner
>>>>>>>> >>> >>                             translates them using some
>>>>>>>> method from
>>>>>>>> >>> >>                             runners/core-java). Since
>>>>>>>> >>> >>                             hadoop-common is provided in the
>>>>>>>> cluster
>>>>>>>> >>> >> with the
>>>>>>>> >>> >>                             older version of
>>>>>>>> >>> >>                             protobuf, I am afraid that this
>>>>>>>> will bite
>>>>>>>> >>> >> us in
>>>>>>>> >>> >>                             the future.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             Ismaël
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             ps. There is already a JIRA for
>>>>>>>> that
>>>>>>>> >>> >> shading for
>>>>>>>> >>> >>                             protobuf on
>>>>>>>> >>> >>                             hadoop-common but this is not
>>>>>>>> coming until
>>>>>>>> >>> >> version
>>>>>>>> >>> >>                             3 is out.
>>>>>>>> >>> >>
>>>>>>>> >>> >> https://issues.apache.org/jira/browse/HADOOP-13136
>>>>>>>> >>> >>
>>>>>>>> >>> >> <https://issues.apache.org/jira/browse/HADOOP-13136>
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             ps2. Extra curious situation is
>>>>>>>> to see
>>>>>>>> >>> >> that the
>>>>>>>> >>> >>                             dataflow-runner ends
>>>>>>>> >>> >>                             up having guava shaded twice via
>>>>>>>> its
>>>>>>>> >>> >> shaded version on
>>>>>>>> >>> >>                             core-construction-java.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             ps3. Of course this message
>>>>>>>> means a
>>>>>>>> >>> >> de-facto +1 at
>>>>>>>> >>> >>                             least to do it for
>>>>>>>> >>> >>                             guava and evaluate it for other
>>>>>>>> libs.
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >>                             On Tue, Oct 17, 2017 at 7:29 PM,
>>>>>>>> Lukasz
>>>>>>>> >>> >> Cwik
>>>>>>>> >>> >>                             <lc...@google.com.invalid
>>>>>>>> >>> >>                             <mailto:lc...@google.com.invalid>>
>>>>>>>> wrote:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                 An issue to call out is how
>>>>>>>> to deal
>>>>>>>> >>> >> with our
>>>>>>>> >>> >>                                 generated code (.avro and
>>>>>>>> >>> >>                                 .proto) as I don't believe
>>>>>>>> those
>>>>>>>> >>> >> plugins allow
>>>>>>>> >>> >>                                 you to generate code using a
>>>>>>>> >>> >>                                 shaded package prefix on
>>>>>>>> imports.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                 On Tue, Oct 17, 2017 at
>>>>>>>> 10:28 AM,
>>>>>>>> >>> >> Thomas Groh
>>>>>>>> >>> >>                                 <tg...@google.com.invalid
>>>>>>>> >>> >>                                 <mailto:
>>>>>>>> tg...@google.com.invalid>>
>>>>>>>> >>> >>                                 wrote:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                     +1 to the goal. I'm
>>>>>>>> hugely in
>>>>>>>> >>> >> favor of not
>>>>>>>> >>> >>                                     doing the same shading
>>>>>>>> work
>>>>>>>> >>> >>                                     every time for
>>>>>>>> dependencies we
>>>>>>>> >>> >> know we'll use.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                     This also means that if
>>>>>>>> we end up
>>>>>>>> >>> >> pulling
>>>>>>>> >>> >>                                     in transitive
>>>>>>>> dependencies we
>>>>>>>> >>> >>                                     don't want in any
>>>>>>>> particular
>>>>>>>> >>> >> module we can
>>>>>>>> >>> >>                                     avoid having to adjust
>>>>>>>> our
>>>>>>>> >>> >>                                     repackaging strategy for
>>>>>>>> that
>>>>>>>> >>> >> module -
>>>>>>>> >>> >>                                     which I have run into
>>>>>>>> face-first
>>>>>>>> >>> >> in
>>>>>>>> >>> >>                                     the past.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                     On Tue, Oct 17, 2017 at
>>>>>>>> 9:48 AM,
>>>>>>>> >>> >> Kenneth
>>>>>>>> >>> >>                                     Knowles
>>>>>>>> <k...@google.com.invalid
>>>>>>>> >>> >>                                     <mailto:
>>>>>>>> k...@google.com.invalid>>
>>>>>>>> >>> >>                                     wrote:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         Hi all,
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         Shading is a big
>>>>>>>> part of how
>>>>>>>> >>> >> we keep
>>>>>>>> >>> >>                                         our dependencies
>>>>>>>> sane in Beam.
>>>>>>>> >>> >> But
>>>>>>>> >>> >>                                         downsides: shading
>>>>>>>> is super
>>>>>>>> >>> >> slow,
>>>>>>>> >>> >>                                         causes massive jar
>>>>>>>> bloat, and
>>>>>>>> >>> >> kind of
>>>>>>>> >>> >>                                         hard to get right
>>>>>>>> because
>>>>>>>> >>> >> artifacts
>>>>>>>> >>> >>                                         and namespaces are
>>>>>>>> not 1-to-1.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         I know that some
>>>>>>>> communities
>>>>>>>> >>> >>                                         distribute their own
>>>>>>>> shaded
>>>>>>>> >>> >>                                         distributions of
>>>>>>>> >>> >>                                         dependencies. I had
>>>>>>>> a thought
>>>>>>>> >>> >> about
>>>>>>>> >>> >>                                         doing something
>>>>>>>> similar that I
>>>>>>>> >>> >> wanted
>>>>>>>> >>> >>                                         to throw out there
>>>>>>>> for people
>>>>>>>> >>> >> to poke
>>>>>>>> >>> >>                                         holes in.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         To set the scene,
>>>>>>>> here is how
>>>>>>>> >>> >> I view
>>>>>>>> >>> >>                                         shading:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                           - A module has
>>>>>>>> public
>>>>>>>> >>> >> dependencies
>>>>>>>> >>> >>                                         and private
>>>>>>>> dependencies.
>>>>>>>> >>> >>                                           - Public deps are
>>>>>>>> used for
>>>>>>>> >>> >> data
>>>>>>>> >>> >>                                         interchange; users
>>>>>>>> must share
>>>>>>>> >>> >> these
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                     deps.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                           - Private deps are
>>>>>>>> just
>>>>>>>> >>> >>                                         functionality and
>>>>>>>> can be
>>>>>>>> >>> >> hidden (in
>>>>>>>> >>> >>                                         our case,
>>>>>>>> >>> >>                                         relocated + bundled)
>>>>>>>> >>> >>                                           - It isn't
>>>>>>>> necessarily that
>>>>>>>> >>> >> simple,
>>>>>>>> >>> >>                                         because public and
>>>>>>>> private
>>>>>>>> >>> >> deps
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                     might
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         interact in
>>>>>>>> higher-order ways
>>>>>>>> >>> >>                                         ("public" is
>>>>>>>> contagious)
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         Shading is an
>>>>>>>> implementation
>>>>>>>> >>> >> detail of
>>>>>>>> >>> >>                                         expressing these
>>>>>>>> >>> >> characteristics.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                     We
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         use shading
>>>>>>>> selectively
>>>>>>>> >>> >> because of its
>>>>>>>> >>> >>                                         downsides I
>>>>>>>> mentioned above.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         But what about this
>>>>>>>> idea:
>>>>>>>> >>> >> Introduce
>>>>>>>> >>> >>                                         shaded deps as a
>>>>>>>> single
>>>>>>>> >>> >> separate
>>>>>>>> >>> >>                                         artifact.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                           -
>>>>>>>> sdks/java/private-deps:
>>>>>>>> >>> >> bundled
>>>>>>>> >>> >>                                         uber jar with
>>>>>>>> relocated
>>>>>>>> >>> >> versions of
>>>>>>>> >>> >>                                         everything we want
>>>>>>>> to shade
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                           - sdks/java/core
>>>>>>>> and
>>>>>>>> >>> >>                                         sdks/java/harness: no
>>>>>>>> >>> >> relocation or
>>>>>>>> >>> >>                                         bundling -
>>>>>>>> >>> >>                                         depends on
>>>>>>>> >>> >>
>>>>>>>>  `beam-sdks-java-private-deps`
>>>>>>>> >>> >> and
>>>>>>>> >>> >>                                         imports like
>>>>>>>> >>> >>
>>>>>>>> >>> >> `org.apache.beam.sdk.private.com.google.common`
>>>>>>>> >>> >>                                         directly (this is
>>>>>>>> what
>>>>>>>> >>> >>                                         they
>>>>>>>> >>> >>                                         are rewritten to
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         Some benefits
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                           - much faster
>>>>>>>> builds of
>>>>>>>> >>> >> other modules
>>>>>>>> >>> >>                                           - only one shaded
>>>>>>>> uber jar
>>>>>>>> >>> >>                                           - rare/no rebuilds
>>>>>>>> of the
>>>>>>>> >>> >> uber jar
>>>>>>>> >>> >>                                           - can use maven
>>>>>>>> enforcer to
>>>>>>>> >>> >> forbid
>>>>>>>> >>> >>                                         imports like
>>>>>>>> com.google.common
>>>>>>>> >>> >>                                           - configuration
>>>>>>>> all in one
>>>>>>>> >>> >> place
>>>>>>>> >>> >>                                           - no automated
>>>>>>>> rewriting of
>>>>>>>> >>> >> our real
>>>>>>>> >>> >>                                         code, which has led
>>>>>>>> to some
>>>>>>>> >>> >> major
>>>>>>>> >>> >>                                         confusion
>>>>>>>> >>> >>                                           - easy to implement
>>>>>>>> >>> >> incrementally
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         Downsides:
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                           - plenty of effort
>>>>>>>> work to
>>>>>>>> >>> >> get there
>>>>>>>> >>> >>                                           - unclear how many
>>>>>>>> different
>>>>>>>> >>> >> such
>>>>>>>> >>> >>                                         deps modules we
>>>>>>>> need; sharing
>>>>>>>> >>> >> them
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                     could
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         get weird
>>>>>>>> >>> >>                                           - if we hit a
>>>>>>>> roadblock, we
>>>>>>>> >>> >> will
>>>>>>>> >>> >>                                         have committed a lot
>>>>>>>> of time
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         Just something I was
>>>>>>>> musing as
>>>>>>>> >>> >> I spent
>>>>>>>> >>> >>                                         another evening
>>>>>>>> waiting for
>>>>>>>> >>> >> slow
>>>>>>>> >>> >>                                         builds to try to
>>>>>>>> confirm
>>>>>>>> >>> >> changes to
>>>>>>>> >>> >>                                         brittle poms.
>>>>>>>> >>> >>
>>>>>>>> >>> >>                                         Kenn
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >>                         --
>>>>>>>> >>> >>                         Jean-Baptiste Onofré
>>>>>>>> >>> >>                         jbono...@apache.org
>>>>>>>> >>> >> <mailto:jbono...@apache.org>
>>>>>>>> >>> >>                         http://blog.nanthrax.net
>>>>>>>> >>> >> <http://blog.nanthrax.net/>
>>>>>>>> >>> >>                         Talend - http://www.talend.com
>>>>>>>> >>> >> <http://www.talend.com/>
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >>
>>>>>>>> >>> >
>>>>>>>> >>> >
>>>>>>>> >>> >
>>>>>>>> >>>
>>>>>>>> >>> --
>>>>>>>> >>> Jean-Baptiste Onofré
>>>>>>>> >>> jbono...@apache.org
>>>>>>>> >>> http://blog.nanthrax.net
>>>>>>>> >>> Talend - http://www.talend.com
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: [DISCUSS] [Java] Private shaded dependency uber jars

Reply via email to