TypeToken is not trivial. I've written code to do what TypeToken does before (figuring out generic ancestor types). It's actually somewhat tricky, and the code I wrote had subtle bugs in it; eventually we removed this code in favor of Guava's implementation :)
On Fri, Feb 2, 2018 at 7:47 AM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote: > Yep, note I never said to reinvent the wheel, we can copy it from guava, > openwebbeans or any other impl. Point was more to avoid to depend on > something we don't own for that which is after all not that much code. I > also think we can limit it a lot to align it on what is supported by beam > (I'm thinking to coders here) but this can be another topic. > > > Romain Manni-Bucau > @rmannibucau <https://twitter.com/rmannibucau> | Blog > <https://rmannibucau.metawerx.net/> | Old Blog > <http://rmannibucau.wordpress.com> | Github > <https://github.com/rmannibucau> | LinkedIn > <https://www.linkedin.com/in/rmannibucau> | Book > <https://www.packtpub.com/application-development/java-ee-8-high-performance> > > 2018-02-02 16:33 GMT+01:00 Kenneth Knowles <k...@google.com>: > >> On Fri, Feb 2, 2018 at 7:18 AM, Romain Manni-Bucau <rmannibu...@gmail.com >> > wrote: >> >>> Don't forget beam doesn't support much behind it (mainly only a few >>> ParameterizedType due the usage code path) so it is mainly only about >>> handling parameterized types and typevariables recursively. Not a lot of >>> work. Main concern being it is in the API so using a shade as an API is >>> never a good idea, in particular cause the shade can be broken and requires >>> to setup clirr or things like that and when it breaks you are done and need >>> to fork it anyway. Limiting the dependencies for an API - as beam is - is >>> always saner even if it requires a small fork of code. >>> >> >> The main thing that TypeToken is used for is capturing generics that are >> lost by Java reflection. It is a bit tricky, actually. >> >> Kenn >> >> >> >>> >>> Romain Manni-Bucau >>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>> <https://rmannibucau.metawerx.net/> | Old Blog >>> <http://rmannibucau.wordpress.com> | Github >>> <https://github.com/rmannibucau> | LinkedIn >>> <https://www.linkedin.com/in/rmannibucau> | Book >>> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >>> >>> 2018-02-02 15:49 GMT+01:00 Kenneth Knowles <k...@google.com>: >>> >>>> On Fri, Feb 2, 2018 at 6:41 AM, Romain Manni-Bucau < >>>> rmannibu...@gmail.com> wrote: >>>> >>>>> >>>>> >>>>> 2018-02-02 15:37 GMT+01:00 Kenneth Knowles <k...@google.com>: >>>>> >>>>>> Another couple: >>>>>> >>>>>> - User-facing TypeDescriptor is a very thin wrapper on Guava's >>>>>> TypeToken >>>>>> >>>>> >>>>> Technically reflect Type is enough >>>>> >>>> >>>> If you want to try to remove TypeToken from underneath TypeDescriptor, >>>> I have no objections as long as you expand the test suite to cover all the >>>> functionality where we trust TypeToken's tests. Good luck :-) >>>> >>>> Kenn >>>> >>>> >>>>> >>>>> >>>>>> - ImmutableList and friends and their builders are very widely used >>>>>> and IMO still add a lot for readability and preventing someone coming >>>>>> along >>>>>> and adding mistakes to a codebase >>>>>> >>>>> >>>>> Sugar but not required. When you compare the cost of a shade versus of >>>>> duplicating the parts we need there is no real match IMHO. >>>>> >>>>> >>>>>> >>>>>> So considering it all, I would keep a vendored Guava (but also move >>>>>> off where we can, and also have our own improvements). I hope it will be >>>>>> a >>>>>> near-empty build file to generate it so not a maintenance burden. >>>>>> >>>>>> Kenn >>>>>> >>>>>> On Thu, Feb 1, 2018 at 8:44 PM, Kenneth Knowles <k...@google.com> >>>>>> wrote: >>>>>> >>>>>>> Nice. This sounds like a great idea (or two?) and goes along with >>>>>>> what I just started for futures. >>>>>>> >>>>>>> Guava: filed https://issues.apache.org/jira/browse/BEAM-3606 and >>>>>>> assigned to Ismaël :-) and converted my futures thing to a subtask. >>>>>>> >>>>>>> Specific things for our micro guava: >>>>>>> >>>>>>> - checkArgumentNotNull can throw the right exception >>>>>>> - our own Optional because Java's is not Serializable >>>>>>> - futures combinators since many are missing, especially Java's >>>>>>> don't do exceptions right >>>>>>> >>>>>>> Protobuf: didn't file an issue because I'm not sure >>>>>>> >>>>>>> I was wondering if pre-shading works. We really need to get rid of >>>>>>> it from public APIs in a 100% reliable way. It is also a problem for >>>>>>> Dataflow. I was wondering if one approach is to pre-shade >>>>>>> gcpio-protobuf-java, gcpio-protobuf-java-util, gcpio-grpc-java, etc. >>>>>>> Anything that needs to take a Message object. (and do the same for >>>>>>> beam-model-protobuf-java since the model bits have to depend on each >>>>>>> other >>>>>>> but nothing else). >>>>>>> >>>>>>> Kenn >>>>>>> >>>>>>> On Thu, Feb 1, 2018 at 1:56 AM, Ismaël Mejía <ieme...@gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>>> Huge +1 to get rid of Guava! >>>>>>>> >>>>>>>> This solves annoying dependency issues for some IOs and allow us to >>>>>>>> get rid of the shading that makes current jars bigger than they >>>>>>>> should >>>>>>>> be. >>>>>>>> >>>>>>>> We can create our own 'micro guava' package with some classes for >>>>>>>> things that are hard to migrate, or that we prefer to still have >>>>>>>> like >>>>>>>> the check* methods for example. Given the size of the task we should >>>>>>>> probably divide it into subtasks, more important is to get rid of it >>>>>>>> for 'sdks/java/core'. We can then attack other areas afterwards. >>>>>>>> >>>>>>>> Other important idea would be to get rid of Protobuf in public APIs >>>>>>>> like GCPIO and to better shade it from leaking into the runners. An >>>>>>>> unexpected side effect of this is a leak of netty via gRPC/protobuf >>>>>>>> that is byting us for the Spark runner, but well that's worth a >>>>>>>> different discussion. >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Feb 1, 2018 at 10:08 AM, Romain Manni-Bucau >>>>>>>> <rmannibu...@gmail.com> wrote: >>>>>>>> > a map of list is fine and not a challenge we'll face long I hope >>>>>>>> ;) >>>>>>>> > >>>>>>>> > >>>>>>>> > Romain Manni-Bucau >>>>>>>> > @rmannibucau | Blog | Old Blog | Github | LinkedIn >>>>>>>> > >>>>>>>> > 2018-02-01 9:40 GMT+01:00 Reuven Lax <re...@google.com>: >>>>>>>> >> >>>>>>>> >> Not sure we'll be able to replace them all. Things like guava >>>>>>>> Table and >>>>>>>> >> Multimap don't have great replacements in Java8. >>>>>>>> >> >>>>>>>> >> On Wed, Jan 31, 2018 at 10:11 PM, Jean-Baptiste Onofré < >>>>>>>> j...@nanthrax.net> >>>>>>>> >> wrote: >>>>>>>> >>> >>>>>>>> >>> +1, it was on my TODO for a while waiting the Java8 update. >>>>>>>> >>> >>>>>>>> >>> Regards >>>>>>>> >>> JB >>>>>>>> >>> >>>>>>>> >>> On 02/01/2018 06:56 AM, Romain Manni-Bucau wrote: >>>>>>>> >>> > Why not dropping guava for all beam codebase? With java 8 it >>>>>>>> is quite >>>>>>>> >>> > easy to do >>>>>>>> >>> > it and avoid a bunch of conflicts. Did it in 2 projects with >>>>>>>> quite a >>>>>>>> >>> > good result. >>>>>>>> >>> > >>>>>>>> >>> > Le 1 févr. 2018 06:50, "Lukasz Cwik" <lc...@google.com >>>>>>>> >>> > <mailto:lc...@google.com>> a écrit : >>>>>>>> >>> > >>>>>>>> >>> > Make sure to include the guava version in the artifact >>>>>>>> name so that >>>>>>>> >>> > we can >>>>>>>> >>> > have multiple vendored versions. >>>>>>>> >>> > >>>>>>>> >>> > On Wed, Jan 31, 2018 at 9:16 PM, Kenneth Knowles < >>>>>>>> k...@google.com >>>>>>>> >>> > <mailto:k...@google.com>> wrote: >>>>>>>> >>> > >>>>>>>> >>> > I didn't have time for this, but it just bit me. We >>>>>>>> definitely >>>>>>>> >>> > have >>>>>>>> >>> > Guava on the API surface of runner support code in >>>>>>>> ways that >>>>>>>> >>> > get >>>>>>>> >>> > incompatibly shaded. I will probably start "1a" by >>>>>>>> making a >>>>>>>> >>> > shaded >>>>>>>> >>> > library org.apache.beam:vendored-guava and starting >>>>>>>> to use it. >>>>>>>> >>> > It sounds >>>>>>>> >>> > like there is generally unanimous support for that >>>>>>>> much, >>>>>>>> >>> > anyhow. >>>>>>>> >>> > >>>>>>>> >>> > Kenn >>>>>>>> >>> > >>>>>>>> >>> > On Wed, Dec 13, 2017 at 8:31 AM, Aljoscha Krettek >>>>>>>> >>> > <aljos...@apache.org >>>>>>>> >>> > <mailto:aljos...@apache.org>> wrote: >>>>>>>> >>> > >>>>>>>> >>> > Thanks Ismaël for bringing up this discussion >>>>>>>> again! >>>>>>>> >>> > >>>>>>>> >>> > I would be in favour of 1) and more specifically >>>>>>>> of 1a) >>>>>>>> >>> > >>>>>>>> >>> > Aljoscha >>>>>>>> >>> > >>>>>>>> >>> > >>>>>>>> >>> >> On 12. Dec 2017, at 18:56, Lukasz Cwik < >>>>>>>> lc...@google.com >>>>>>>> >>> >> <mailto:lc...@google.com>> wrote: >>>>>>>> >>> >> >>>>>>>> >>> >> You can always run tests on post shaded >>>>>>>> artifacts instead >>>>>>>> >>> >> of the >>>>>>>> >>> >> compiled classes, it just requires us to change >>>>>>>> our maven >>>>>>>> >>> >> surefire >>>>>>>> >>> >> / gradle test configurations but it is true that >>>>>>>> most IDEs >>>>>>>> >>> >> would >>>>>>>> >>> >> behave better with a dependency jar unless you >>>>>>>> delegate >>>>>>>> >>> >> all the >>>>>>>> >>> >> build/test actions to the build system and then >>>>>>>> it won't >>>>>>>> >>> >> matter. >>>>>>>> >>> >> >>>>>>>> >>> >> On Mon, Dec 11, 2017 at 9:05 PM, Kenneth Knowles >>>>>>>> >>> >> <k...@google.com >>>>>>>> >>> >> <mailto:k...@google.com>> wrote: >>>>>>>> >>> >> >>>>>>>> >>> >> There's also, with additional overhead, >>>>>>>> >>> >> >>>>>>>> >>> >> 1a) A relocated and shipped package for each >>>>>>>> thing we >>>>>>>> >>> >> want to >>>>>>>> >>> >> relocate. I think this has also been tried >>>>>>>> outside >>>>>>>> >>> >> Beam... >>>>>>>> >>> >> >>>>>>>> >>> >> Pros: >>>>>>>> >>> >> * all the pros of 1) plus no bloat beyond >>>>>>>> what is >>>>>>>> >>> >> necessary >>>>>>>> >>> >> Cons: >>>>>>>> >>> >> * abandons whitelist approach for public >>>>>>>> deps, >>>>>>>> >>> >> reverting to >>>>>>>> >>> >> blacklist approach for trouble things like >>>>>>>> guava, so a >>>>>>>> >>> >> bit >>>>>>>> >>> >> less principled >>>>>>>> >>> >> >>>>>>>> >>> >> For both 1) and 1a) I would add: >>>>>>>> >>> >> >>>>>>>> >>> >> Pros: >>>>>>>> >>> >> * clearly readable dependency since code >>>>>>>> will `import >>>>>>>> >>> >> org.apache.beam.private.guava21` and IDEs >>>>>>>> will >>>>>>>> >>> >> understand it >>>>>>>> >>> >> is a distinct lilbrary >>>>>>>> >>> >> * can run tests on unpackaged classes, as >>>>>>>> long as deps >>>>>>>> >>> >> are >>>>>>>> >>> >> shaded or provided as jars >>>>>>>> >>> >> * no mysterious action at a distance from >>>>>>>> inherited >>>>>>>> >>> >> configuration >>>>>>>> >>> >> Cons: >>>>>>>> >>> >> * need to adjust imports >>>>>>>> >>> >> >>>>>>>> >>> >> Kenn >>>>>>>> >>> >> >>>>>>>> >>> >> On Mon, Dec 11, 2017 at 9:57 AM, Lukasz Cwik >>>>>>>> >>> >> <lc...@google.com >>>>>>>> >>> >> <mailto:lc...@google.com>> wrote: >>>>>>>> >>> >> >>>>>>>> >>> >> I would suggest that either we use: >>>>>>>> >>> >> 1) A common deps package containing >>>>>>>> shaded >>>>>>>> >>> >> dependencies >>>>>>>> >>> >> allows for >>>>>>>> >>> >> Pros >>>>>>>> >>> >> * doesn't require the user to build an >>>>>>>> uber jar >>>>>>>> >>> >> Risks >>>>>>>> >>> >> * dependencies package will keep growing >>>>>>>> even if >>>>>>>> >>> >> something >>>>>>>> >>> >> is or isn't needed by all of Apache Beam >>>>>>>> leading >>>>>>>> >>> >> to a >>>>>>>> >>> >> large jar anyways negating any space >>>>>>>> savings >>>>>>>> >>> >> >>>>>>>> >>> >> 2) Shade within each module to a common >>>>>>>> location >>>>>>>> >>> >> like >>>>>>>> >>> >> org.apache.beam.relocated.guava.... >>>>>>>> >>> >> Pros >>>>>>>> >>> >> * you only get the shaded dependencies >>>>>>>> of the >>>>>>>> >>> >> things that >>>>>>>> >>> >> are required >>>>>>>> >>> >> * its one less dependency for users to >>>>>>>> manage >>>>>>>> >>> >> Risks >>>>>>>> >>> >> * requires an uber jar to be built to >>>>>>>> get the >>>>>>>> >>> >> space >>>>>>>> >>> >> savings (either by a user or a >>>>>>>> distribution of >>>>>>>> >>> >> Apache >>>>>>>> >>> >> Beam) otherwise we negate any space >>>>>>>> savings. >>>>>>>> >>> >> >>>>>>>> >>> >> If we either use a common relocation >>>>>>>> scheme or a >>>>>>>> >>> >> dependencies jar then each relocation >>>>>>>> should >>>>>>>> >>> >> specifically >>>>>>>> >>> >> contain the version number of the >>>>>>>> package because >>>>>>>> >>> >> we would >>>>>>>> >>> >> like to allow for us to be using two >>>>>>>> different >>>>>>>> >>> >> versions of >>>>>>>> >>> >> the same library. >>>>>>>> >>> >> >>>>>>>> >>> >> For the common deps package approach, >>>>>>>> should we >>>>>>>> >>> >> check in >>>>>>>> >>> >> code where the imports contain the >>>>>>>> relocated >>>>>>>> >>> >> location >>>>>>>> >>> >> (e.g. import >>>>>>>> org.apache.beam.guava.20.0.com >>>>>>>> >>> >> >>>>>>>> >>> >> <http://org.apache.beam.guava.20.0.com/>.google.common.colle >>>>>>>> ct.ImmutableList)? >>>>>>>> >>> >> >>>>>>>> >>> >> >>>>>>>> >>> >> On Mon, Dec 11, 2017 at 8:47 AM, >>>>>>>> Jean-Baptiste >>>>>>>> >>> >> Onofré >>>>>>>> >>> >> <j...@nanthrax.net >>>>>>>> >>> >> <mailto:j...@nanthrax.net>> >>>>>>>> wrote: >>>>>>>> >>> >> >>>>>>>> >>> >> Thanks for bringing that back. >>>>>>>> >>> >> >>>>>>>> >>> >> Indeed guava is shaded in different >>>>>>>> uber-jar. >>>>>>>> >>> >> Maybe we >>>>>>>> >>> >> can have a common deps module that >>>>>>>> we include >>>>>>>> >>> >> once >>>>>>>> >>> >> (but the user will have to >>>>>>>> explicitly define >>>>>>>> >>> >> the dep) ? >>>>>>>> >>> >> >>>>>>>> >>> >> Basically, what do you propose for >>>>>>>> protobuf >>>>>>>> >>> >> (unfortunately, I don't see an >>>>>>>> obvious) ? >>>>>>>> >>> >> >>>>>>>> >>> >> Regards >>>>>>>> >>> >> JB >>>>>>>> >>> >> >>>>>>>> >>> >> >>>>>>>> >>> >> On 12/11/2017 05:35 PM, Ismaël Mejía >>>>>>>> wrote: >>>>>>>> >>> >> >>>>>>>> >>> >> Hello, I wanted to bring back >>>>>>>> this subject >>>>>>>> >>> >> because >>>>>>>> >>> >> I think we should >>>>>>>> >>> >> take action on this and at least >>>>>>>> first >>>>>>>> >>> >> have a >>>>>>>> >>> >> shaded version of guava. >>>>>>>> >>> >> I was playing with a toy project >>>>>>>> and I did >>>>>>>> >>> >> the >>>>>>>> >>> >> procedure we use to >>>>>>>> >>> >> submit jars to a Hadoop cluster >>>>>>>> via >>>>>>>> >>> >> Flink/Spark >>>>>>>> >>> >> which involves >>>>>>>> >>> >> creating an uber jar and I >>>>>>>> realized that >>>>>>>> >>> >> the size >>>>>>>> >>> >> of the jar was way >>>>>>>> >>> >> bigger than I expected, and the >>>>>>>> fact that >>>>>>>> >>> >> we shade >>>>>>>> >>> >> guava in every >>>>>>>> >>> >> module contributes to this. I >>>>>>>> found guava >>>>>>>> >>> >> shaded on: >>>>>>>> >>> >> >>>>>>>> >>> >> sdks/java/core >>>>>>>> >>> >> runners/core-construction-java >>>>>>>> >>> >> runners/core-java >>>>>>>> >>> >> model/job-management >>>>>>>> >>> >> runners/spark >>>>>>>> >>> >> sdks/java/io/hadoop-file-system >>>>>>>> >>> >> sdks/java/io/kafka >>>>>>>> >>> >> >>>>>>>> >>> >> This means at least 6 times more >>>>>>>> of the >>>>>>>> >>> >> size it >>>>>>>> >>> >> should which counts in >>>>>>>> >>> >> around 15MB more (2.4MB*6 deps) >>>>>>>> of extra >>>>>>>> >>> >> weight >>>>>>>> >>> >> that we can simply >>>>>>>> >>> >> reduce by using a shaded version >>>>>>>> of the >>>>>>>> >>> >> library. >>>>>>>> >>> >> >>>>>>>> >>> >> I add this point to the previous >>>>>>>> ones >>>>>>>> >>> >> mentioned >>>>>>>> >>> >> during the discussion >>>>>>>> >>> >> because this goes against the >>>>>>>> end user >>>>>>>> >>> >> experience >>>>>>>> >>> >> and affects us all >>>>>>>> >>> >> (devs/users). >>>>>>>> >>> >> >>>>>>>> >>> >> Another question is if we should >>>>>>>> shade >>>>>>>> >>> >> (and how) >>>>>>>> >>> >> protocol buffers >>>>>>>> >>> >> because now with the portability >>>>>>>> work we >>>>>>>> >>> >> are >>>>>>>> >>> >> exposing it closer to the >>>>>>>> >>> >> end users. I say this because I >>>>>>>> also found >>>>>>>> >>> >> an >>>>>>>> >>> >> issue while running a >>>>>>>> >>> >> job on YARN with the spark >>>>>>>> runner because >>>>>>>> >>> >> hadoop-common includes >>>>>>>> >>> >> protobuf-java 2 and I had to >>>>>>>> explicitly >>>>>>>> >>> >> provide >>>>>>>> >>> >> protocol-buffers 3 as >>>>>>>> >>> >> a dependency to be able to use >>>>>>>> triggers >>>>>>>> >>> >> (note the >>>>>>>> >>> >> Spark runner >>>>>>>> >>> >> translates them using some >>>>>>>> method from >>>>>>>> >>> >> runners/core-java). Since >>>>>>>> >>> >> hadoop-common is provided in the >>>>>>>> cluster >>>>>>>> >>> >> with the >>>>>>>> >>> >> older version of >>>>>>>> >>> >> protobuf, I am afraid that this >>>>>>>> will bite >>>>>>>> >>> >> us in >>>>>>>> >>> >> the future. >>>>>>>> >>> >> >>>>>>>> >>> >> Ismaël >>>>>>>> >>> >> >>>>>>>> >>> >> ps. There is already a JIRA for >>>>>>>> that >>>>>>>> >>> >> shading for >>>>>>>> >>> >> protobuf on >>>>>>>> >>> >> hadoop-common but this is not >>>>>>>> coming until >>>>>>>> >>> >> version >>>>>>>> >>> >> 3 is out. >>>>>>>> >>> >> >>>>>>>> >>> >> https://issues.apache.org/jira/browse/HADOOP-13136 >>>>>>>> >>> >> >>>>>>>> >>> >> <https://issues.apache.org/jira/browse/HADOOP-13136> >>>>>>>> >>> >> >>>>>>>> >>> >> ps2. Extra curious situation is >>>>>>>> to see >>>>>>>> >>> >> that the >>>>>>>> >>> >> dataflow-runner ends >>>>>>>> >>> >> up having guava shaded twice via >>>>>>>> its >>>>>>>> >>> >> shaded version on >>>>>>>> >>> >> core-construction-java. >>>>>>>> >>> >> >>>>>>>> >>> >> ps3. Of course this message >>>>>>>> means a >>>>>>>> >>> >> de-facto +1 at >>>>>>>> >>> >> least to do it for >>>>>>>> >>> >> guava and evaluate it for other >>>>>>>> libs. >>>>>>>> >>> >> >>>>>>>> >>> >> >>>>>>>> >>> >> On Tue, Oct 17, 2017 at 7:29 PM, >>>>>>>> Lukasz >>>>>>>> >>> >> Cwik >>>>>>>> >>> >> <lc...@google.com.invalid >>>>>>>> >>> >> <mailto:lc...@google.com.invalid>> >>>>>>>> wrote: >>>>>>>> >>> >> >>>>>>>> >>> >> An issue to call out is how >>>>>>>> to deal >>>>>>>> >>> >> with our >>>>>>>> >>> >> generated code (.avro and >>>>>>>> >>> >> .proto) as I don't believe >>>>>>>> those >>>>>>>> >>> >> plugins allow >>>>>>>> >>> >> you to generate code using a >>>>>>>> >>> >> shaded package prefix on >>>>>>>> imports. >>>>>>>> >>> >> >>>>>>>> >>> >> On Tue, Oct 17, 2017 at >>>>>>>> 10:28 AM, >>>>>>>> >>> >> Thomas Groh >>>>>>>> >>> >> <tg...@google.com.invalid >>>>>>>> >>> >> <mailto: >>>>>>>> tg...@google.com.invalid>> >>>>>>>> >>> >> wrote: >>>>>>>> >>> >> >>>>>>>> >>> >> +1 to the goal. I'm >>>>>>>> hugely in >>>>>>>> >>> >> favor of not >>>>>>>> >>> >> doing the same shading >>>>>>>> work >>>>>>>> >>> >> every time for >>>>>>>> dependencies we >>>>>>>> >>> >> know we'll use. >>>>>>>> >>> >> >>>>>>>> >>> >> This also means that if >>>>>>>> we end up >>>>>>>> >>> >> pulling >>>>>>>> >>> >> in transitive >>>>>>>> dependencies we >>>>>>>> >>> >> don't want in any >>>>>>>> particular >>>>>>>> >>> >> module we can >>>>>>>> >>> >> avoid having to adjust >>>>>>>> our >>>>>>>> >>> >> repackaging strategy for >>>>>>>> that >>>>>>>> >>> >> module - >>>>>>>> >>> >> which I have run into >>>>>>>> face-first >>>>>>>> >>> >> in >>>>>>>> >>> >> the past. >>>>>>>> >>> >> >>>>>>>> >>> >> On Tue, Oct 17, 2017 at >>>>>>>> 9:48 AM, >>>>>>>> >>> >> Kenneth >>>>>>>> >>> >> Knowles >>>>>>>> <k...@google.com.invalid >>>>>>>> >>> >> <mailto: >>>>>>>> k...@google.com.invalid>> >>>>>>>> >>> >> wrote: >>>>>>>> >>> >> >>>>>>>> >>> >> Hi all, >>>>>>>> >>> >> >>>>>>>> >>> >> Shading is a big >>>>>>>> part of how >>>>>>>> >>> >> we keep >>>>>>>> >>> >> our dependencies >>>>>>>> sane in Beam. >>>>>>>> >>> >> But >>>>>>>> >>> >> downsides: shading >>>>>>>> is super >>>>>>>> >>> >> slow, >>>>>>>> >>> >> causes massive jar >>>>>>>> bloat, and >>>>>>>> >>> >> kind of >>>>>>>> >>> >> hard to get right >>>>>>>> because >>>>>>>> >>> >> artifacts >>>>>>>> >>> >> and namespaces are >>>>>>>> not 1-to-1. >>>>>>>> >>> >> >>>>>>>> >>> >> I know that some >>>>>>>> communities >>>>>>>> >>> >> distribute their own >>>>>>>> shaded >>>>>>>> >>> >> distributions of >>>>>>>> >>> >> dependencies. I had >>>>>>>> a thought >>>>>>>> >>> >> about >>>>>>>> >>> >> doing something >>>>>>>> similar that I >>>>>>>> >>> >> wanted >>>>>>>> >>> >> to throw out there >>>>>>>> for people >>>>>>>> >>> >> to poke >>>>>>>> >>> >> holes in. >>>>>>>> >>> >> >>>>>>>> >>> >> To set the scene, >>>>>>>> here is how >>>>>>>> >>> >> I view >>>>>>>> >>> >> shading: >>>>>>>> >>> >> >>>>>>>> >>> >> - A module has >>>>>>>> public >>>>>>>> >>> >> dependencies >>>>>>>> >>> >> and private >>>>>>>> dependencies. >>>>>>>> >>> >> - Public deps are >>>>>>>> used for >>>>>>>> >>> >> data >>>>>>>> >>> >> interchange; users >>>>>>>> must share >>>>>>>> >>> >> these >>>>>>>> >>> >> >>>>>>>> >>> >> deps. >>>>>>>> >>> >> >>>>>>>> >>> >> - Private deps are >>>>>>>> just >>>>>>>> >>> >> functionality and >>>>>>>> can be >>>>>>>> >>> >> hidden (in >>>>>>>> >>> >> our case, >>>>>>>> >>> >> relocated + bundled) >>>>>>>> >>> >> - It isn't >>>>>>>> necessarily that >>>>>>>> >>> >> simple, >>>>>>>> >>> >> because public and >>>>>>>> private >>>>>>>> >>> >> deps >>>>>>>> >>> >> >>>>>>>> >>> >> might >>>>>>>> >>> >> >>>>>>>> >>> >> interact in >>>>>>>> higher-order ways >>>>>>>> >>> >> ("public" is >>>>>>>> contagious) >>>>>>>> >>> >> >>>>>>>> >>> >> Shading is an >>>>>>>> implementation >>>>>>>> >>> >> detail of >>>>>>>> >>> >> expressing these >>>>>>>> >>> >> characteristics. >>>>>>>> >>> >> >>>>>>>> >>> >> We >>>>>>>> >>> >> >>>>>>>> >>> >> use shading >>>>>>>> selectively >>>>>>>> >>> >> because of its >>>>>>>> >>> >> downsides I >>>>>>>> mentioned above. >>>>>>>> >>> >> >>>>>>>> >>> >> But what about this >>>>>>>> idea: >>>>>>>> >>> >> Introduce >>>>>>>> >>> >> shaded deps as a >>>>>>>> single >>>>>>>> >>> >> separate >>>>>>>> >>> >> artifact. >>>>>>>> >>> >> >>>>>>>> >>> >> - >>>>>>>> sdks/java/private-deps: >>>>>>>> >>> >> bundled >>>>>>>> >>> >> uber jar with >>>>>>>> relocated >>>>>>>> >>> >> versions of >>>>>>>> >>> >> everything we want >>>>>>>> to shade >>>>>>>> >>> >> >>>>>>>> >>> >> - sdks/java/core >>>>>>>> and >>>>>>>> >>> >> sdks/java/harness: no >>>>>>>> >>> >> relocation or >>>>>>>> >>> >> bundling - >>>>>>>> >>> >> depends on >>>>>>>> >>> >> >>>>>>>> `beam-sdks-java-private-deps` >>>>>>>> >>> >> and >>>>>>>> >>> >> imports like >>>>>>>> >>> >> >>>>>>>> >>> >> `org.apache.beam.sdk.private.com.google.common` >>>>>>>> >>> >> directly (this is >>>>>>>> what >>>>>>>> >>> >> they >>>>>>>> >>> >> are rewritten to >>>>>>>> >>> >> >>>>>>>> >>> >> Some benefits >>>>>>>> >>> >> >>>>>>>> >>> >> - much faster >>>>>>>> builds of >>>>>>>> >>> >> other modules >>>>>>>> >>> >> - only one shaded >>>>>>>> uber jar >>>>>>>> >>> >> - rare/no rebuilds >>>>>>>> of the >>>>>>>> >>> >> uber jar >>>>>>>> >>> >> - can use maven >>>>>>>> enforcer to >>>>>>>> >>> >> forbid >>>>>>>> >>> >> imports like >>>>>>>> com.google.common >>>>>>>> >>> >> - configuration >>>>>>>> all in one >>>>>>>> >>> >> place >>>>>>>> >>> >> - no automated >>>>>>>> rewriting of >>>>>>>> >>> >> our real >>>>>>>> >>> >> code, which has led >>>>>>>> to some >>>>>>>> >>> >> major >>>>>>>> >>> >> confusion >>>>>>>> >>> >> - easy to implement >>>>>>>> >>> >> incrementally >>>>>>>> >>> >> >>>>>>>> >>> >> Downsides: >>>>>>>> >>> >> >>>>>>>> >>> >> - plenty of effort >>>>>>>> work to >>>>>>>> >>> >> get there >>>>>>>> >>> >> - unclear how many >>>>>>>> different >>>>>>>> >>> >> such >>>>>>>> >>> >> deps modules we >>>>>>>> need; sharing >>>>>>>> >>> >> them >>>>>>>> >>> >> >>>>>>>> >>> >> could >>>>>>>> >>> >> >>>>>>>> >>> >> get weird >>>>>>>> >>> >> - if we hit a >>>>>>>> roadblock, we >>>>>>>> >>> >> will >>>>>>>> >>> >> have committed a lot >>>>>>>> of time >>>>>>>> >>> >> >>>>>>>> >>> >> Just something I was >>>>>>>> musing as >>>>>>>> >>> >> I spent >>>>>>>> >>> >> another evening >>>>>>>> waiting for >>>>>>>> >>> >> slow >>>>>>>> >>> >> builds to try to >>>>>>>> confirm >>>>>>>> >>> >> changes to >>>>>>>> >>> >> brittle poms. >>>>>>>> >>> >> >>>>>>>> >>> >> Kenn >>>>>>>> >>> >> >>>>>>>> >>> >> >>>>>>>> >>> >> >>>>>>>> >>> >> -- >>>>>>>> >>> >> Jean-Baptiste Onofré >>>>>>>> >>> >> jbono...@apache.org >>>>>>>> >>> >> <mailto:jbono...@apache.org> >>>>>>>> >>> >> http://blog.nanthrax.net >>>>>>>> >>> >> <http://blog.nanthrax.net/> >>>>>>>> >>> >> Talend - http://www.talend.com >>>>>>>> >>> >> <http://www.talend.com/> >>>>>>>> >>> >> >>>>>>>> >>> >> >>>>>>>> >>> >> >>>>>>>> >>> >> >>>>>>>> >>> > >>>>>>>> >>> > >>>>>>>> >>> > >>>>>>>> >>> >>>>>>>> >>> -- >>>>>>>> >>> Jean-Baptiste Onofré >>>>>>>> >>> jbono...@apache.org >>>>>>>> >>> http://blog.nanthrax.net >>>>>>>> >>> Talend - http://www.talend.com >>>>>>>> >> >>>>>>>> >> >>>>>>>> > >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >