Honestly, I dod not even consider Runner and Beam FileSystem dependencies
here, which will most likely also add some complexity to the problem here.
And there might be more.

On Thu, Mar 14, 2019 at 8:53 PM Kenneth Knowles <k...@apache.org> wrote:

> Thanks for the super-detailed explanation and illustration.
>
> I especially think you emphasized something we generally have not
> considered: the end-user pinning a different version of a client lib, and
> the fact that if we vendor the client lib they cannot change the version.
> To me, that makes vendoring client libs kind of a deal-breaker. So I think:
>
> a) yes, solved by vendored (already done)
> b) leaving the IO's version of Guava on the classpath seems the only
> solution, and end-user has to resolve
> c) good point that in the not-on-API-surface case also, we can't really
> tweak anything and we need the end-user to resolve
>
> So the main action item is to remove the top-level Guava version and to
> let IOs determine their Guava version, with resolution of diamond deps when
> needed in test configurations.
>
> Kenn
>
> On Wed, Mar 6, 2019 at 3:18 AM Michael Luckey <adude3...@gmail.com> wrote:
>
>> Unfortunately, I am unable to see, how option 2 could be done from beam
>> side? Or is it meant to delay to the end user?
>>
>> Let me try to elaborate:
>>
>> Assume user implements a pipeline with 2 different IOs, both depending on
>> incompatible versions of e.g. guava. Also, user code has a (direct or
>> transitive) dependency to
>> yet another incompatible version of guava. I ll try to get into more
>> details later, but for now assume
>>
>>      AIO   ----     ParDo    ----    ParDo   ----     Pardo   ----    BIO
>>        |                                          |
>>                     |
>>     guava-X.jar                       guava-Y.jar
>>  guava-Z.jar
>>
>> What currently happens (AFAIU) is during build we do pin the versions of
>> guava to some fixed version [1], which means AIO and BIO are compiled with
>> guava-20, which might or might not
>> be ABI compatible to guava-X and/or guava-Z. (This is the problem these 2
>> users encountered, right?)
>>
>> Now the user trying to run this pipeline has to decide, which version to
>> use. Whatever she chooses the might be incompatibilities, She could do so
>> with maven, and of course also with Gradle or any other build system
>> (hopefully).
>>
>> If we now replace guava.20 with the newest on build, lets say
>> guava-LATEST which happens to be a dependency of CIO, how would that change
>> the game. So is there anything beam could do about it?
>>
>> Now I try to understand those deps in more detail. We have those IOs, and
>> the user code thingy. I assume, beam code switched to vendors guava, so
>> nothing is exposed here, we only have issues on external interfaces as IOs.
>>
>> 1. The user code.
>> Whether guava is used directly or transitive by some used library
>> dependency, there is nothing beam can do about it. The user has to solve
>> this issue by herself. She will probably encounter exactly the same
>> problems as beam within the IOs,
>>
>> 2. IOs
>>
>> a: Beam IO code uses guava, no transitive deps. This is solved by
>> vendoring, right?
>>
>>    SomeIO <--- guava-20.jar           is replaced by
>>  SomeIO <--- vendored_guava_v20.ja
>>
>> b: Beam uses some client lib (e.g. some-client-core) which has a
>> dependency on guava but does not expose any guava class to the outside
>>
>>    SomeIO <---- some-client-lib <---- guava-X.jar
>>
>>  In this case beam 'could' relocate the client lib and 'solve' that guava
>> exposing on the class path, *but* we might be tight to a specific
>> version of the backend. as the user can not replace the 'some-client-core'
>> dependency with a different version (hoping there are no interop issues
>> with beams IO)
>> I guess, we are better off, not doing that? Which unfortuantely means
>> delaying version resolution to the end user, as transitive dependency will
>> end up on the end user runtime classpath and might cause diamond
>> dependency issues.
>>
>> c: Beam uses some client lib (e.g. bigquery-client-core) which has a
>> dependency on guava which does expose guava classes in its external API
>>
>>    SomeIO <---- some-client-lib <---- guava-X.jar
>>       |                                                      |
>>       -----------------------------------------------
>>
>> We could of course also try to shade here with above drawbacks, but this
>> might not even work as the client lib communicates with the backend and
>> might also expose guava there. So the best we could do is, as Kenn stated,
>> to ensure interoperability SomeIO with the client.
>>
>> So to my current understanding I second Kenn here. We should stop forcing
>> a specific version of (of course uncensored) guava and align on IO basis to
>> the requirements of dependencies. Still, on that lower (IO) level we might
>> need to force some version as already there we might have diamond deps on
>> guava. But I think we are better of not to force a beam global version here.
>>
>> Of course, that will force the user to handle the problem, but she needs
>> to do it anyway, and, unfortunately me also fails to see anything we could
>> do better here. (At least current forcing to v20 seems to worsen the
>> problem).
>>
>> [1]
>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L351
>>
>> On Tue, Mar 5, 2019 at 11:23 PM Gleb Kanterov <g...@spotify.com> wrote:
>>
>>> I agree with the points that Kenneth has raised, mainly:
>>>
>>> > In both of the above approaches, diamond dependency problems between
>>> IOs are quite possible.
>>>
>>> Option 1. With having more IO-s in Beam we would start hitting diamond
>>> dependency problem more often. Relocating dependencies will help, but for
>>> this, we should avoid exposing relocated classes to end-users of IO-s. I
>>> can't speak about everything, but in the case of BigtableIO, it only
>>> exposes proto-classes that aren't part of bigtable-client-core, and
>>> shouldn't be relocated.
>>>
>>> Option 2. Without relocation, every other IO can be potentially broken,
>>> and we can solve this problem on a case-by-case basis. In maven
>>> world situation becomes a little better with requireUpperBoundDeps [1] from
>>> maven-enforcer-plugin. I don't know if there is a similar solution for
>>> gradle.
>>>
>>> Option 3. There is a potential future solution for dependency conflicts
>>> between IO-s with Java 9 JPMS [2], however, it could take a while before we
>>> could use it due to compatibility issues.
>>>
>>> As a short term solution, option 2 seems the best, we could go through
>>> known conflicts and see if it's possible to resolve them, potentially
>>> looking into option 1 that would take much more time.
>>>
>>> [1]:
>>> https://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html
>>> [2]: https://en.wikipedia.org/wiki/Java_Platform_Module_System
>>>
>>>
>>> On Mon, Mar 4, 2019 at 4:45 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>>>
>>>> That looks interesting but I am not sure if I understand correctly,
>>>> isn't the problem that the system API (Bigtable, Cassandra, etc)
>>>> exposes guava related stuff? Or in other words, wouldn't the
>>>> transitivie version of guava leak anyway?
>>>> If it does not I am pretty interested on doing this to fix the
>>>> Cassandra IO from leaking too.
>>>> https://issues.apache.org/jira/browse/BEAM-5723
>>>>
>>>> On Thu, Feb 28, 2019 at 5:17 PM Kenneth Knowles <k...@apache.org>
>>>> wrote:
>>>> >
>>>> > If someone is using BigTableIO with bigtable-client-core then having
>>>> BigTableIO and bigtable-client-core both depend on Guava 26.0 is fine,
>>>> right? Specifically, a user of BigTableIO after
>>>> https://github.com/apache/beam/pull/7957 will still have non-vendored
>>>> Guava on the classpath due to the transitive deps of bigtable-client-core.
>>>> >
>>>> > In any case it seems very wrong for the Beam root project to manage
>>>> the version of Guava in BigTableIO since the whole point is to be
>>>> compatible with bigtable-client-core. Would it work to delete our pinned
>>>> Guava version [1] and chase down all the places it breaks, moving Guava
>>>> dependencies local to places where an IO or extension must use it for
>>>> interop? Then you don't need adapters.
>>>> >
>>>> > In both of the above approaches, diamond dependency problems between
>>>> IOs are quite possible.
>>>> >
>>>> > I don't know if we can do better. For example, producing a
>>>> bigtable-client-core where we have relocated Guava internally and using
>>>> that could really be an interop nightmare as things that look like the same
>>>> type would not be. Less likely to be broken would be bigtable-client-core
>>>> entirely relocated and vendored, but generally IO connectors exchange
>>>> objects with users and the users would have to use the relocated versions,
>>>> so that's gross.
>>>> >
>>>> > Kenn
>>>> >
>>>> > [1]
>>>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L353
>>>> >
>>>> >
>>>> > On Thu, Feb 28, 2019 at 2:29 AM Gleb Kanterov <g...@spotify.com>
>>>> wrote:
>>>> >>
>>>> >> For the past week, two independent people have asked me if I can
>>>> help with guava NoSuchMethodError in BigtableIO. It turns out we still have
>>>> a potential problem with dependencies that don't vendor guava, in this
>>>> case, it was bigtable-client-core that depends on guava-26.0. However, the
>>>> root cause of the classpath problem was in the usage of a deprecated method
>>>> from non-vendored guava in BigtableServiceClientImpl in the code path where
>>>> we integrate with bigtable client.
>>>> >>
>>>> >> I created apache/beam#7957 [1] to fix that. There few other IO-s
>>>> where we use non-vendored guava that we can fix using adapters.
>>>> >>
>>>> >> And there is an unknown number of conflicts between guava versions
>>>> in our dependencies that don't vendor it, that as I understand it, could be
>>>> fixed by relocating them, in a similar way we do for Calcite [2].
>>>> >>
>>>> >> [1]: https://github.com/apache/beam/pull/7957
>>>> >> [2]:
>>>> https://github.com/apache/beam/blob/61de62ecbe8658de866280a8976030a0cb877041/sdks/java/extensions/sql/build.gradle#L30-L39
>>>> >>
>>>> >> Gleb
>>>> >>
>>>> >> On Sun, Jan 20, 2019 at 11:43 AM Gleb Kanterov <g...@spotify.com>
>>>> wrote:
>>>> >>>
>>>> >>> I didn't look deep into it, but it seems we can put
>>>> .idea/codeInsightSettings.xml into our repository where we blacklist
>>>> packages from auto-import. See an example in
>>>> JetBrains/kotlin/.idea/codeInsightSettings.xml.
>>>> >>>
>>>> >>> On Sat, Jan 19, 2019 at 8:03 PM Reuven Lax <re...@google.com>
>>>> wrote:
>>>> >>>>
>>>> >>>> Bad IDEs automatically generate the wrong import. I think we need
>>>> to automatically prevent this, otherwise the bad imports will inevitably
>>>> slip back in.
>>>> >>>>
>>>> >>>> Reuven
>>>> >>>>
>>>> >>>> On Tue, Jan 15, 2019 at 2:54 AM Łukasz Gajowy <
>>>> lukasz.gaj...@gmail.com> wrote:
>>>> >>>>>
>>>> >>>>> Great news. Thanks all for this work!
>>>> >>>>>
>>>> >>>>> +1 to enforcing this on dependency level as Kenn suggested.
>>>> >>>>>
>>>> >>>>> Łukasz
>>>> >>>>>
>>>> >>>>> wt., 15 sty 2019 o 01:18 Kenneth Knowles <k...@apache.org>
>>>> napisał(a):
>>>> >>>>>>
>>>> >>>>>> We can enforce at the dependency level, since it is a compile
>>>> error. I think some IDEs and build tools may allow the compile-time
>>>> classpath to get polluted by transitive runtime deps, so protecting against
>>>> bad imports is also a good idea.
>>>> >>>>>>
>>>> >>>>>> Kenn
>>>> >>>>>>
>>>> >>>>>> On Mon, Jan 14, 2019 at 8:42 AM Ismaël Mejía <ieme...@gmail.com>
>>>> wrote:
>>>> >>>>>>>
>>>> >>>>>>> Not yet, we need to add that too, there are still some tasks to
>>>> be
>>>> >>>>>>> done like improve the contribution guide with this info, and
>>>> document
>>>> >>>>>>> how to  generate a src build artifact locally since I doubt we
>>>> can
>>>> >>>>>>> publish that into Apache for copyright reasons.
>>>> >>>>>>> I will message in the future for awareness for awareness when
>>>> most of
>>>> >>>>>>> the pending tasks are finished.
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> On Mon, Jan 14, 2019 at 3:51 PM Maximilian Michels <
>>>> m...@apache.org> wrote:
>>>> >>>>>>> >
>>>> >>>>>>> > Thanks for the heads up, Ismaël! Great to see the vendored
>>>> Guava version is used
>>>> >>>>>>> > everywhere now.
>>>> >>>>>>> >
>>>> >>>>>>> > Do we already have a Checkstyle rule that prevents people
>>>> from using the
>>>> >>>>>>> > unvendored Guava? If not, such a rule could be useful because
>>>> it's almost
>>>> >>>>>>> > inevitable that the unvedored Guava will slip back in.
>>>> >>>>>>> >
>>>> >>>>>>> > Cheers,
>>>> >>>>>>> > Max
>>>> >>>>>>> >
>>>> >>>>>>> > On 14.01.19 05:55, Ismaël Mejía wrote:
>>>> >>>>>>> > > We merged today the PR [1] that changes most of the code to
>>>> use our
>>>> >>>>>>> > > new guava vendored dependency. In practice it means that
>>>> most of the
>>>> >>>>>>> > > imports of the classes were changed from
>>>> `com.google.common.` to
>>>> >>>>>>> > > `org.apache.beam.vendor.guava.v20_0.com.google.common.`
>>>> >>>>>>> > >
>>>> >>>>>>> > > This is a great improvement to fix a long existing problem
>>>> of guava
>>>> >>>>>>> > > leaking through some Beam modules. This also reduces the
>>>> size of most
>>>> >>>>>>> > > jars in the project because they don't need to relocate and
>>>> include
>>>> >>>>>>> > > guava anymore, they just use the vendored dependency.
>>>> >>>>>>> > >
>>>> >>>>>>> > > Kudos to Kenn Knowles, Lukasz Cwik, Scott Wegner and the
>>>> others that
>>>> >>>>>>> > > worked (are working) to make this possible.
>>>> >>>>>>> > >
>>>> >>>>>>> > > Sadly as a side effect of the merge of this PR multiple PRs
>>>> were
>>>> >>>>>>> > > broken so please review if yours was and do a rebase and
>>>> fix the
>>>> >>>>>>> > > imports to use the new vendored dependency. Sorry for the
>>>> >>>>>>> > > inconvenience. From now one all uses of guava should use
>>>> the vendored
>>>> >>>>>>> > > version. Expect some updates in the docs.
>>>> >>>>>>> > >
>>>> >>>>>>> > > [1]  https://github.com/apache/beam/pull/6809
>>>> >>>>>>> > >
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> --
>>>> >>> Cheers,
>>>> >>> Gleb
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Cheers,
>>>> >> Gleb
>>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Gleb
>>>
>>

Reply via email to