This is great news in particular for runners (Spark) where the leaking of
some grpc subdependencies caused stability issues and required extra
shading. Great !

About the other modules

> Note, these are the following modules that still depend on protobuf that
are shaded away and could move to use a vendored variant of protobuf:
> * sdks/java/core
> * sdks/java/extensions/sql

For sdks/java/core the dependency in protobuf seems to be minor, from a
quick look it seems that it is only used to import ByteString in two
classes: ByteKey and TextSource so hopefully we can rewrite both and get
rid of the dependency altogether (making core smaller which is always a
win).
Can we fill a JIRA for this or do I miss other reasons to depend on
protobuf in core?

For sdks/java/extensions/sql I don’t know if I am missing something, but I
don’t see any code use of protobuf and I doubt that calcite uses protobuf
so maybe it is there just because it was leaking from somewhere else in
Beam, we should better check this first.

> These modules expose protobuf because it is part of the API surface:
> * sdks/java/extensions/protobuf
> * sdks/java/io/google-cloud-platform (I believe that gRPC could be shaded
here but preferrably the IO module would do it so we wouldn't have this
maintenance burden.)

Can you please elaborate on ‘but preferrably the IO module would do it so
we wouldn't have this maintenance burden’. I remember there was an issue
when running the examples in the spark runner examples because of
sdks/java/io/google-cloud-platform leaking netty via gRPC (BEAM-3519) [Note
that this is hidden at this moment because of pure luck Spark 2.3.x and
Beam are aligned on netty version but this can change in the future so
hopefully this can be shaded/controlled].

On Wed, Jul 11, 2018 at 8:55 PM Andrew Pilloud <apill...@google.com> wrote:

> This is really cool and should cut down our artifact size significantly!
> Thanks Luke!
>
> I am running into one issue after this: builds with the publishing flag no
> longer work. (We run './gradlew -Ppublishing shadowJar' to generate release
> artifacts for the Beam SQL shell.) I get a bunch of errors like this:
>
> model/job-management/build/generated/source/proto/main/java/org/apache/beam/model/jobmanagement/v1/JobApi.java:148:
> error: no suitable method found for
> readMessage(org.apache.beam.vendor.protobuf.v3.com.google.protobuf.Parser<Pipeline>,ExtensionRegistryLite)
>
> Is there something I need to change in my build?
>
> Andrew
>
> On Tue, Jul 10, 2018 at 2:10 PM Lukasz Cwik <lc...@google.com> wrote:
>
>> With the merge of PR #5594[1], we started shading all gRPC / Protobuf
>> dependencies within all the modules that depended on the model/*
>> dependencies by vendoring them. The vendored versions are built and
>> packaged into the model jars (they should be separated out once I figure
>> out how to generate proto code using a shaded import path). Note that this
>> cleaned up several issues where we were incorrectly built shaded jars
>> without repackaging in some locations or the shading process was corrupting
>> the contents of some of the jars.
>>
>> Note that the majority of the code base (especially related to
>> portability) should be using imports under the
>> org.apache.beam.vendor.protobuf.v3 or org.apache.beam.vendor.grpc.v1 paths.
>> I have yet to figure out a clean way to get Intellij to recognize these
>> vendored paths. My only solution so far has been to manually add one of the
>> built model jars to the compile classpath of the module being worked on in
>> Intellij as described here[2]. I would greatly appreciate some ideas on how
>> to improve this integration because from a few attempts configuring the
>> intellij gradle pluglin scope sections didn't produce the result that I was
>> expecting.
>>
>> I also added a simple test task
>> *validateShadedJarDoesntLeakNonOrgApacheBeamClasses* that validates the
>> shaded jar doesn't contain classes without repackaging which we should
>> apply to any module that performs shading to ensure that classes are
>> relocated and we don't accidentally expose stuff. I filed BEAM-4753[3] to
>> this end.
>>
>> Note, these are the following modules that still depend on protobuf that
>> are shaded away and could move to use a vendored variant of protobuf:
>> * sdks/java/core
>> * sdks/java/extensions/sql
>>
>> These modules expose protobuf because it is part of the API surface:
>> * sdks/java/extensions/protobuf
>> * sdks/java/io/google-cloud-platform (I believe that gRPC could be shaded
>> here but preferrably the IO module would do it so we wouldn't have this
>> maintenance burden.)
>>
>> 1: https://github.com/apache/beam/pull/5594
>> 2:
>> https://stackoverflow.com/questions/1051640/correct-way-to-add-external-jars-lib-jar-to-an-intellij-idea-project
>> 3: https://issues.apache.org/jira/browse/BEAM-4753
>>
>

Reply via email to