Thanks for fixing the duplicate root issue, PR is merged.

I cannot run RemoteExecutionTest from the same commit (IntelliJ v2018.1.6).
Here are some of the class loading errors before I give up adding
dependencies manually :(

java.lang.NoClassDefFoundError:
org/apache/beam/vendor/grpc/v1/io/grpc/BindableService
java.lang.NoClassDefFoundError: com/google/protobuf/ByteString
java.lang.NoClassDefFoundError:
org/apache/beam/vendor/sdk/v2/sdk/extensions/protobuf/ByteStringCoder
Caused by: java.lang.ClassNotFoundException: net.bytebuddy.NamingStrategy

Our IntelliJ results are all over the map, possibly due to different
versions or other settings/plugins? It will be important to sort this out
to make the contributor experience smoother.

Thanks,
Thomas


On Wed, Jul 18, 2018 at 9:29 AM Lukasz Cwik <lc...@google.com> wrote:

> Ismael, the SDK should perform the pipeline translation to proto because I
> expect the flow to be:
> User Code -> SDK -> Proto Translation -> Job API -> Runner
> I don't expect "runners" to live within the users process anymore
> (excluding the direct runner). There will be one portable "runner" and it
> will be responsible for communicating with the job management APIs. It
> shouldn't be called a runner but for backwards compatibility it will behave
> like a runner does today. Flink/Spark/... will all live on the other side
> of the job management API.
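>
> For illustration, here is a rough sketch of what that SDK-side
> translation step could look like in code, assuming the
> PipelineTranslation helper from the runners-core-construction module
> (the class name TranslateToProtoExample is made up, and the exact API
> may differ):
>
> import org.apache.beam.model.pipeline.v1.RunnerApi;
> import org.apache.beam.runners.core.construction.PipelineTranslation;
> import org.apache.beam.sdk.Pipeline;
>
> public class TranslateToProtoExample {
>   public static void main(String[] args) {
>     // User code builds the pipeline through the SDK as usual.
>     Pipeline p = Pipeline.create();
>     // SDK-side translation step: Java Pipeline -> portable proto.
>     RunnerApi.Pipeline protoPipeline = PipelineTranslation.toProto(p);
>     // protoPipeline is what gets submitted through the Job API;
>     // Flink/Spark/... consume it on the other side of that API.
>     System.out.println(protoPipeline.getComponents().getTransformsCount());
>   }
> }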
>
> Thomas, I can run RemoteExecutionTest from commit
> ae2bebaf8b277e99840fa63f1b95d828f2093d16 without needing to modify the
> project/module structure in IntelliJ. Adding the jars manually only helps
> with code completion.
>
> https://github.com/apache/beam/pull/5977 works around the duplicate
> content root issue in IntelliJ. I have also run into the -Werror issue
> occasionally, but I don't know of a fix or why it gets triggered, as it
> doesn't happen to me consistently.
>
> On Tue, Jul 17, 2018 at 7:01 PM Thomas Weise <t...@apache.org> wrote:
>
>> Thanks, the classpath order matters indeed.
>>
>> I'm still not able to run RemoteExecutionTest, but I was able to get the
>> Flink portable test to work by adding the following to the *top* of the
>> dependency list of *beam-runners-flink_2.11_test*:
>>
>>
>> vendor/sdks-java-extensions-protobuf/build/libs/beam-vendor-sdks-java-extensions-protobuf-2.6.0-SNAPSHOT.jar
>> model/fn-execution/build/libs/beam-model-fn-execution-2.6.0-SNAPSHOT.jar
>>
>>
>> On Tue, Jul 17, 2018 at 6:00 PM Ankur Goenka <goe...@google.com> wrote:
>>
>>> Yes, I am able to run it.
>>>
>>> For tests, you also need to add dependencies to
>>> ":beam-runners-java-fn-execution/beam-runners-java-fn-execution_*test*"
>>> module.
>>>
>>> Also, I only added
>>> :beam-model-job-management-2.6.0-SNAPSHOT.jar
>>> :beam-model-fn-execution-2.6.0-SNAPSHOT.jar
>>> to the dependencies manually, so I am not sure whether you need to add
>>> io.grpc:grpc-core:1.12.0 and com.google.protobuf:protobuf-java:3.5.1 to
>>> the dependencies.
>>>
>>> Note, you need to move them up in the dependencies list.
>>>
>>>
>>> On Tue, Jul 17, 2018 at 5:54 PM Thomas Weise <t...@apache.org> wrote:
>>>
>>>> Are you able to
>>>> run org.apache.beam.runners.fnexecution.control.RemoteExecutionTest from
>>>> within IntelliJ?
>>>>
>>>> I can get the compile errors to disappear by adding
>>>> beam-model-job-management-2.6.0-SNAPSHOT.jar, io.grpc:grpc-core:1.12.0
>>>> and com.google.protobuf:protobuf-java:3.5.1.
>>>>
>>>> Running the test still fails since other dependencies are missing.
>>>>
>>>>
>>>> On Tue, Jul 17, 2018 at 4:02 PM Ankur Goenka <goe...@google.com> wrote:
>>>>
>>>>> For reference:
>>>>> I was able to make IntelliJ work with master by doing the following
>>>>> steps:
>>>>>
>>>>>    1. Remove the module :beam:vendor-sdks-java-extensions-protobuf from
>>>>>    IntelliJ.
>>>>>    2. Add
>>>>>    :beam-model-fn-execution/build/libs/beam-model-fn-execution-2.6.0-SNAPSHOT.jar
>>>>>    and
>>>>>    :beam-model-job-management/build/libs/beam-model-job-management-2.6.0-SNAPSHOT.jar
>>>>>    to the appropriate modules at the top of the dependency list.
>>>>>
>>>>>
>>>>> On Tue, Jul 17, 2018 at 2:29 PM Thomas Weise <t...@apache.org> wrote:
>>>>>
>>>>>> Adding the external jar in IntelliJ (2018.1) currently fails due to a
>>>>>> duplicate source directory (sdks/java/extensions/protobuf/src/main/java).
>>>>>>
>>>>>> The build itself also fails with: error: warnings found and
>>>>>> -Werror specified
>>>>>>
>>>>>> Ismaël found that removing
>>>>>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L538
>>>>>> works as a workaround.
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 12, 2018 at 1:55 PM Ismaël Mejía <ieme...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Seems reasonable, but why exactly might we need the model (or
>>>>>>> protobuf-related things) in the SDK in the future? Wasn't the pipeline
>>>>>>> supposed to be translated into the Pipeline proto representation by the
>>>>>>> runners (in which case the dependency would reside on the runner side)?
>>>>>>> On Thu, Jul 12, 2018 at 2:50 AM Lukasz Cwik <lc...@google.com>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Got a fix[1] for Andrew's issue, which turned out to be a release
>>>>>>> blocker since it broke performing the release. Also fixed several minor
>>>>>>> things that were wrong with the release, like the javadoc. Solving it
>>>>>>> allowed me to do the publishing in parallel and cut the release time
>>>>>>> from 20+ mins to 8 mins on my machine.
>>>>>>> >
>>>>>>> > 1: https://github.com/apache/beam/pull/5936
>>>>>>> >
>>>>>>> > On Wed, Jul 11, 2018 at 3:51 PM Andrew Pilloud <
>>>>>>> apill...@google.com> wrote:
>>>>>>> >>
>>>>>>> >> We discussed this in person; it sounds like my issue is known and
>>>>>>> will be fixed shortly. I'm running builds with '-Ppublishing' because I
>>>>>>> need to generate release artifacts for bundling the Beam SQL shell with
>>>>>>> the Google Cloud SDK. I hope to eventually just use the Beam release,
>>>>>>> but we are currently cutting a release off master every week to quickly
>>>>>>> iterate on bug fixes.
>>>>>>> >>
>>>>>>> >> Andrew
>>>>>>> >>
>>>>>>> >> On Wed, Jul 11, 2018 at 1:39 PM Lukasz Cwik <lc...@google.com>
>>>>>>> wrote:
>>>>>>> >>>
>>>>>>> >>> Andrew, to my knowledge it seems as though you're running into
>>>>>>> BEAM-4744. Is there a reason you need to specify -Ppublishing?
>>>>>>> >>>
>>>>>>> >>> No particular reason for using ByteString within ByteKey and
>>>>>>> TextSource. Note that we currently do shade away protobuf in
>>>>>>> sdks/java/core, so we could either migrate to using a vendored version
>>>>>>> or re-implement the functionality to not use ByteString. Note that
>>>>>>> sdks/java/core can now depend on the model/* classes and perform the
>>>>>>> Pipeline -> Proto translation, as this will be needed to support the
>>>>>>> portability efforts, so I would prefer just migrating to use the
>>>>>>> vendored versions of the code. Filed BEAM-4766.
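>>>>>>> >>>
>>>>>>> >>> For illustration only, a minimal sketch of what re-implementing the
>>>>>>> functionality without ByteString could mean for a ByteKey-style class:
>>>>>>> an immutable wrapper over byte[] with unsigned lexicographic comparison,
>>>>>>> built on plain java.nio/java.util. This is not the actual Beam ByteKey
>>>>>>> code and the class name is made up.
>>>>>>> >>>
>>>>>>> import java.nio.ByteBuffer;
>>>>>>> import java.util.Arrays;
>>>>>>>
>>>>>>> public final class ImmutableByteKey implements Comparable<ImmutableByteKey> {
>>>>>>>   private final byte[] bytes;
>>>>>>>
>>>>>>>   private ImmutableByteKey(byte[] bytes) {
>>>>>>>     this.bytes = bytes;
>>>>>>>   }
>>>>>>>
>>>>>>>   // Defensive copy keeps the key immutable, mirroring ByteString semantics.
>>>>>>>   public static ImmutableByteKey copyFrom(byte[] source) {
>>>>>>>     return new ImmutableByteKey(Arrays.copyOf(source, source.length));
>>>>>>>   }
>>>>>>>
>>>>>>>   public ByteBuffer asReadOnlyByteBuffer() {
>>>>>>>     return ByteBuffer.wrap(bytes).asReadOnlyBuffer();
>>>>>>>   }
>>>>>>>
>>>>>>>   @Override
>>>>>>>   public int compareTo(ImmutableByteKey other) {
>>>>>>>     // Unsigned lexicographic comparison of the underlying bytes.
>>>>>>>     int limit = Math.min(bytes.length, other.bytes.length);
>>>>>>>     for (int i = 0; i < limit; i++) {
>>>>>>>       int cmp = (bytes[i] & 0xff) - (other.bytes[i] & 0xff);
>>>>>>>       if (cmp != 0) {
>>>>>>>         return cmp;
>>>>>>>       }
>>>>>>>     }
>>>>>>>     return Integer.compare(bytes.length, other.bytes.length);
>>>>>>>   }
>>>>>>> }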
>>>>>>> >>>
>>>>>>> >>> As for the IO module, I was referring to the upstream
>>>>>>> bigtable/bigquery/... libraries vended by Google. If they trimmed their
>>>>>>> API surface to not expose gRPC or protobuf, then we wouldn't have to
>>>>>>> worry about having the shading logic within
>>>>>>> sdks/java/io/google-cloud-platform. I know that this will be impossible
>>>>>>> for some connectors without backwards incompatible changes since they
>>>>>>> exposed protobuf on their API surface. I know that Chamikara was looking
>>>>>>> to shade this away in the sdks/java/io/google-cloud-platform but only
>>>>>>> had limited success in the past.
>>>>>>> >>>
>>>>>>> >>> On Wed, Jul 11, 2018 at 1:14 PM Ismaël Mejía <ieme...@gmail.com>
>>>>>>> wrote:
>>>>>>> >>>>
>>>>>>> >>>> This is great news, in particular for runners (Spark) where the
>>>>>>> leaking of some gRPC subdependencies caused stability issues and
>>>>>>> required extra shading. Great!
>>>>>>> >>>>
>>>>>>> >>>> About the other modules
>>>>>>> >>>>
>>>>>>> >>>> > Note, the following modules still depend on protobuf that is
>>>>>>> shaded away and could move to use a vendored variant of protobuf:
>>>>>>> >>>> > * sdks/java/core
>>>>>>> >>>> > * sdks/java/extensions/sql
>>>>>>> >>>>
>>>>>>> >>>> For sdks/java/core the dependency on protobuf seems to be minor;
>>>>>>> from a quick look it seems that it is only used to import ByteString in
>>>>>>> two classes, ByteKey and TextSource, so hopefully we can rewrite both
>>>>>>> and get rid of the dependency altogether (making core smaller, which is
>>>>>>> always a win).
>>>>>>> >>>> Can we file a JIRA for this, or am I missing other reasons to
>>>>>>> depend on protobuf in core?
>>>>>>> >>>>
>>>>>>> >>>> For sdks/java/extensions/sql I don’t know if I am missing
>>>>>>> something, but I don’t see any use of protobuf in the code and I doubt
>>>>>>> that Calcite uses protobuf, so maybe it is there just because it was
>>>>>>> leaking from somewhere else in Beam; we should check this first.
>>>>>>> >>>>
>>>>>>> >>>> > These modules expose protobuf because it is part of the API
>>>>>>> surface:
>>>>>>> >>>> > * sdks/java/extensions/protobuf
>>>>>>> >>>> > * sdks/java/io/google-cloud-platform (I believe that gRPC
>>>>>>> could be shaded here but preferably the IO module would do it so we
>>>>>>> wouldn't have this maintenance burden.)
>>>>>>> >>>>
>>>>>>> >>>> Can you please elaborate on ‘but preferably the IO module
>>>>>>> would do it so we wouldn't have this maintenance burden’? I remember
>>>>>>> there was an issue when running the examples on the Spark runner because
>>>>>>> of sdks/java/io/google-cloud-platform leaking Netty via gRPC (BEAM-3519).
>>>>>>> [Note that this is hidden at the moment because, by pure luck, Spark
>>>>>>> 2.3.x and Beam are aligned on the Netty version, but this can change in
>>>>>>> the future, so hopefully this can be shaded/controlled.]
>>>>>>> >>>>
>>>>>>> >>>> On Wed, Jul 11, 2018 at 8:55 PM Andrew Pilloud <
>>>>>>> apill...@google.com> wrote:
>>>>>>> >>>>>
>>>>>>> >>>>> This is really cool and should cut down our artifact size
>>>>>>> significantly! Thanks Luke!
>>>>>>> >>>>>
>>>>>>> >>>>> I am running into one issue after this: builds with the
>>>>>>> publishing flag no longer work. (We run './gradlew -Ppublishing
>>>>>>> shadowJar' to generate release artifacts for the Beam SQL shell.) I get
>>>>>>> a bunch of errors like this:
>>>>>>> >>>>>
>>>>>>> >>>>>
>>>>>>> model/job-management/build/generated/source/proto/main/java/org/apache/beam/model/jobmanagement/v1/JobApi.java:148:
>>>>>>> error: no suitable method found for
>>>>>>> readMessage(org.apache.beam.vendor.protobuf.v3.com.google.protobuf.Parser<Pipeline>,ExtensionRegistryLite)
>>>>>>> >>>>>
>>>>>>> >>>>> Is there something I need to change in my build?
>>>>>>> >>>>>
>>>>>>> >>>>> Andrew
>>>>>>> >>>>>
>>>>>>> >>>>> On Tue, Jul 10, 2018 at 2:10 PM Lukasz Cwik <lc...@google.com>
>>>>>>> wrote:
>>>>>>> >>>>>>
>>>>>>> >>>>>> With the merge of PR #5594[1], we started shading all gRPC /
>>>>>>> Protobuf dependencies within all the modules that depended on the
>>>>>>> model/* dependencies by vendoring them. The vendored versions are built
>>>>>>> and packaged into the model jars (they should be separated out once I
>>>>>>> figure out how to generate proto code using a shaded import path). Note
>>>>>>> that this cleaned up several issues where we were incorrectly building
>>>>>>> shaded jars without repackaging in some locations, or where the shading
>>>>>>> process was corrupting the contents of some of the jars.
>>>>>>> >>>>>>
>>>>>>> >>>>>> Note that the majority of the code base (especially the parts
>>>>>>> related to portability) should be using imports under the
>>>>>>> org.apache.beam.vendor.protobuf.v3 or org.apache.beam.vendor.grpc.v1
>>>>>>> paths. I have yet to figure out a clean way to get IntelliJ to recognize
>>>>>>> these vendored paths. My only solution so far has been to manually add
>>>>>>> one of the built model jars to the compile classpath of the module being
>>>>>>> worked on in IntelliJ, as described here[2]. I would greatly appreciate
>>>>>>> some ideas on how to improve this integration, because from a few
>>>>>>> attempts, configuring the IntelliJ Gradle plugin scope sections didn't
>>>>>>> produce the result that I was expecting.
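>>>>>>> >>>>>>
>>>>>>> >>>>>> As a concrete (illustrative) example of what a vendored import
>>>>>>> looks like in code, using the relocated ByteString; the vendored package
>>>>>>> keeps the same protobuf API, just under a different path, and the class
>>>>>>> name below is made up:
>>>>>>> >>>>>>
>>>>>>> import org.apache.beam.vendor.protobuf.v3.com.google.protobuf.ByteString;
>>>>>>>
>>>>>>> public class VendoredImportExample {
>>>>>>>   public static void main(String[] args) {
>>>>>>>     // Same ByteString API as com.google.protobuf.ByteString, but the
>>>>>>>     // relocated package won't clash with a user's own protobuf version.
>>>>>>>     ByteString bytes = ByteString.copyFromUtf8("hello");
>>>>>>>     System.out.println(bytes.size());
>>>>>>>   }
>>>>>>> }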
>>>>>>> >>>>>>
>>>>>>> >>>>>> I also added a simple test task,
>>>>>>> validateShadedJarDoesntLeakNonOrgApacheBeamClasses, that validates the
>>>>>>> shaded jar doesn't contain classes without repackaging. We should apply
>>>>>>> it to any module that performs shading to ensure that classes are
>>>>>>> relocated and we don't accidentally expose anything. I filed
>>>>>>> BEAM-4753[3] to this end.
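>>>>>>> >>>>>>
>>>>>>> >>>>>> For anyone curious, a minimal sketch (not the actual Gradle task)
>>>>>>> of the kind of check that task performs: scan a shaded jar and fail if
>>>>>>> any .class entry lives outside org/apache/beam/. The class name here is
>>>>>>> made up for illustration.
>>>>>>> >>>>>>
>>>>>>> import java.io.IOException;
>>>>>>> import java.util.Enumeration;
>>>>>>> import java.util.jar.JarEntry;
>>>>>>> import java.util.jar.JarFile;
>>>>>>>
>>>>>>> public class ShadedJarCheck {
>>>>>>>   public static void main(String[] args) throws IOException {
>>>>>>>     try (JarFile jar = new JarFile(args[0])) {
>>>>>>>       Enumeration<JarEntry> entries = jar.entries();
>>>>>>>       while (entries.hasMoreElements()) {
>>>>>>>         String name = entries.nextElement().getName();
>>>>>>>         // Any class outside org/apache/beam/ was not relocated.
>>>>>>>         if (name.endsWith(".class") && !name.startsWith("org/apache/beam/")) {
>>>>>>>           throw new AssertionError("Leaked class without repackaging: " + name);
>>>>>>>         }
>>>>>>>       }
>>>>>>>     }
>>>>>>>   }
>>>>>>> }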
>>>>>>> >>>>>>
>>>>>>> >>>>>> Note, the following modules still depend on protobuf that is
>>>>>>> shaded away and could move to use a vendored variant of protobuf:
>>>>>>> >>>>>> * sdks/java/core
>>>>>>> >>>>>> * sdks/java/extensions/sql
>>>>>>> >>>>>>
>>>>>>> >>>>>> These modules expose protobuf because it is part of the API
>>>>>>> surface:
>>>>>>> >>>>>> * sdks/java/extensions/protobuf
>>>>>>> >>>>>> * sdks/java/io/google-cloud-platform (I believe that gRPC
>>>>>>> could be shaded here but preferably the IO module would do it so we
>>>>>>> wouldn't have this maintenance burden.)
>>>>>>> >>>>>>
>>>>>>> >>>>>> 1: https://github.com/apache/beam/pull/5594
>>>>>>> >>>>>> 2:
>>>>>>> https://stackoverflow.com/questions/1051640/correct-way-to-add-external-jars-lib-jar-to-an-intellij-idea-project
>>>>>>> >>>>>> 3: https://issues.apache.org/jira/browse/BEAM-4753
>>>>>>>
>>>>>>
