FYI: The beam version just got updated today to 2.7-SNAPSHOT <https://github.com/apache/beam/commit/4c76dad8d5c1b6b31e287907bd42604341794f9a>. This require re-adding the dependency jars for 2.7-SNAPSHOT version to the projects.
On Wed, Jul 18, 2018 at 1:22 PM Lukasz Cwik <lc...@google.com> wrote: > Are you delegating your test runs to Gradle or using Intellij's test > runner? > > I have been delegating my test runs to Gradle and that has been working > well for me. The platform test runner fails due to the missing classes as > you mentioned. > > I also spent two hours trying to muck around with the Gradle Idea plugin > to get it to use the shaded jars instead of the output classes directly > without much success. Reached out to the Gradle community forum[1] to see > what they say. > > 1: > https://discuss.gradle.org/t/how-to-get-intellij-to-use-module-output-jars-instead-of-output-classes/27794 > > On Wed, Jul 18, 2018 at 11:06 AM Thomas Weise <t...@apache.org> wrote: > >> Thanks for fixing the duplicate root issue, PR is merged. >> >> I cannot run RemoteExecutionTest from the same commit (Intellij >> v2018.1.6). Here are some of the class loading errors before I give up >> adding dependencies manually :( >> >> java.lang.NoClassDefFoundError: >> org/apache/beam/vendor/grpc/v1/io/grpc/BindableService >> java.lang.NoClassDefFoundError: com/google/protobuf/ByteString >> java.lang.NoClassDefFoundError: >> org/apache/beam/vendor/sdk/v2/sdk/extensions/protobuf/ByteStringCoder >> Caused by: java.lang.ClassNotFoundException: net.bytebuddy.NamingStrategy >> >> Our Intellij results are all over the board, possibly due to different >> versions or other settings/plugins? It will be important to sort it out to >> make the contributor experience smoother. >> >> Thanks, >> Thomas >> >> >> On Wed, Jul 18, 2018 at 9:29 AM Lukasz Cwik <lc...@google.com> wrote: >> >>> Ismael, the SDK should perform the pipeline translation to proto because >>> I expect the flow to be: >>> User Code -> SDK -> Proto Translation -> Job API -> Runner >>> I don't expect "runners" to live within the users process anymore >>> (excluding the direct runner). There will be one portable "runner" and it >>> will be responsible for communicating with the job management APIs. It >>> shouldn't be called a runner but for backwards compatibility it will behave >>> like a runner does today. Flink/Spark/... will all live on the other side >>> of the job management API. >>> >>> Thomas, I can run RemoteExectuionTest from commit >>> ae2bebaf8b277e99840fa63f1b95d828f2093d16 without needing to modify the >>> project/module structure in Intellij. Adding the jars manually only helps >>> with code completion. >>> >>> https://github.com/apache/beam/pull/5977 works around the duplicate >>> content root issue in Intellij. I have also run into the -Werror issue >>> occasionally and don't know any fix or why it gets triggered as it doesn't >>> happen to me all the time. >>> >>> On Tue, Jul 17, 2018 at 7:01 PM Thomas Weise <t...@apache.org> wrote: >>> >>>> Thanks, the classpath order matters indeed. >>>> >>>> Still not able to run RemoteExecutionTest, but I was able to get the >>>> Flink portable test to work by adding the following to the *top* of >>>> the dependency list of *beam-runners-flink_2.11_test* >>>> >>>> >>>> vendor/sdks-java-extensions-protobuf/build/libs/beam-vendor-sdks-java-extensions-protobuf-2.6.0-SNAPSHOT.jar >>>> model/fn-execution/build/libs/beam-model-fn-execution-2.6.0-SNAPSHOT.jar >>>> >>>> >>>> On Tue, Jul 17, 2018 at 6:00 PM Ankur Goenka <goe...@google.com> wrote: >>>> >>>>> Yes, I am able to run it. >>>>> >>>>> For tests, you also need to add dependencies to >>>>> ":beam-runners-java-fn-execution/beam-runners-java-fn-execution_*test*" >>>>> module. >>>>> >>>>> Also, I only added >>>>> :beam-model-job-management-2.6.0-SNAPSHOT.jar >>>>> :beam-model-fn-execution-2.6.0-SNAPSHOT.jar >>>>> to the dependencies manually so not sure if you want to add >>>>> io.grpc:grpc-core:1.12.0 and com.google.protobuf:protobuf-java:3.5.1 >>>>> to the dependencies. >>>>> >>>>> Note, you need to move them up in the dependencies list. >>>>> >>>>> >>>>> On Tue, Jul 17, 2018 at 5:54 PM Thomas Weise <t...@apache.org> wrote: >>>>> >>>>>> Are you able to >>>>>> run org.apache.beam.runners.fnexecution.control.RemoteExecutionTest from >>>>>> within Intellij ? >>>>>> >>>>>> I can get the compile errors to disappear by adding >>>>>> beam-model-job-management-2.6.0-SNAPSHOT.jar, io.grpc:grpc-core:1.12.0 >>>>>> and com.google.protobuf:protobuf-java:3.5.1 >>>>>> >>>>>> Running the test still fails since other dependencies are missing. >>>>>> >>>>>> >>>>>> On Tue, Jul 17, 2018 at 4:02 PM Ankur Goenka <goe...@google.com> >>>>>> wrote: >>>>>> >>>>>>> For reference: >>>>>>> I was able to make intellij work with the master by doing following >>>>>>> steps >>>>>>> >>>>>>> 1. Remove module :beam:vendor-sdks-java-extensions-protobuf from >>>>>>> intellij. >>>>>>> 2. Adding >>>>>>> >>>>>>> :beam-model-fn-execution/build/libs/beam-model-fn-execution-2.6.0-SNAPSHOT.jar >>>>>>> and >>>>>>> :beam-model-job-management/build/libs/beam-model-job-management-2.6.0-SNAPSHOT.jar >>>>>>> to the appropriate modules at the top of the dependency list. >>>>>>> >>>>>>> >>>>>>> On Tue, Jul 17, 2018 at 2:29 PM Thomas Weise <t...@apache.org> wrote: >>>>>>> >>>>>>>> Adding the external jar in Intellij (2018.1) currently fails due to >>>>>>>> a duplicate source directory >>>>>>>> (sdks/java/extensions/protobuf/src/main/java). >>>>>>>> >>>>>>>> The build as such also fails, with: error: warnings found and >>>>>>>> -Werror specified >>>>>>>> >>>>>>>> Ismaël found removing >>>>>>>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L538 >>>>>>>> as workaround. >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Jul 12, 2018 at 1:55 PM Ismaël Mejía <ieme...@gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Seems reasonable, but why exactly may we need the model (or >>>>>>>>> protobuf >>>>>>>>> related things) in the future in the SDK ? wasn’t it supposed to be >>>>>>>>> translated into the Pipeline proto representation via the runners >>>>>>>>> (and >>>>>>>>> in this case the dep reside in the runner side) ? >>>>>>>>> On Thu, Jul 12, 2018 at 2:50 AM Lukasz Cwik <lc...@google.com> >>>>>>>>> wrote: >>>>>>>>> > >>>>>>>>> > Got a fix[1] for Andrews issue which turned out to be a release >>>>>>>>> blocker since it broke performing the release. Also fixed several >>>>>>>>> minor >>>>>>>>> things like javadoc that were wrong with the release. Solving it >>>>>>>>> allowed me >>>>>>>>> to do the publishing in parallel and cut the release time from 20+ >>>>>>>>> mins to >>>>>>>>> 8 mins on my machine. >>>>>>>>> > >>>>>>>>> > 1: https://github.com/apache/beam/pull/5936 >>>>>>>>> > >>>>>>>>> > On Wed, Jul 11, 2018 at 3:51 PM Andrew Pilloud < >>>>>>>>> apill...@google.com> wrote: >>>>>>>>> >> >>>>>>>>> >> We discussed this in person, sounds like my issue is known and >>>>>>>>> will be fixed shortly. I'm running builds with '-Ppublishing' because >>>>>>>>> I >>>>>>>>> need to generate release artifacts for bundling the Beam SQL shell >>>>>>>>> with the >>>>>>>>> Google Cloud SDK. Hope to eventually just use the Beam release, but >>>>>>>>> we are >>>>>>>>> currently cutting a release off master every week to quickly iterate >>>>>>>>> on bug >>>>>>>>> fixes. >>>>>>>>> >> >>>>>>>>> >> Andrew >>>>>>>>> >> >>>>>>>>> >> On Wed, Jul 11, 2018 at 1:39 PM Lukasz Cwik <lc...@google.com> >>>>>>>>> wrote: >>>>>>>>> >>> >>>>>>>>> >>> Andrew, to my knowledge it seems as though your running into >>>>>>>>> BEAM-4744, is there a reason you need to specify -Ppublishing? >>>>>>>>> >>> >>>>>>>>> >>> No particular reason to using ByteString within ByteKey and >>>>>>>>> TextSource. Note that we currently do shade away protobuf in >>>>>>>>> sdks/java/core >>>>>>>>> so we could either migrate to using a vendored version or >>>>>>>>> re-implement the >>>>>>>>> functionality to not use ByteString. Note that sdks/java/core can now >>>>>>>>> dependend on the model/* classes and perform the Pipeline -> Proto >>>>>>>>> translation as this will be needed to support portability efforts so I >>>>>>>>> would prefer just migrating to use the vendored versions of the code. >>>>>>>>> Filed >>>>>>>>> BEAM-4766. >>>>>>>>> >>> >>>>>>>>> >>> As for the IO module, I was referring to the upstream >>>>>>>>> bigtable/bigquery/... libraries vended by Google. If they trimmed >>>>>>>>> their API >>>>>>>>> surface to not expose gRPC or protobuf, then we wouldn't have to worry >>>>>>>>> about having the shading logic within >>>>>>>>> sdks/java/io/google-cloud-platform. I >>>>>>>>> know that this will be impossible for some connectors without >>>>>>>>> backwards >>>>>>>>> incompatible changes since they exposed protobuf on their API >>>>>>>>> surface. I >>>>>>>>> know that Chamikara was looking to shade this away in the >>>>>>>>> sdks/java/io/google-cloud-platform but only had limited success in >>>>>>>>> the past. >>>>>>>>> >>> >>>>>>>>> >>> On Wed, Jul 11, 2018 at 1:14 PM Ismaël Mejía < >>>>>>>>> ieme...@gmail.com> wrote: >>>>>>>>> >>>> >>>>>>>>> >>>> This is great news in particular for runners (Spark) where >>>>>>>>> the leaking of some grpc subdependencies caused stability issues and >>>>>>>>> required extra shading. Great ! >>>>>>>>> >>>> >>>>>>>>> >>>> About the other modules >>>>>>>>> >>>> >>>>>>>>> >>>> > Note, these are the following modules that still depend on >>>>>>>>> protobuf that are shaded away and could move to use a vendored >>>>>>>>> variant of >>>>>>>>> protobuf: >>>>>>>>> >>>> > * sdks/java/core >>>>>>>>> >>>> > * sdks/java/extensions/sql >>>>>>>>> >>>> >>>>>>>>> >>>> For sdks/java/core the dependency in protobuf seems to be >>>>>>>>> minor, from a quick look it seems that it is only used to import >>>>>>>>> ByteString >>>>>>>>> in two classes: ByteKey and TextSource so hopefully we can rewrite >>>>>>>>> both and >>>>>>>>> get rid of the dependency altogether (making core smaller which is >>>>>>>>> always a >>>>>>>>> win). >>>>>>>>> >>>> Can we fill a JIRA for this or do I miss other reasons to >>>>>>>>> depend on protobuf in core? >>>>>>>>> >>>> >>>>>>>>> >>>> For sdks/java/extensions/sql I don’t know if I am missing >>>>>>>>> something, but I don’t see any code use of protobuf and I doubt that >>>>>>>>> calcite uses protobuf so maybe it is there just because it was >>>>>>>>> leaking from >>>>>>>>> somewhere else in Beam, we should better check this first. >>>>>>>>> >>>> >>>>>>>>> >>>> > These modules expose protobuf because it is part of the API >>>>>>>>> surface: >>>>>>>>> >>>> > * sdks/java/extensions/protobuf >>>>>>>>> >>>> > * sdks/java/io/google-cloud-platform (I believe that gRPC >>>>>>>>> could be shaded here but preferrably the IO module would do it so we >>>>>>>>> wouldn't have this maintenance burden.) >>>>>>>>> >>>> >>>>>>>>> >>>> Can you please elaborate on ‘but preferrably the IO module >>>>>>>>> would do it so we wouldn't have this maintenance burden’. I remember >>>>>>>>> there >>>>>>>>> was an issue when running the examples in the spark runner examples >>>>>>>>> because >>>>>>>>> of sdks/java/io/google-cloud-platform leaking netty via gRPC >>>>>>>>> (BEAM-3519) >>>>>>>>> [Note that this is hidden at this moment because of pure luck Spark >>>>>>>>> 2.3.x >>>>>>>>> and Beam are aligned on netty version but this can change in the >>>>>>>>> future so >>>>>>>>> hopefully this can be shaded/controlled]. >>>>>>>>> >>>> >>>>>>>>> >>>> On Wed, Jul 11, 2018 at 8:55 PM Andrew Pilloud < >>>>>>>>> apill...@google.com> wrote: >>>>>>>>> >>>>> >>>>>>>>> >>>>> This is really cool and should cut down our artifact size >>>>>>>>> significantly! Thanks Luke! >>>>>>>>> >>>>> >>>>>>>>> >>>>> I am running into one issue after this: builds with the >>>>>>>>> publishing flag no longer work. (We run './gradlew -Ppublishing >>>>>>>>> shadowJar' >>>>>>>>> to generate release artifacts for the Beam SQL shell.) I get a bunch >>>>>>>>> of >>>>>>>>> errors like this: >>>>>>>>> >>>>> >>>>>>>>> >>>>> >>>>>>>>> model/job-management/build/generated/source/proto/main/java/org/apache/beam/model/jobmanagement/v1/JobApi.java:148: >>>>>>>>> error: no suitable method found for >>>>>>>>> readMessage(org.apache.beam.vendor.protobuf.v3.com.google.protobuf.Parser<Pipeline>,ExtensionRegistryLite) >>>>>>>>> >>>>> >>>>>>>>> >>>>> Is there something I need to change in my build? >>>>>>>>> >>>>> >>>>>>>>> >>>>> Andrew >>>>>>>>> >>>>> >>>>>>>>> >>>>> On Tue, Jul 10, 2018 at 2:10 PM Lukasz Cwik < >>>>>>>>> lc...@google.com> wrote: >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> With the merge of PR #5594[1], we started shading all gRPC >>>>>>>>> / Protobuf dependencies within all the modules that depended on the >>>>>>>>> model/* >>>>>>>>> dependencies by vendoring them. The vendored versions are built and >>>>>>>>> packaged into the model jars (they should be separated out once I >>>>>>>>> figure >>>>>>>>> out how to generate proto code using a shaded import path). Note that >>>>>>>>> this >>>>>>>>> cleaned up several issues where we were incorrectly built shaded jars >>>>>>>>> without repackaging in some locations or the shading process was >>>>>>>>> corrupting >>>>>>>>> the contents of some of the jars. >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> Note that the majority of the code base (especially related >>>>>>>>> to portability) should be using imports under the >>>>>>>>> org.apache.beam.vendor.protobuf.v3 or org.apache.beam.vendor.grpc.v1 >>>>>>>>> paths. >>>>>>>>> I have yet to figure out a clean way to get Intellij to recognize >>>>>>>>> these >>>>>>>>> vendored paths. My only solution so far has been to manually add one >>>>>>>>> of the >>>>>>>>> built model jars to the compile classpath of the module being worked >>>>>>>>> on in >>>>>>>>> Intellij as described here[2]. I would greatly appreciate some ideas >>>>>>>>> on how >>>>>>>>> to improve this integration because from a few attempts configuring >>>>>>>>> the >>>>>>>>> intellij gradle pluglin scope sections didn't produce the result that >>>>>>>>> I was >>>>>>>>> expecting. >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> I also added a simple test task >>>>>>>>> validateShadedJarDoesntLeakNonOrgApacheBeamClasses that validates the >>>>>>>>> shaded jar doesn't contain classes without repackaging which we should >>>>>>>>> apply to any module that performs shading to ensure that classes are >>>>>>>>> relocated and we don't accidentally expose stuff. I filed >>>>>>>>> BEAM-4753[3] to >>>>>>>>> this end. >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> Note, these are the following modules that still depend on >>>>>>>>> protobuf that are shaded away and could move to use a vendored >>>>>>>>> variant of >>>>>>>>> protobuf: >>>>>>>>> >>>>>> * sdks/java/core >>>>>>>>> >>>>>> * sdks/java/extensions/sql >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> These modules expose protobuf because it is part of the API >>>>>>>>> surface: >>>>>>>>> >>>>>> * sdks/java/extensions/protobuf >>>>>>>>> >>>>>> * sdks/java/io/google-cloud-platform (I believe that gRPC >>>>>>>>> could be shaded here but preferrably the IO module would do it so we >>>>>>>>> wouldn't have this maintenance burden.) >>>>>>>>> >>>>>> >>>>>>>>> >>>>>> 1: https://github.com/apache/beam/pull/5594 >>>>>>>>> >>>>>> 2: >>>>>>>>> https://stackoverflow.com/questions/1051640/correct-way-to-add-external-jars-lib-jar-to-an-intellij-idea-project >>>>>>>>> >>>>>> 3: https://issues.apache.org/jira/browse/BEAM-4753 >>>>>>>>> >>>>>>>>