Thanks for fixing the duplicate root issue, PR is merged. I cannot run RemoteExecutionTest from the same commit (Intellij v2018.1.6). Here are some of the class loading errors before I give up adding dependencies manually :(
java.lang.NoClassDefFoundError: org/apache/beam/vendor/grpc/v1/io/grpc/BindableService java.lang.NoClassDefFoundError: com/google/protobuf/ByteString java.lang.NoClassDefFoundError: org/apache/beam/vendor/sdk/v2/sdk/extensions/protobuf/ByteStringCoder Caused by: java.lang.ClassNotFoundException: net.bytebuddy.NamingStrategy Our Intellij results are all over the board, possibly due to different versions or other settings/plugins? It will be important to sort it out to make the contributor experience smoother. Thanks, Thomas On Wed, Jul 18, 2018 at 9:29 AM Lukasz Cwik <lc...@google.com> wrote: > Ismael, the SDK should perform the pipeline translation to proto because I > expect the flow to be: > User Code -> SDK -> Proto Translation -> Job API -> Runner > I don't expect "runners" to live within the users process anymore > (excluding the direct runner). There will be one portable "runner" and it > will be responsible for communicating with the job management APIs. It > shouldn't be called a runner but for backwards compatibility it will behave > like a runner does today. Flink/Spark/... will all live on the other side > of the job management API. > > Thomas, I can run RemoteExectuionTest from commit > ae2bebaf8b277e99840fa63f1b95d828f2093d16 without needing to modify the > project/module structure in Intellij. Adding the jars manually only helps > with code completion. > > https://github.com/apache/beam/pull/5977 works around the duplicate > content root issue in Intellij. I have also run into the -Werror issue > occasionally and don't know any fix or why it gets triggered as it doesn't > happen to me all the time. > > On Tue, Jul 17, 2018 at 7:01 PM Thomas Weise <t...@apache.org> wrote: > >> Thanks, the classpath order matters indeed. >> >> Still not able to run RemoteExecutionTest, but I was able to get the >> Flink portable test to work by adding the following to the *top* of the >> dependency list of *beam-runners-flink_2.11_test* >> >> >> vendor/sdks-java-extensions-protobuf/build/libs/beam-vendor-sdks-java-extensions-protobuf-2.6.0-SNAPSHOT.jar >> model/fn-execution/build/libs/beam-model-fn-execution-2.6.0-SNAPSHOT.jar >> >> >> On Tue, Jul 17, 2018 at 6:00 PM Ankur Goenka <goe...@google.com> wrote: >> >>> Yes, I am able to run it. >>> >>> For tests, you also need to add dependencies to >>> ":beam-runners-java-fn-execution/beam-runners-java-fn-execution_*test*" >>> module. >>> >>> Also, I only added >>> :beam-model-job-management-2.6.0-SNAPSHOT.jar >>> :beam-model-fn-execution-2.6.0-SNAPSHOT.jar >>> to the dependencies manually so not sure if you want to add >>> io.grpc:grpc-core:1.12.0 and com.google.protobuf:protobuf-java:3.5.1 to >>> the dependencies. >>> >>> Note, you need to move them up in the dependencies list. >>> >>> >>> On Tue, Jul 17, 2018 at 5:54 PM Thomas Weise <t...@apache.org> wrote: >>> >>>> Are you able to >>>> run org.apache.beam.runners.fnexecution.control.RemoteExecutionTest from >>>> within Intellij ? >>>> >>>> I can get the compile errors to disappear by adding >>>> beam-model-job-management-2.6.0-SNAPSHOT.jar, io.grpc:grpc-core:1.12.0 >>>> and com.google.protobuf:protobuf-java:3.5.1 >>>> >>>> Running the test still fails since other dependencies are missing. >>>> >>>> >>>> On Tue, Jul 17, 2018 at 4:02 PM Ankur Goenka <goe...@google.com> wrote: >>>> >>>>> For reference: >>>>> I was able to make intellij work with the master by doing following >>>>> steps >>>>> >>>>> 1. Remove module :beam:vendor-sdks-java-extensions-protobuf from >>>>> intellij. >>>>> 2. Adding >>>>> >>>>> :beam-model-fn-execution/build/libs/beam-model-fn-execution-2.6.0-SNAPSHOT.jar >>>>> and >>>>> :beam-model-job-management/build/libs/beam-model-job-management-2.6.0-SNAPSHOT.jar >>>>> to the appropriate modules at the top of the dependency list. >>>>> >>>>> >>>>> On Tue, Jul 17, 2018 at 2:29 PM Thomas Weise <t...@apache.org> wrote: >>>>> >>>>>> Adding the external jar in Intellij (2018.1) currently fails due to a >>>>>> duplicate source directory (sdks/java/extensions/protobuf/src/main/java). >>>>>> >>>>>> The build as such also fails, with: error: warnings found and >>>>>> -Werror specified >>>>>> >>>>>> Ismaël found removing >>>>>> https://github.com/apache/beam/blob/master/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L538 >>>>>> as workaround. >>>>>> >>>>>> >>>>>> On Thu, Jul 12, 2018 at 1:55 PM Ismaël Mejía <ieme...@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Seems reasonable, but why exactly may we need the model (or protobuf >>>>>>> related things) in the future in the SDK ? wasn’t it supposed to be >>>>>>> translated into the Pipeline proto representation via the runners >>>>>>> (and >>>>>>> in this case the dep reside in the runner side) ? >>>>>>> On Thu, Jul 12, 2018 at 2:50 AM Lukasz Cwik <lc...@google.com> >>>>>>> wrote: >>>>>>> > >>>>>>> > Got a fix[1] for Andrews issue which turned out to be a release >>>>>>> blocker since it broke performing the release. Also fixed several minor >>>>>>> things like javadoc that were wrong with the release. Solving it >>>>>>> allowed me >>>>>>> to do the publishing in parallel and cut the release time from 20+ mins >>>>>>> to >>>>>>> 8 mins on my machine. >>>>>>> > >>>>>>> > 1: https://github.com/apache/beam/pull/5936 >>>>>>> > >>>>>>> > On Wed, Jul 11, 2018 at 3:51 PM Andrew Pilloud < >>>>>>> apill...@google.com> wrote: >>>>>>> >> >>>>>>> >> We discussed this in person, sounds like my issue is known and >>>>>>> will be fixed shortly. I'm running builds with '-Ppublishing' because I >>>>>>> need to generate release artifacts for bundling the Beam SQL shell with >>>>>>> the >>>>>>> Google Cloud SDK. Hope to eventually just use the Beam release, but we >>>>>>> are >>>>>>> currently cutting a release off master every week to quickly iterate on >>>>>>> bug >>>>>>> fixes. >>>>>>> >> >>>>>>> >> Andrew >>>>>>> >> >>>>>>> >> On Wed, Jul 11, 2018 at 1:39 PM Lukasz Cwik <lc...@google.com> >>>>>>> wrote: >>>>>>> >>> >>>>>>> >>> Andrew, to my knowledge it seems as though your running into >>>>>>> BEAM-4744, is there a reason you need to specify -Ppublishing? >>>>>>> >>> >>>>>>> >>> No particular reason to using ByteString within ByteKey and >>>>>>> TextSource. Note that we currently do shade away protobuf in >>>>>>> sdks/java/core >>>>>>> so we could either migrate to using a vendored version or re-implement >>>>>>> the >>>>>>> functionality to not use ByteString. Note that sdks/java/core can now >>>>>>> dependend on the model/* classes and perform the Pipeline -> Proto >>>>>>> translation as this will be needed to support portability efforts so I >>>>>>> would prefer just migrating to use the vendored versions of the code. >>>>>>> Filed >>>>>>> BEAM-4766. >>>>>>> >>> >>>>>>> >>> As for the IO module, I was referring to the upstream >>>>>>> bigtable/bigquery/... libraries vended by Google. If they trimmed their >>>>>>> API >>>>>>> surface to not expose gRPC or protobuf, then we wouldn't have to worry >>>>>>> about having the shading logic within >>>>>>> sdks/java/io/google-cloud-platform. I >>>>>>> know that this will be impossible for some connectors without backwards >>>>>>> incompatible changes since they exposed protobuf on their API surface. I >>>>>>> know that Chamikara was looking to shade this away in the >>>>>>> sdks/java/io/google-cloud-platform but only had limited success in the >>>>>>> past. >>>>>>> >>> >>>>>>> >>> On Wed, Jul 11, 2018 at 1:14 PM Ismaël Mejía <ieme...@gmail.com> >>>>>>> wrote: >>>>>>> >>>> >>>>>>> >>>> This is great news in particular for runners (Spark) where the >>>>>>> leaking of some grpc subdependencies caused stability issues and >>>>>>> required >>>>>>> extra shading. Great ! >>>>>>> >>>> >>>>>>> >>>> About the other modules >>>>>>> >>>> >>>>>>> >>>> > Note, these are the following modules that still depend on >>>>>>> protobuf that are shaded away and could move to use a vendored variant >>>>>>> of >>>>>>> protobuf: >>>>>>> >>>> > * sdks/java/core >>>>>>> >>>> > * sdks/java/extensions/sql >>>>>>> >>>> >>>>>>> >>>> For sdks/java/core the dependency in protobuf seems to be >>>>>>> minor, from a quick look it seems that it is only used to import >>>>>>> ByteString >>>>>>> in two classes: ByteKey and TextSource so hopefully we can rewrite both >>>>>>> and >>>>>>> get rid of the dependency altogether (making core smaller which is >>>>>>> always a >>>>>>> win). >>>>>>> >>>> Can we fill a JIRA for this or do I miss other reasons to >>>>>>> depend on protobuf in core? >>>>>>> >>>> >>>>>>> >>>> For sdks/java/extensions/sql I don’t know if I am missing >>>>>>> something, but I don’t see any code use of protobuf and I doubt that >>>>>>> calcite uses protobuf so maybe it is there just because it was leaking >>>>>>> from >>>>>>> somewhere else in Beam, we should better check this first. >>>>>>> >>>> >>>>>>> >>>> > These modules expose protobuf because it is part of the API >>>>>>> surface: >>>>>>> >>>> > * sdks/java/extensions/protobuf >>>>>>> >>>> > * sdks/java/io/google-cloud-platform (I believe that gRPC >>>>>>> could be shaded here but preferrably the IO module would do it so we >>>>>>> wouldn't have this maintenance burden.) >>>>>>> >>>> >>>>>>> >>>> Can you please elaborate on ‘but preferrably the IO module >>>>>>> would do it so we wouldn't have this maintenance burden’. I remember >>>>>>> there >>>>>>> was an issue when running the examples in the spark runner examples >>>>>>> because >>>>>>> of sdks/java/io/google-cloud-platform leaking netty via gRPC (BEAM-3519) >>>>>>> [Note that this is hidden at this moment because of pure luck Spark >>>>>>> 2.3.x >>>>>>> and Beam are aligned on netty version but this can change in the future >>>>>>> so >>>>>>> hopefully this can be shaded/controlled]. >>>>>>> >>>> >>>>>>> >>>> On Wed, Jul 11, 2018 at 8:55 PM Andrew Pilloud < >>>>>>> apill...@google.com> wrote: >>>>>>> >>>>> >>>>>>> >>>>> This is really cool and should cut down our artifact size >>>>>>> significantly! Thanks Luke! >>>>>>> >>>>> >>>>>>> >>>>> I am running into one issue after this: builds with the >>>>>>> publishing flag no longer work. (We run './gradlew -Ppublishing >>>>>>> shadowJar' >>>>>>> to generate release artifacts for the Beam SQL shell.) I get a bunch of >>>>>>> errors like this: >>>>>>> >>>>> >>>>>>> >>>>> >>>>>>> model/job-management/build/generated/source/proto/main/java/org/apache/beam/model/jobmanagement/v1/JobApi.java:148: >>>>>>> error: no suitable method found for >>>>>>> readMessage(org.apache.beam.vendor.protobuf.v3.com.google.protobuf.Parser<Pipeline>,ExtensionRegistryLite) >>>>>>> >>>>> >>>>>>> >>>>> Is there something I need to change in my build? >>>>>>> >>>>> >>>>>>> >>>>> Andrew >>>>>>> >>>>> >>>>>>> >>>>> On Tue, Jul 10, 2018 at 2:10 PM Lukasz Cwik <lc...@google.com> >>>>>>> wrote: >>>>>>> >>>>>> >>>>>>> >>>>>> With the merge of PR #5594[1], we started shading all gRPC / >>>>>>> Protobuf dependencies within all the modules that depended on the >>>>>>> model/* >>>>>>> dependencies by vendoring them. The vendored versions are built and >>>>>>> packaged into the model jars (they should be separated out once I figure >>>>>>> out how to generate proto code using a shaded import path). Note that >>>>>>> this >>>>>>> cleaned up several issues where we were incorrectly built shaded jars >>>>>>> without repackaging in some locations or the shading process was >>>>>>> corrupting >>>>>>> the contents of some of the jars. >>>>>>> >>>>>> >>>>>>> >>>>>> Note that the majority of the code base (especially related >>>>>>> to portability) should be using imports under the >>>>>>> org.apache.beam.vendor.protobuf.v3 or org.apache.beam.vendor.grpc.v1 >>>>>>> paths. >>>>>>> I have yet to figure out a clean way to get Intellij to recognize these >>>>>>> vendored paths. My only solution so far has been to manually add one of >>>>>>> the >>>>>>> built model jars to the compile classpath of the module being worked on >>>>>>> in >>>>>>> Intellij as described here[2]. I would greatly appreciate some ideas on >>>>>>> how >>>>>>> to improve this integration because from a few attempts configuring the >>>>>>> intellij gradle pluglin scope sections didn't produce the result that I >>>>>>> was >>>>>>> expecting. >>>>>>> >>>>>> >>>>>>> >>>>>> I also added a simple test task >>>>>>> validateShadedJarDoesntLeakNonOrgApacheBeamClasses that validates the >>>>>>> shaded jar doesn't contain classes without repackaging which we should >>>>>>> apply to any module that performs shading to ensure that classes are >>>>>>> relocated and we don't accidentally expose stuff. I filed BEAM-4753[3] >>>>>>> to >>>>>>> this end. >>>>>>> >>>>>> >>>>>>> >>>>>> Note, these are the following modules that still depend on >>>>>>> protobuf that are shaded away and could move to use a vendored variant >>>>>>> of >>>>>>> protobuf: >>>>>>> >>>>>> * sdks/java/core >>>>>>> >>>>>> * sdks/java/extensions/sql >>>>>>> >>>>>> >>>>>>> >>>>>> These modules expose protobuf because it is part of the API >>>>>>> surface: >>>>>>> >>>>>> * sdks/java/extensions/protobuf >>>>>>> >>>>>> * sdks/java/io/google-cloud-platform (I believe that gRPC >>>>>>> could be shaded here but preferrably the IO module would do it so we >>>>>>> wouldn't have this maintenance burden.) >>>>>>> >>>>>> >>>>>>> >>>>>> 1: https://github.com/apache/beam/pull/5594 >>>>>>> >>>>>> 2: >>>>>>> https://stackoverflow.com/questions/1051640/correct-way-to-add-external-jars-lib-jar-to-an-intellij-idea-project >>>>>>> >>>>>> 3: https://issues.apache.org/jira/browse/BEAM-4753 >>>>>>> >>>>>>