On Tue, Apr 23, 2019 at 2:07 AM Robert Bradshaw <rober...@google.com> wrote:
> I've been out, so coming a bit late to the discussion, but here are my thoughts.
>
> The expansion service absolutely needs to be able to provide the dependencies for the transform(s) it expands. It seems the default, foolproof way of doing this is via the environment, which can be a docker image with all the required dependencies. More than this is an (arguably important, but possibly messy) optimization.
>
> The standard way to provide artifacts outside of the environment is via the artifact staging service. Of course, the expansion service may not have access to the (final) artifact staging service (due to permissions, locality, or it may not even be started up yet), but the SDK invoking the expansion service could offer an artifact staging environment for the expansion service to publish artifacts to. However, there are some difficulties here, in particular avoiding name collisions with staged artifacts and assigning semantic meaning to the artifacts (e.g. should jar files get automatically placed on the classpath, or Python packages be recognized and installed at startup?). The alternative is going with a (type, pointer) scheme for naming dependencies; if we go this route, I think we should consider migrating all artifact staging to this style. I am concerned that the "file" version will be less than useful for what will become the most convenient expansion services (namely, hosted and docker image). I am still at a loss, however, as to how to solve the diamond dependency problem among dependencies--perhaps the information is there if one walks maven/pypi/go modules/..., but do we expect every runner to know about every packaging platform? This also wouldn't solve the issue if fat jars are used as dependencies. The only safe thing to do here is to force distinct dependency sets to live in different environments, which could be too conservative.
>
> This all leads me to think that perhaps the environment itself should be a docker image (often one of the "vanilla" beam-java-x.y ones) + dependency list, rather than have the dependency/artifact list as some kind of data off to the side. In this case, the runner would (as requested by its configuration) be free to merge environments it deemed compatible, including swapping out beam-java-X for beam-java-embedded if it considers itself compatible with the dependency list.

I like this idea of building multiple docker environments on top of a bare-minimum SDK harness container and allowing runners to pick a suitable one based on a dependency list.

> I agree with Thomas that we'll want to make expansion services, and the transforms they offer, more discoverable. The whole life cycle of expansion services is something that has yet to be fully fleshed out, and may influence some of these decisions.
>
> As for adding --jar_package to the Python SDK, this seems really specific to calling java-from-python (would we have O(n^2) such options?) as well as out of place for a Python user to specify. I would really hope we can figure out a more generic solution. If we need this option in the meantime, let's at least make it clear (probably in the name) that it's temporary.

Good points. I second that we need a more generic solution than a Python-to-Java-specific option. Instead of naming it differently, I think we can make --jar_package a secondary option under --experiment in the meantime. WDYT?
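To make the "docker image + dependency list" idea concrete, here is a minimal Python sketch of how a runner might merge environments it deems compatible. The field and function names are hypothetical, not Beam's actual environment proto:

    from dataclasses import dataclass, field
    from typing import FrozenSet

    @dataclass(frozen=True)
    class Environment:
        # Base container image, e.g. a "vanilla" beam-java-x.y one.
        docker_image: str
        # Dependencies layered on top of the base image.
        dependencies: FrozenSet[str] = field(default_factory=frozenset)

    def maybe_merge(a: Environment, b: Environment) -> Environment:
        """Merge two environments if the runner deems them compatible."""
        if a.docker_image != b.docker_image:
            # A smarter runner could also swap beam-java-X for
            # beam-java-embedded here if it trusts the dependency list.
            raise ValueError("incompatible base images; keep environments separate")
        # A real runner would also check the union of dependency lists for
        # version conflicts before merging (the diamond-dependency concern).
        return Environment(a.docker_image, a.dependencies | b.dependencies)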
> On Tue, Apr 23, 2019 at 1:08 AM Thomas Weise <t...@apache.org> wrote:
> >
> > One more suggestion:
> >
> > It would be nice to be able to select the environment for the external transforms. For example, I would like to be able to use EMBEDDED for Flink. That's implicit for sources which are runner-native unbounded read translations, but it should also be possible for writes. That would then be similar to how pipelines are packaged and run with the "legacy" runner.
> >
> > Thomas
> >
> > On Mon, Apr 22, 2019 at 1:18 PM Ankur Goenka <goe...@google.com> wrote:
> >>
> >> Great discussion!
> >> I have a few points around the structure of the proto, but that is less important as it can evolve.
> >> However, I think that artifact compatibility is another important aspect to look at.
> >> Example: TransformA uses Guava 1.6><1.7, TransformB uses 1.8><1.9, and TransformC uses 1.6><1.8. As the SDK provides the environment for each transform, it cannot simply say EnvironmentJava for both TransformA and TransformB, as the dependencies are not compatible. We should have separate environments associated with TransformA and TransformB in this case.
> >>
> >> To support this case, we need 2 things:
> >> 1: Granular metadata about the dependency, including type.
> >> 2: A complete list of the transforms to be expanded.
> >>
> >> Elaboration:
> >> The compatibility check can be done in a crude way if we provide all the metadata about the dependency to the expansion service.
> >> Also, the expansion service should expand all the applicable transforms in a single call so that it knows about incompatibilities and can create separate environments for these transforms. So in the above example, the expansion service will associate EnvA with TransformA, EnvB with TransformB, and EnvA with TransformC. This will of course require changes to the Expansion service proto, but giving all the information to the expansion service will make it support more cases and make it a bit more future proof.
> >>
> >> On Mon, Apr 22, 2019 at 10:16 AM Maximilian Michels <m...@apache.org> wrote:
> >>>
> >>> Thanks for the summary Cham. All makes sense. I agree that we want to keep the option to manually specify artifacts.
> >>>
> >>> > There are a few unanswered questions though.
> >>> > (1) In what form will a transform author specify dependencies? For example, URL to a Maven repo, URL to a local file, blob?
> >>>
> >>> Going forward, we probably want to support multiple ways. For now, we could stick with a URL-based approach with support for different file systems. In the future a list of packages to retrieve from Maven/PyPi would be useful.
> >>>
> >> We can ask the user for (type, metadata). For Maven it can be something like (MAVEN, {groupId: com.google.guava, artifactId: guava, version: 19}) or (FILE, file://myfile).
> >> To begin with, we can support only a few types like FILE and add more types in the future.
> >>>
> >>> > (2) How will dependencies be included in the expansion response proto? String (URL), bytes (blob)?
> >>>
> >>> I'd go for a list of Protobuf strings first, but the format would have to evolve for other dependency types.
> >>>
> >> Here also (type, payload) should suffice. We can have an interpreter for each type to translate the payload.
> >>>
> >>> > (3) How will we manage/share transitive dependencies required at runtime?
> >>>
> >>> I'd say transitive dependencies have to be included in the list. In case of fat jars, they are reduced to a single jar.
> >>
> >> Makes sense.
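As an illustration of the (type, metadata) descriptors and the "crude" compatibility check discussed above, a small hedged Python sketch; the tuple layout and the exclusive version ranges used for Ankur's Guava example are made up for illustration:

    # Typed dependency descriptors, as in the (type, metadata) idea above.
    maven_dep = ("MAVEN", {"groupId": "com.google.guava",
                           "artifactId": "guava", "version": "19"})
    file_dep = ("FILE", {"url": "file://myfile"})

    def ranges_overlap(r1, r2):
        # Each range is (lo, hi) with exclusive bounds, e.g. Guava >1.6 <1.7.
        return max(r1[0], r2[0]) < min(r1[1], r2[1])

    # TransformA: (1.6, 1.7), TransformB: (1.8, 1.9), TransformC: (1.6, 1.8).
    assert ranges_overlap((1.6, 1.7), (1.6, 1.8))      # A and C can share EnvA
    assert not ranges_overlap((1.6, 1.7), (1.8, 1.9))  # A and B need separate envs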
> >>>
> >>> > (4) How will dependencies be staged for various runner/SDK combinations? (for example, portable runner/Flink, Dataflow runner)
> >>>
> >>> Staging should be no different than it is now, i.e. go through Beam's artifact staging service. As long as the protocol is stable, there could also be different implementations.
> >>
> >> Makes sense.
> >>>
> >>> -Max
> >>>
> >>> On 20.04.19 03:08, Chamikara Jayalath wrote:
> >>> > OK, sounds like this is a good path forward then.
> >>> >
> >>> > * When starting up the expansion service, the user (who starts up the service) provides the dependencies necessary to expand transforms. We will later add support for adding new transforms to an already running expansion service.
> >>> > * As a part of transform configuration, the transform author has the option of providing a list of dependencies that will be needed to run the transform.
> >>> > * These dependencies will be sent back to the pipeline SDK as a part of the expansion response, and the pipeline SDK will stage these resources.
> >>> > * The pipeline author has the option of specifying the dependencies using a pipeline option. (for example, https://github.com/apache/beam/pull/8340)
> >>> >
> >>> > I think the last option is important to (1) make existing transforms easily available for cross-language usage without additional configuration and (2) allow pipeline authors to override dependency versions specified in the transform configuration (for example, to apply security patches) without updating the expansion service.
> >>> >
> >>> > There are a few unanswered questions though.
> >>> > (1) In what form will a transform author specify dependencies? For example, URL to a Maven repo, URL to a local file, blob?
> >>> > (2) How will dependencies be included in the expansion response proto? String (URL), bytes (blob)?
> >>> > (3) How will we manage/share transitive dependencies required at runtime?
> >>> > (4) How will dependencies be staged for various runner/SDK combinations? (for example, portable runner/Flink, Dataflow runner)
> >>> >
> >>> > Thanks,
> >>> > Cham
> >>> >
> >>> > On Fri, Apr 19, 2019 at 4:49 AM Maximilian Michels <m...@apache.org> wrote:
> >>> >
> >>> >     Thank you for your replies.
> >>> >
> >>> >     I did not suggest that the Expansion Service does the staging, but it would return the required resources (e.g. jars) for the external transform's runtime environment. The client then has to take care of staging the resources.
> >>> >
> >>> >     The Expansion Service itself also needs resources to do the expansion. I assumed those to be provided when starting the expansion service. I consider it less important, but we could also provide a way to add new transforms to the Expansion Service after startup.
> >>> >
> >>> >     Good point on Docker vs externally provided environments. For the PR [1] it will suffice then to add Kafka to the container dependencies. The "--jar_package" pipeline option is ok for now, but I'd like to see work towards staging resources for external transforms via information returned by the Expansion Service. That avoids users having to take care of including the correct jars in their pipeline options.
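A minimal sketch of the client-side half of this plan, assuming invented names throughout: the pipeline SDK receives (type, payload) dependency entries in the expansion response and hands each to the artifact staging service via a per-type interpreter:

    def stage_file(payload, stager):
        # Stage a file-type dependency directly from its URL.
        stager.stage(payload["url"])

    def stage_maven(payload, stager):
        # A real implementation would resolve the artifact (and its
        # transitive dependencies) from a Maven repository first.
        raise NotImplementedError("Maven resolution not sketched here")

    # One interpreter per dependency type, as suggested above.
    INTERPRETERS = {"FILE": stage_file, "MAVEN": stage_maven}

    def stage_expansion_dependencies(dependencies, stager):
        # 'dependencies' is the (type, payload) list returned in the
        # expansion response; the pipeline SDK stages each entry.
        for dep_type, payload in dependencies:
            INTERPRETERS[dep_type](payload, stager)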
> >>> >
> >>> >     These issues are related and we could discuss them in separate threads:
> >>> >
> >>> >     * Auto-discovery of Expansion Service and its external transforms
> >>> >     * Credentials required during expansion / runtime
> >>> >
> >>> >     Thanks,
> >>> >     Max
> >>> >
> >>> >     [1] https://github.com/apache/beam/pull/8322
> >>> >
> >>> >     On 19.04.19 07:35, Thomas Weise wrote:
> >>> > > Good discussion :)
> >>> > >
> >>> > > Initially the expansion service was considered a user responsibility, but I think that isn't necessarily the case. I can also see the expansion service provided as part of the infrastructure and the user not wanting to deal with it at all. For example, users may want to write Python transforms and use external IOs, without being concerned how these IOs are provided. Under such a scenario it would be good if:
> >>> > >
> >>> > > * Expansion service(s) can be auto-discovered via the job service endpoint
> >>> > > * Available external transforms can be discovered via the expansion service(s)
> >>> > > * Dependencies for external transforms are part of the metadata returned by the expansion service
> >>> > >
> >>> > > Dependencies could then be staged either by the SDK client or the expansion service. The expansion service could provide the locations to stage to the SDK; it would still be transparent to the user.
> >>> > >
> >>> > > I also agree with Luke regarding the environments. Docker is the choice for generic deployment. Other environments are used when the flexibility offered by Docker isn't needed (or gets in the way). Then the dependencies are provided in different ways. Whether these are Python packages or jar files, by opting out of Docker the decision is made to manage dependencies externally.
> >>> > >
> >>> > > Thomas
> >>> > >
> >>> > > On Thu, Apr 18, 2019 at 6:01 PM Chamikara Jayalath <chamik...@google.com> wrote:
> >>> > >
> >>> > >     On Thu, Apr 18, 2019 at 5:21 PM Chamikara Jayalath <chamik...@google.com> wrote:
> >>> > >
> >>> > >         Thanks for raising the concern about credentials Ankur, I agree that this is a significant issue.
> >>> > >
> >>> > >         On Thu, Apr 18, 2019 at 4:23 PM Lukasz Cwik <lc...@google.com> wrote:
> >>> > >
> >>> > >             I can understand the concern about credentials; the same access concern will exist for several cross-language transforms (mostly IOs), since some will need access to credentials to read/write to an external service.
> >>> > >
> >>> > >             Are there any ideas on how credential propagation could work for these IOs?
> >>> > >
> >>> > >         There are some cases where existing IO transforms need credentials to access remote resources, for example, size estimation, validation, etc. But usually these are optional (or the transform can be configured to not perform these functions).
> >>> > >
> >>> > >     To clarify, I'm only talking about transform expansion here. Many IO transforms need read/write access to remote services at run time. So probably we need to figure out a way to propagate these credentials anyway.
> >>> > >
> >>> > >             Can we use these mechanisms for staging?
> >>> > >
> >>> > >         I think we'll have to find a way to do one of: (1) propagate credentials to other SDKs, (2) allow users to configure SDK containers to have the necessary credentials, or (3) do the artifact staging from the pipeline SDK environment, which already has credentials. I prefer (1) or (2) since this will give a transform the same feature set whether used directly (in the same SDK language as the transform) or remotely, but it might be hard to do this for an arbitrary service that a transform might connect to, considering the number of ways users can configure credentials (after an offline discussion with Ankur).
> >>> > >
> >>> > >             On Thu, Apr 18, 2019 at 3:47 PM Ankur Goenka <goe...@google.com> wrote:
> >>> > >
> >>> > >                 I agree that the Expansion service knows about the artifacts required for a cross-language transform, and having a prepackaged folder/zip for transforms based on language makes sense.
> >>> > >
> >>> > >                 One thing to note here is that the expansion service might not have the same access privileges as the pipeline author and hence might not be able to stage artifacts by itself. Keeping this in mind, I am leaning towards making the Expansion service provide all the required artifacts to the user and letting the user stage the artifacts as regular artifacts. At this time, we only have Beam File System based artifact staging, which uses local credentials to access different file systems. Even a docker-based expansion service running on a local machine might not have the same access privileges.
> >>> > >
> >>> > >                 In brief, this is what I am leaning toward: user calls for pipeline submission -> expansion service provides cross-language transforms and relevant artifacts to the SDK -> SDK submits the pipeline to the Jobserver and stages user and cross-language artifacts to the artifact staging service.
> >>> > >
> >>> > >                 On Thu, Apr 18, 2019 at 2:33 PM Chamikara Jayalath <chamik...@google.com> wrote:
> >>> > >
> >>> > >                     On Thu, Apr 18, 2019 at 2:12 PM Lukasz Cwik <lc...@google.com> wrote:
> >>> > >
> >>> > >                         Note that Max did ask whether making the expansion service do the staging made sense, and my first line was agreeing with that direction and expanding on how it could be done (so this is really Max's idea, or from whomever he got the idea from).
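The flow Ankur describes above might look roughly like this from the SDK side; all helper names here are hypothetical, and the point is only that the SDK, which holds the pipeline author's credentials, does all the staging:

    def submit(pipeline, expansion_service, job_server, user_artifacts):
        # 1. Expand external transforms; the response also lists their artifacts.
        response = expansion_service.expand(pipeline.external_transforms())
        pipeline.replace_external_transforms(response.transforms)

        # 2. The SDK, using the pipeline author's credentials, stages both
        #    the user's artifacts and those returned by the expansion service.
        staging = job_server.artifact_staging_service()
        for artifact in list(user_artifacts) + list(response.artifacts):
            staging.stage(artifact)

        # 3. Submit the pipeline to the job server.
        return job_server.run(pipeline)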
> >>> > >
> >>> > >                     +1 to what Max said then :)
> >>> > >
> >>> > >                         I believe a lot of the value of the expansion service is in not having users need to be aware of all the SDK-specific dependencies when they are trying to create a pipeline; only the "user" who is launching the expansion service may need to. And in that case we can have a prepackaged expansion service application that does what most users would want (e.g. expansion service as a docker container, a single bundled jar, ...). We (the Apache Beam community) could choose to host a default implementation of the expansion service as well.
> >>> > >
> >>> > >                     I'm not against this. But I think this is a secondary, more advanced use-case. For a Beam user that needs to use a Java transform they already have in a Python pipeline, we should provide a way to allow starting up an expansion service (with the dependencies needed for that) and running a pipeline that uses this external Java transform (with the dependencies that are needed at runtime). Probably it'll be enough to allow providing all dependencies when starting up the expansion service and allow the expansion service to do the staging of jars as well. I don't see a need to include the list of jars in the ExpansionResponse sent to the Python SDK.
> >>> > >
> >>> > >                         On Thu, Apr 18, 2019 at 2:02 PM Chamikara Jayalath <chamik...@google.com> wrote:
> >>> > >
> >>> > >                             I think there are two kinds of dependencies we have to consider.
> >>> > >
> >>> > >                             (1) Dependencies that are needed to expand the transform. These have to be provided when we start the expansion service so that available external transforms are correctly registered with the expansion service.
> >>> > >
> >>> > >                             (2) Dependencies that are not needed at expansion but may be needed at runtime.
> >>> > >
> >>> > >                             I think in both cases, users have to provide these dependencies either when the expansion service is started or when a pipeline is being executed.
> >>> > >
> >>> > >                             Max, I'm not sure why the expansion service will need to provide dependencies to the user, since the user will already be aware of these. Are you talking about an expansion service that is readily available and will be used by many Beam users? I think such a (possibly long-running) service will have to maintain a repository of transforms and should have a mechanism for registering new transforms, discovering already registered transforms, etc. I think there's more design work needed to make the transform expansion service support such use-cases. Currently, I think allowing the pipeline author to provide the jars when starting the expansion service and when executing the pipeline will be adequate.
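A sketch of providing the expansion-time dependencies at startup, as suggested above, assuming a hypothetical Java entry point and port (this is not an actual Beam launcher):

    import subprocess

    def start_expansion_service(expansion_jars, port=8097):
        # Jars needed to *expand* the transforms go on the classpath at
        # startup, so the external transforms register correctly; runtime-only
        # jars could instead be returned as artifacts in the expansion response.
        classpath = ":".join(expansion_jars)
        return subprocess.Popen([
            "java", "-cp", classpath,
            "org.example.ExpansionServiceMain",  # hypothetical entry point
            str(port),
        ])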
> >>> > >
> >>> > >                             Regarding the entity that will perform the staging, I like Luke's idea of allowing the expansion service to do the staging (of jars provided by the user). The notion of artifacts and how they are extracted/represented is SDK-dependent, so if the pipeline SDK tries to do this, we have to add n x (n - 1) configurations (for n SDKs).
> >>> > >
> >>> > >                             - Cham
> >>> > >
> >>> > >                             On Thu, Apr 18, 2019 at 11:45 AM Lukasz Cwik <lc...@google.com> wrote:
> >>> > >
> >>> > >                                 We can expose the artifact staging endpoint and artifact token to allow the expansion service to upload any resources its environment may need. For example, the expansion service for the Beam Java SDK would be able to upload jars.
> >>> > >
> >>> > >                                 In the "docker" environment, the Apache Beam Java SDK harness container would fetch the relevant artifacts for itself and be able to execute the pipeline. (Note that a docker environment could skip all this artifact staging if the docker environment contained all necessary artifacts.)
> >>> > >
> >>> > >                                 For the existing "external" environment, it should already come with all the resources prepackaged wherever "external" points to. The "process"-based environment could choose to use the artifact staging service to fetch those resources associated with its process, or it could follow the same pattern as "external" and already contain all the prepackaged resources. Note that both "external" and "process" will require the instance of the expansion service to be specialized for those environments, which is why the default for the expansion service should be the "docker" environment.
> >>> > >
> >>> > >                                 Note that a major reason for going with docker containers as the environment that all runners should support is that containers provide a solution for this exact issue. Both the "process" and "external" environments are explicitly limiting, and expanding their capabilities will quickly have us building something like a docker container, because we'll quickly find ourselves solving the same problems that docker containers solve (resources, file layout, permissions, ...).
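A hedged sketch of Luke's suggestion above; the request fields are hypothetical, not the actual Beam expansion proto. The caller hands the artifact staging endpoint and token to the expansion service so it can upload jars itself:

    def expand_with_staging(expansion_service, transform_proto,
                            staging_endpoint, staging_token):
        # The service can use the endpoint and token to upload any resources
        # (e.g. jars) its environment will need, instead of returning them.
        return expansion_service.expand(
            transform=transform_proto,
            artifact_staging_endpoint=staging_endpoint,
            artifact_staging_token=staging_token,
        )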
> >>> > >
> >>> > >                                 On Thu, Apr 18, 2019 at 11:21 AM Maximilian Michels <m...@apache.org> wrote:
> >>> > >
> >>> > >                                     Hi everyone,
> >>> > >
> >>> > >                                     We have previously merged support for configuring transforms across languages. Please see Cham's summary on the discussion [1]. There is also a design document [2].
> >>> > >
> >>> > >                                     Subsequently, we've added wrappers for cross-language transforms to the Python SDK, i.e. GenerateSequence, ReadFromKafka, and there is a pending PR [1] for WriteToKafka. All of them utilize Java transforms via cross-language configuration.
> >>> > >
> >>> > >                                     That is all pretty exciting :)
> >>> > >
> >>> > >                                     We still have some issues to solve, one being how to stage artifacts from a foreign environment. When we run external transforms which are part of Beam's core (e.g. GenerateSequence), we have them available in the SDK Harness. However, when they are not (e.g. KafkaIO), we need to stage the necessary files.
> >>> > >
> >>> > >                                     For my PR [3] I've naively added ":beam-sdks-java-io-kafka" to the SDK Harness, which caused dependency problems [4]. Those could be resolved, but the bigger question is how to stage artifacts for external transforms programmatically?
> >>> > >
> >>> > >                                     Heejong has solved this by adding a "--jar_package" option to the Python SDK to stage Java files [5]. I think that is a better solution than adding required jars to the SDK Harness directly, but it is not very convenient for users.
> >>> > >
> >>> > >                                     I've discussed this today with Thomas and we both figured that the expansion service needs to provide a list of required jars with the ExpansionResponse it provides. It's not entirely clear how we determine which artifacts are necessary for an external transform. We could just dump the entire classpath like we do in PipelineResources for Java pipelines. This provides many unneeded classes but would work.
> >>> > >
> >>> > >                                     Do you think it makes sense for the expansion service to provide the artifacts? Perhaps you have a better idea how to resolve the staging problem in cross-language pipelines?
> >>> > >
> >>> > >                                     Thanks,
> >>> > >                                     Max
> >>> > >
> >>> > >                                     [1] https://lists.apache.org/thread.html/b99ba8527422e31ec7bb7ad9dc3a6583551ea392ebdc5527b5fb4a67@%3Cdev.beam.apache.org%3E
> >>> > >
> >>> > >                                     [2] https://s.apache.org/beam-cross-language-io
> >>> > >
> >>> > >                                     [3] https://github.com/apache/beam/pull/8322#discussion_r276336748
> >>> > >
> >>> > >                                     [4] Dependency graph for beam-runners-direct-java:
> >>> > >                                     beam-runners-direct-java -> sdks-java-harness -> beam-sdks-java-io-kafka -> beam-runners-direct-java ... the cycle continues
> >>> > >                                     Beam-runners-direct-java depends on sdks-java-harness due to the infamous Universal Local Runner. Beam-sdks-java-io-kafka depends on beam-runners-direct-java for running tests.
> >>> > >
> >>> > >                                     [5] https://github.com/apache/beam/pull/8340