Hi All,

I've developed a version of this service using Docker Compose and it's
available here: https://github.com/apache/beam/pull/26023

Currently, it consists of a controller container and a single expansion
service container (Java), but I hope to add a Python expansion service
container as well.

This can be used to easily start a service that hosts Beam transforms.

Once started, this service can be used by as many pipelines as needed to
expand/discover portable transforms available in Beam.

Specifically, for multi-language pipelines, this has the following benefits:

* No need to install runtimes for other languages when running pipelines in
a given language.
* No need to download external artifacts (for example, shaded expansion
service jar files). They will be served using local artifacts included in
the container. Also, in the future we can modify/optimize the set of
dependencies without updating the wrappers.
* No need to deal with multiple endpoints. A single endpoint serves all
expansion services.

I propose adding this to Beam and updating multi-language wrappers to use
this service when Docker is available on the system.
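
A minimal sketch of how a wrapper might make that choice, assuming that
"Docker is available" means a `docker` executable can be found on PATH. The
function name and the two backend labels are hypothetical and are not taken
from the PR or from the Beam codebase:

```python
import shutil
from typing import Callable, Optional

# Hypothetical labels for illustration only; not names used by Beam or the PR.
TRANSFORM_SERVICE = "docker-transform-service"
LOCAL_EXPANSION_SERVICE = "local-expansion-service"


def choose_expansion_backend(
    find_executable: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Prefer the Docker-based service when a `docker` binary is on PATH.

    `find_executable` is injectable so the decision can be exercised
    without Docker actually being installed.
    """
    if find_executable("docker") is not None:
        return TRANSFORM_SERVICE
    # Assumed fallback: the existing behavior of starting an expansion
    # service process locally.
    return LOCAL_EXPANSION_SERVICE
```

Note that finding the binary does not guarantee that the Docker daemon is
running, so a real wrapper would probably need a stronger check than this.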

Please let me know if you have any comments or questions.

Thanks,
Cham

On Fri, Feb 10, 2023 at 4:00 PM Luke Cwik <lc...@google.com> wrote:

> Seems like a useful thing to me and will make it easier for Beam users
> overall.
>
> On Fri, Feb 10, 2023 at 3:56 PM Robert Bradshaw via dev <
> dev@beam.apache.org> wrote:
>
>> Thanks. I added some comments to the doc.
>>
>> On Mon, Feb 6, 2023 at 1:33 PM Chamikara Jayalath via dev
>> <dev@beam.apache.org> wrote:
>> >
>> > Hi All,
>> >
>> > Beam PTransforms are currently identified primarily as operations in a
>> pipeline that perform specific tasks. PTransform implementations were
>> traditionally linked to specific Beam SDKs.
>> >
>> > With the advent of the portability framework, multi-language pipelines, and
>> expansion services that can be used to build/expand and discover
>> transforms, we have an opportunity to make this more general and
>> re-introduce Beam PTransforms as computation units that can serve any
>> use-case that needs to discover or use Beam transforms. For example, any
>> Beam SDK that runs a pipeline using a portable Beam runner should be able
>> to use a transform offered through an expansion service irrespective of the
>> implementation SDK of the transform or the pipeline.
>> >
>> > I believe we can make such use-cases much easier to manage by
>> introducing a user-deployable service that encapsulates existing Beam
>> expansion services in the form of a Kubernetes cluster. The service will
>> offer a single gRPC endpoint and will include Beam expansion services
>> developed in different languages. Any Beam pipeline, irrespective of the
>> pipeline SDK, should be able to use any transform offered by the service.
>> >
>> > This will also offer a way to make multi-language pipeline execution,
>> which currently relies on locally downloaded large dependencies and locally
>> started expansion service processes, more robust.
>> >
>> > I have written a proposal for implementing such a service and it's
>> available at https://s.apache.org/beam-transform-service.
>> >
>> > Please take a look and let me know if you have any comments or
>> questions.
>> >
>> > Thanks,
>> > Cham
>>
>
