Re: RFC: Assigning environments to transforms in a pipeline

2019-10-17 Thread Chamikara Jayalath
Sounds great.

One thing we should probably address is identifying environments that are
incompatible with a runner, and failing the pipeline early when such an
incompatibility is detected. I think we currently assign environments to
transforms in the following well-defined places:

(1) for cross-language transforms, the expansion service returns the
correct environment that should be used
(2) for everything else, the runner sets a proper default environment

If users can set arbitrary environments for transforms, we should probably
have mechanisms in place so that runners can detect incompatibilities early
and error out with clear error messages.
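
As a rough illustration of the kind of early check meant here, consider
the sketch below. It is hypothetical and is not Beam's API: the
Environment class, validate_environments, IncompatibleEnvironmentError,
and the idea that a runner advertises a set of supported environment URNs
are all assumptions made for the example, and the URN strings are
illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical stand-in for an environment, identified by a URN."""
    urn: str


class IncompatibleEnvironmentError(ValueError):
    """Raised before submission when a runner cannot honor an environment."""


def validate_environments(transform_envs, supported_urns):
    """Fail fast if any transform uses an environment the runner lacks.

    transform_envs: dict mapping transform name -> Environment (assumed shape).
    supported_urns: set of URNs the runner claims to support (assumed shape).
    """
    problems = [
        f"{name} uses {env.urn}"
        for name, env in sorted(transform_envs.items())
        if env.urn not in supported_urns
    ]
    if problems:
        raise IncompatibleEnvironmentError(
            "Runner does not support these environments: "
            + "; ".join(problems)
            + ". Supported: "
            + ", ".join(sorted(supported_urns))
        )


if __name__ == "__main__":
    envs = {
        "ReadInput": Environment("beam:env:docker:v1"),
        "CustomStep": Environment("beam:env:process:v1"),
    }
    try:
        validate_environments(envs, {"beam:env:docker:v1"})
    except IncompatibleEnvironmentError as err:
        print(err)
```

The point of collecting every offending transform before raising is that
the user sees all incompatibilities in one error message rather than
fixing them one at a time.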

Thanks,
Cham

On Wed, Oct 16, 2019 at 6:15 PM Chad Dombrova wrote:

> Hi Robert,
>
>> Sounds nice. Is there a design doc (or, perhaps, you could just give an
>> example of what this would look like in this thread)?
>>
>
> I'll follow up shortly with something.  The good news is that this first
> PR is quite straightforward and (I think) is independent of the semantics
> of how an Environment will ultimately be used.  This PR just answers the
> questions: "how do we represent an Environment outside of the portability
> framework, and how do we convert that to and from the portability
> representation?".  We already have very well established patterns for these
> questions.
>
> -chad


Re: RFC: Assigning environments to transforms in a pipeline

2019-10-16 Thread Chad Dombrova
Hi Robert,

> Sounds nice. Is there a design doc (or, perhaps, you could just give an
> example of what this would look like in this thread)?
>

I'll follow up shortly with something.  The good news is that this first PR
is quite straightforward and (I think) is independent of the semantics of
how an Environment will ultimately be used.  This PR just answers the
questions: "how do we represent an Environment outside of the portability
framework, and how do we convert that to and from the portability
representation?".  We already have very well established patterns for these
questions.
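
To make those two questions concrete before the fuller write-up, here is
a minimal sketch of the general shape. It is not the code in the PR: the
class names, the to_portable/from_portable method names, the registry,
and the "example:env:docker" URN are all hypothetical, and a plain dict
stands in for the portability protobuf message.

```python
from dataclasses import dataclass


class Environment:
    """Hypothetical base class for SDK-side environment objects."""

    URN = None       # each subclass sets its own identifier
    _registry = {}   # maps URN -> subclass, used when decoding

    @classmethod
    def register(cls, subclass):
        """Class decorator that records a subclass under its URN."""
        cls._registry[subclass.URN] = subclass
        return subclass

    def to_portable(self):
        """Convert to the portable form (a dict stands in for the proto)."""
        return {"urn": self.URN, "payload": self._payload()}

    @classmethod
    def from_portable(cls, message):
        """Rebuild the matching subclass from the portable form."""
        subclass = cls._registry[message["urn"]]
        return subclass._from_payload(message["payload"])


@Environment.register
@dataclass(frozen=True)
class DockerEnvironment(Environment):
    URN = "example:env:docker"  # illustrative URN, not a real one
    container_image: str

    def _payload(self):
        return {"container_image": self.container_image}

    @classmethod
    def _from_payload(cls, payload):
        return cls(payload["container_image"])


if __name__ == "__main__":
    env = DockerEnvironment("my-image:latest")
    message = env.to_portable()
    assert Environment.from_portable(message) == env
```

Dispatching on a URN through a registry keeps the base class unaware of
its subclasses, so new environment types can be added without touching
the decoding logic.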

-chad


Re: RFC: Assigning environments to transforms in a pipeline

2019-10-16 Thread Robert Bradshaw
Sounds nice. Is there a design doc (or, perhaps, you could just give an
example of what this would look like in this thread)?

On Wed, Oct 16, 2019 at 5:51 PM Chad Dombrova wrote:

> Hi all,
> One of our goals for the portability framework is to be able to assign
> different environments to different segments of a pipeline.  This is not
> possible right now because environments are a concept that really only
> exists in the portable runner, as protobuf messages:  they lack a proper
> API on the pipeline definition side of things.
>
> As a first step toward our goal, one of our team members just created a
> PR[1] exposing environments as a proper class hierarchy, akin to
> PTransforms, PCollections, Coders, etc. It's quite straightforward, and we
> were careful to adhere to existing patterns for similar types, so hopefully
> the end result feels natural.  After this PR is merged, our next step will
> be to create a proposal for assigning environments to transforms.
>
> Let us know what you think!
>
> -chad
>
> [1] https://github.com/apache/beam/pull/9811