On Wed, Nov 13, 2019 at 10:42 AM Luke Cwik <[email protected]> wrote:

> The original idea was that only the payloads that required the attribute
> would contain it, but once something becomes common enough it makes sense
> to have it as an optional parameter, so +1.
>
> Are there areas where the environment id will still exist outside of a
> PTransform?
>

The only scenario I can think of is support for first order functions (UDFs)
in cross-language transforms, where a function might have to be executed in
a different environment than the PTransform. But I don't think we should
make the very common case, where a PTransform and its associated functions
run in the same environment, hard or error-prone because of this. We could
later allow an environment (and any other properties we need) to be
specified along with the associated functions, when we design support for
first order functions in cross-language transforms.
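
To make the difference concrete, here is a minimal Python sketch of the two
lookups being compared. The dataclasses are simplified stand-ins written for
illustration, not the real Beam protos, and the helper names
(env_from_payload, env_from_transform) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FunctionSpec:
    # Stand-in for a function spec nested inside a payload.
    environment_id: str = ""


@dataclass
class ParDoPayload:
    do_fn: FunctionSpec


@dataclass
class CombinePayload:
    combine_fn: FunctionSpec


@dataclass
class PTransform:
    unique_name: str
    payload: Optional[object] = None
    # Proposed top-level attribute; an empty string means "not set".
    environment_id: str = ""


def env_from_payload(transform: PTransform) -> str:
    """Payload-based lookup: one branch per payload type."""
    payload = transform.payload
    if isinstance(payload, ParDoPayload):
        return payload.do_fn.environment_id
    if isinstance(payload, CombinePayload):
        return payload.combine_fn.environment_id
    # Every additional payload type would need another branch here.
    return ""


def env_from_transform(transform: PTransform) -> str:
    """Top-level lookup: a single read, independent of the payload type."""
    return transform.environment_id


old_style = PTransform(
    "ParDo(Foo)", payload=ParDoPayload(FunctionSpec("py_env")))
new_style = PTransform(
    "ParDo(Foo)", payload=ParDoPayload(FunctionSpec()),
    environment_id="py_env")
```

With the top-level attribute, a runner needs only env_from_transform, and
adding a new payload type does not require touching the lookup code.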

Thanks,
Cham


>
>
> On Tue, Nov 12, 2019 at 9:25 PM Chamikara Jayalath <[email protected]>
> wrote:
>
>> This was discussed in a JIRA [1], but I don't think it was mentioned on
>> the dev list.
>>
>> Not having environment_id as a top level attribute of PTransform [2]
>> makes it difficult to track the Environment [3] a given PTransform should
>> be executed in. For example, in Dataflow, we have to fork code in several
>> places to filter out the Environment from a given PTransform proto.
>>
>> Making environment_id a top level attribute of PTransform and removing it
>> from various payload types will make tracking environments easier. Also,
>> code will become less error-prone since we won't have to fork for every
>> possible payload type.
>>
>> Any objections to making this change?
>>
>> Thanks,
>> Cham
>>
>> [1] https://issues.apache.org/jira/browse/BEAM-7850
>> [2]
>> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L99
>> [3]
>> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L1021
>>
>
