On Fri, Oct 12, 2018 at 10:11 AM, Charles Chen <c...@google.com> wrote:

> For context, I made comments on https://github.com/apache/beam/pull/6600
> noting that the changes being made were not good for Beam
> backwards-compatibility.  The change as is allows users to use pipeline
> options without explicitly defining them, which is not the type of usage we
> would like to encourage since we prefer to be explicit whenever possible.
> If users write pipelines with this sort of pattern, they will potentially
> encounter pain when upgrading to a later version since this is an
> implementation detail and not an officially supported pattern.  I agree
> with the comments above that this is ultimately a scoping issue.  I would
> not have a problem with these changes if they were explicitly scoped under
> either a runner or unparsed options namespace.
>
> As a second note, since the 2.8.0 release is being cut right now, because
> of these backwards-compatibility concerns, I would suggest reverting these
> changes, at least until 2.8.0 is cut, so we can have a discussion here
> before committing to and releasing any API-level changes.
>

+1 I would like to revert the changes in order not rush this into the
release. Once this discussion results in an agreement changes can be
brought back.


>
> On Fri, Oct 12, 2018 at 9:26 AM Henning Rohde <hero...@google.com> wrote:
>
>> Agree that pipeline options lack some mechanism for scoping. It is also
>> not always possible distinguish options meant to be consumed at pipeline
>> construction time, by the runner, by the SDK harness, by the user code or
>> any combination -- and this causes confusion every now and then.
>>
>> For Dataflow, we have been using "experiments" for arbitrary
>> runner-specific options. It's simply a string list pipeline option that all
>> SDKs support and, for Go at least, is sent to portable runners. Flink can
>> do the same in the short term to move forward.
>>
>> Henning
>>
>>
>> On Fri, Oct 12, 2018 at 8:50 AM Thomas Weise <t...@apache.org> wrote:
>>
>>> [moving to the list]
>>>
>>> The requirement driving this part of the change was to allow a user to
>>> specify pipeline options that a runner supports without having to declare
>>> those in each language SDK.
>>>
>>> In the specific scenario, we have options that the Flink runner supports
>>> (and can validate), that are not enumerated in the Python SDK.
>>>
>>> I think we have a bigger problem scoping pipeline options. For example,
>>> the runner options are dumped into the SDK worker. There is also a
>>> possibility of name collisions. So I think this would benefit from broader
>>> feedback.
>>>
>>> Thanks,
>>> Thomas
>>>
>>>
>>> ---------- Forwarded message ---------
>>> From: Charles Chen <notificati...@github.com>
>>> Date: Fri, Oct 12, 2018 at 8:36 AM
>>> Subject: Re: [apache/beam] [BEAM-5442] Store duplicate unknown options
>>> in a list argument (#6600)
>>> To: apache/beam <b...@noreply.github.com>
>>> Cc: Thomas Weise <thomas.we...@gmail.com>, Mention <
>>> ment...@noreply.github.com>
>>>
>>>
>>> CC: @tweise <https://github.com/tweise>
>>>
>>> —
>>> You are receiving this because you were mentioned.
>>> Reply to this email directly, view it on GitHub
>>> <https://github.com/apache/beam/pull/6600#issuecomment-429367754>, or mute
>>> the thread
>>> <https://github.com/notifications/unsubscribe-auth/AAQGDwwt15R85eq9pySUisyxq2HYz-Vyks5ukLcLgaJpZM4XMo-T>
>>> .
>>>
>>

Reply via email to