On Fri, Oct 12, 2018 at 10:11 AM, Charles Chen <c...@google.com> wrote:
> For context, I made comments on https://github.com/apache/beam/pull/6600 > noting that the changes being made were not good for Beam > backwards-compatibility. The change as is allows users to use pipeline > options without explicitly defining them, which is not the type of usage we > would like to encourage since we prefer to be explicit whenever possible. > If users write pipelines with this sort of pattern, they will potentially > encounter pain when upgrading to a later version since this is an > implementation detail and not an officially supported pattern. I agree > with the comments above that this is ultimately a scoping issue. I would > not have a problem with these changes if they were explicitly scoped under > either a runner or unparsed options namespace. > > As a second note, since the 2.8.0 release is being cut right now, because > of these backwards-compatibility concerns, I would suggest reverting these > changes, at least until 2.8.0 is cut, so we can have a discussion here > before committing to and releasing any API-level changes. > +1 I would like to revert the changes in order not rush this into the release. Once this discussion results in an agreement changes can be brought back. > > On Fri, Oct 12, 2018 at 9:26 AM Henning Rohde <hero...@google.com> wrote: > >> Agree that pipeline options lack some mechanism for scoping. It is also >> not always possible distinguish options meant to be consumed at pipeline >> construction time, by the runner, by the SDK harness, by the user code or >> any combination -- and this causes confusion every now and then. >> >> For Dataflow, we have been using "experiments" for arbitrary >> runner-specific options. It's simply a string list pipeline option that all >> SDKs support and, for Go at least, is sent to portable runners. Flink can >> do the same in the short term to move forward. >> >> Henning >> >> >> On Fri, Oct 12, 2018 at 8:50 AM Thomas Weise <t...@apache.org> wrote: >> >>> [moving to the list] >>> >>> The requirement driving this part of the change was to allow a user to >>> specify pipeline options that a runner supports without having to declare >>> those in each language SDK. >>> >>> In the specific scenario, we have options that the Flink runner supports >>> (and can validate), that are not enumerated in the Python SDK. >>> >>> I think we have a bigger problem scoping pipeline options. For example, >>> the runner options are dumped into the SDK worker. There is also a >>> possibility of name collisions. So I think this would benefit from broader >>> feedback. >>> >>> Thanks, >>> Thomas >>> >>> >>> ---------- Forwarded message --------- >>> From: Charles Chen <notificati...@github.com> >>> Date: Fri, Oct 12, 2018 at 8:36 AM >>> Subject: Re: [apache/beam] [BEAM-5442] Store duplicate unknown options >>> in a list argument (#6600) >>> To: apache/beam <b...@noreply.github.com> >>> Cc: Thomas Weise <thomas.we...@gmail.com>, Mention < >>> ment...@noreply.github.com> >>> >>> >>> CC: @tweise <https://github.com/tweise> >>> >>> — >>> You are receiving this because you were mentioned. >>> Reply to this email directly, view it on GitHub >>> <https://github.com/apache/beam/pull/6600#issuecomment-429367754>, or mute >>> the thread >>> <https://github.com/notifications/unsubscribe-auth/AAQGDwwt15R85eq9pySUisyxq2HYz-Vyks5ukLcLgaJpZM4XMo-T> >>> . >>> >>