After Kenneth review we have a few points to discuss and I'd like to
benefit from the mail to add some more thoughts than on github:

1. pivot format or not: in my PR I used the CLI format to just delegate to
fromArgs the parsing once the properties were converted to args. I like
that cause it enforces args syntax to work and prevent to abuse of an
internal pivot format which would support more than what a CLI can support
- like the JSON syntax which can be abused today. It also makes the code
very simple with a single entry code path which is quite neat IMHO.
2. default properties = env + system properties: this is what all config
libs do (spring config, tamaya, deltaspike, microprofile-config, ...) so it
is very comfortable and sane for end users. It enables ops (env) and dev
(system properties) to feel comfortable for free with this and avoids us to
have N entry points which has the nice side effect to consolidate the
overall consistency of the framework/lib.
3. default mecanism: I have in the PR a fromJvm() which uses the previous
(2) mecanism and filters entries using "beam." prefix. Intent is to
encourage integrators to use that instead of a custom prefix by default to
have some consistency accross frameworks. I understand there is not always
a single set of config per JVM but it is also a very common case to have
it. I estimate it at 85% for the pipeline creation part (I'm not speaking
of the workers here since it is way before that). Personally I think if the
investment is low (3 lines) and we cover some more use cases it does worth
it anyway. In this case it is a lot of cases so really important to do the
small quick win associated.
4. support null: I'd like to support null as a valid valid for properties.
The rational here is to make it neat for end user and support:
fromProperties(loadProps()) with a loadProps in the user codebase which can
return null. We just default to create(). For us it iss pretty much nothing
but it enables end users to not care about it. Concretely one null check
versus millions (or thousands at least ;)).

To summarize the overall philosophy: code is very simple and I don't care
much what it looks like, what I worry about is if the solution is usable or
not and if it is easy *for the end users* whatever it costs at beam level
because this part is not very impacting for beam itself. I understand the
"beam doesnt use a null parameter" spirit but it impacts users so it
shouldn't be a surfacing constraint and we should just handle it for users
IMHO. Same for env/system properties discussion. If we handle system
properties ops will expect we handle env variables so just do it, doesn't
cost anything for us and enable more use cases so it is a clear win-win.

Hope it makes sense



Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-02-14 15:48 GMT+01:00 Romain Manni-Bucau <rmannibu...@gmail.com>:

> this is part of the PR without any transformation - as requested. I'm
> happy to do a follow up PR to do the same transformations than in
> deltaspike for instance: https://github.com/apache/deltaspike/blob/master/
> deltaspike/core/impl/src/main/java/org/apache/deltaspike/core/impl/config/
> EnvironmentPropertyConfigSource.java#L55 or a more fancy - but real world
> - one like key -> key.substring(prefix.length())
> .toUpperCase(ROOT).replace('.', '_')
>
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> <https://rmannibucau.metawerx.net/> | Old Blog
> <http://rmannibucau.wordpress.com> | Github
> <https://github.com/rmannibucau> | LinkedIn
> <https://www.linkedin.com/in/rmannibucau> | Book
> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>
> 2018-02-14 15:42 GMT+01:00 Ismaël Mejía <ieme...@gmail.com>:
>
>> +1 to Romain idea with Kenn's approach, we should probably reserve the
>> 'beam.' namespace too because we are already using it for system
>> properties in the spark runner, following along the convention of
>> 'spark.*' in Apache Spark.
>>
>> Any ideas on how to handle this for the env vraiables case ? maybe this?
>> export my.random.namespace="SomeOption=bizzle SomeOtherOption=whatever"
>>
>>
>> On Wed, Feb 14, 2018 at 9:35 AM, Romain Manni-Bucau
>> <rmannibu...@gmail.com> wrote:
>> > FYI created https://github.com/apache/beam/pull/4683 about it
>> >
>> >
>> > Romain Manni-Bucau
>> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >
>> > 2018-02-13 19:21 GMT+01:00 Romain Manni-Bucau <rmannibu...@gmail.com>:
>> >>
>> >> Oki, will try a PR after the classloader one then. Thanks a lot.
>> >>
>> >> Le 13 févr. 2018 19:14, "Lukasz Cwik" <lc...@google.com> a écrit :
>> >>>
>> >>> The one in Dataflow 1.x was one system property that contained all the
>> >>> JSON so it wasn't exactly what you were looking for.
>> >>>
>> >>> On Tue, Feb 13, 2018 at 9:51 AM, Romain Manni-Bucau
>> >>> <rmannibu...@gmail.com> wrote:
>> >>>>
>> >>>> I like your proposal Kenneth. Perfectly fits my use case and
>> deployment
>> >>>> one as well - when ops configure the env without modifying the code.
>> >>>>
>> >>>> How do we move forward on that? Should I send a PR or do you want to
>> >>>> import what was in dataflow?
>> >>>>
>> >>>>
>> >>>> Romain Manni-Bucau
>> >>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >>>>
>> >>>> 2018-02-13 18:49 GMT+01:00 Lukasz Cwik <lc...@google.com>:
>> >>>>>
>> >>>>> PipelineOptionsFactory.fromSystemProperties did exist in Dataflow
>> 1.x
>> >>>>> and was dropped for the reason that Ken mentioned.
>> >>>>>
>> >>>>> On Tue, Feb 13, 2018 at 9:46 AM, Kenneth Knowles <k...@google.com>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Pipeline options are not global - they are a property of a single
>> job.
>> >>>>>> The TestPipeline reads them from a very particular system property
>> because
>> >>>>>> it is a special testing rule.
>> >>>>>>
>> >>>>>> If you want a generic way to build pipeline options from a set of
>> >>>>>> system properties, it should be from an explicit prefix, not global
>> >>>>>> defaults. So if a user wants to put a bunch of options into
>> properties, they
>> >>>>>> choose a namespace like "my.random.namespace" and everything under
>> it can be
>> >>>>>> interpreted as a pipelineoption:
>> >>>>>>
>> >>>>>> my.random.namespace.SomeOption=bizzle
>> >>>>>> my.random.namespace.SomeOtherOption=whatever
>> >>>>>>
>> >>>>>> And you could do
>> >>>>>> PipelineOptionsFactory.fromSystemProperties("my.random.namespace")
>> >>>>>>
>> >>>>>> We should use the full name of the option not split it with dots -
>> >>>>>> dots represent hierarchy not words separation.
>> >>>>>>
>> >>>>>> interface FooOptions extends PipelineOptions {
>> >>>>>>   getSomeOption()
>> >>>>>>   getSomeOtherOption()
>> >>>>>> }
>> >>>>>>
>> >>>>>> I think I can be +0.5 on this. We might also reserve a namespace
>> like
>> >>>>>> "beam.TestPipeline.options" and use it for specification of test
>> config.
>> >>>>>> Could be easier than embedding JSON in a property, in some cases.
>> Easier to
>> >>>>>> override little pieces for sure.
>> >>>>>>
>> >>>>>> Kenn
>> >>>>>>
>> >>>>>> On Tue, Feb 13, 2018 at 9:29 AM, Romain Manni-Bucau
>> >>>>>> <rmannibu...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> makes sense, do we want beam.foo.bar -> --foo-bar conversion too?
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Romain Manni-Bucau
>> >>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >>>>>>>
>> >>>>>>> 2018-02-13 18:19 GMT+01:00 Eugene Kirpichov <kirpic...@google.com
>> >:
>> >>>>>>>>
>> >>>>>>>> Neutral about this one: haven't seen a case where this was
>> needed,
>> >>>>>>>> but don't see anything wrong with it either. One thing I'd
>> recommend if you
>> >>>>>>>> go through with it, extract from system properties under "beam."
>> rather than
>> >>>>>>>> all of them, to avoid clashes.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Tue, Feb 13, 2018, 7:53 AM Jean-Baptiste Onofré <
>> j...@nanthrax.net>
>> >>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>> Hi Romain,
>> >>>>>>>>>
>> >>>>>>>>> it sounds interesting to me, and doesn't break anything, so +1
>> from
>> >>>>>>>>> my side.
>> >>>>>>>>>
>> >>>>>>>>> Regards
>> >>>>>>>>> JB
>> >>>>>>>>>
>> >>>>>>>>> On 02/13/2018 03:42 PM, Romain Manni-Bucau wrote:
>> >>>>>>>>> > Hi guys,
>> >>>>>>>>> >
>> >>>>>>>>> > there are hacks in beam testing code to read the args from a
>> >>>>>>>>> > system property but
>> >>>>>>>>> > I wonder if we shouldnt add a
>> >>>>>>>>> > PipelineOptionsFactory.fromSystemProperties().
>> >>>>>>>>> >
>> >>>>>>>>> > It would iterate over the system properties and take all
>> >>>>>>>>> > --xxx=foo as potential
>> >>>>>>>>> > argument it tries to bind.
>> >>>>>>>>> >
>> >>>>>>>>> > Rational behind that is to enable users to wrap the pipeline
>> API
>> >>>>>>>>> > but still
>> >>>>>>>>> > expose the pipeline options to end users for advanced cases.
>> >>>>>>>>> >
>> >>>>>>>>> > Any discussion on this kind of usages already? What do you
>> think
>> >>>>>>>>> > of this proposal?
>> >>>>>>>>> >
>> >>>>>>>>> > Side note: we can think about a fromEnv() too.
>> >>>>>>>>> >
>> >>>>>>>>> > Romain Manni-Bucau
>> >>>>>>>>> > @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>> >>>>>>>>> > <https://rmannibucau.metawerx.net/> | Old Blog
>> >>>>>>>>> > <http://rmannibucau.wordpress.com> | Github
>> >>>>>>>>> > <https://github.com/rmannibucau> |
>> >>>>>>>>> > LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
>> >>>>>>>>> >
>> >>>>>>>>> > <https://www.packtpub.com/application-development/java-ee-8-
>> high-performance>
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> Jean-Baptiste Onofré
>> >>>>>>>>> jbono...@apache.org
>> >>>>>>>>> http://blog.nanthrax.net
>> >>>>>>>>> Talend - http://www.talend.com
>> >>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >
>>
>
>

Reply via email to