Romain, how has Jackson been a classpath breaker?

On Tue, Jan 9, 2018 at 1:20 PM, Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:

> Hmm, beam already owns the cli parsing - that is what I meant - it only
> misses the arg delimiter (ie quoting) and adding it is easy no?
>
> Le 9 janv. 2018 21:19, "Robert Bradshaw" <rober...@google.com> a écrit :
>
>> On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau
>> <rmannibu...@gmail.com> wrote:
>> >
>> > Le 9 janv. 2018 19:50, "Robert Bradshaw" <rober...@google.com> a écrit
>> :
>> >
>> > From what I understand:
>> >
>> > 1) The command line argument values (both simple and more complex) are
>> > all JSON-representable.
>> >
>> > And must be all CLI representable
>>
>> Sorry, I should have said "all pipeline options." In any case, one can
>> always do JSON -> string and the resulting string is usable as a
>> command line argument.
>>
>> > 2) The command line is a mapping of keys to these values.
>> >
>> > Which makes your 1 not true since json supports more ;)
>> >
>> > As such, it seems quite natural to represent the whole set as a single
>> > JSON map, rather than using a different, custom encoding for the top
>> > level (whose custom escaping would have to be carried into the inner
>> > JSON values). Note that JSON has the advantage that one never needs to
>> > explain or define it, and parsers/serializers already exists for all
>> > languages (e.g. if one has a driver script in another language for
>> > launching a java pipeline, it's easy to communicate all the args).
>> >
>> > Same reasonning applies to CLI AFAIK
>>
>> The spec of what a valid command line argument list is is surprisingly
>> inconsistent across platforms, languages, and programs. And your
>> proposal seems to be getting into what the delimiter is, and how to
>> escape it, and possibly then how to escape the escape character. All
>> of this is sidestepped by pointing at an existing spec.
>>
>> > We can't get rid of Jackson in the core because of (1) so there's
>> > little value in adding complexity to remove it from (2). The fact that
>> > Java doesn't ship anything in its expansive standard library for this
>> > is unfortuante, so we have to take a dependency on something.
>> >
>> > We actually can as shown before
>>
>> How, if JSON is integral to the parsing of the argument values
>> themselves? (Or is the argument that it's not, but each extender of
>> PipelineOptions is responsible for choosing a JSON parser and doing it
>> themselves.)
>>
>> > and jackson is not a simple negligible lib
>> > but a classpath breaker :(
>>
>> Perhaps there are better libraries we could use instead? Is the
>> ecosystem so bad that it encourages projects to roll their own of
>> everything rather than share code?
>>
>> > On Tue, Jan 9, 2018 at 10:00 AM, Lukasz Cwik <lc...@google.com> wrote:
>> >> Ah, you don't want JSON within JSON. I see, if thats the case just
>> migrate
>> >> all of them to string tokenization and drop the Jackson usage for
>> string
>> >> ->
>> >> string[] conversion.
>> >>
>> >> On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau
>> >> <rmannibu...@gmail.com>
>> >> wrote:
>> >>>
>> >>> Lukasz, the use case is to znsure the config used can still map the
>> CLI
>> >>> and that options dont start to abuse json so it is exactly the
>> opposite
>> >>> of
>> >>> "we can be fancy" and is closer to "we can be robust". Also the
>> default
>> >>> should be easy and not json (i just want to set the runner,
>> --runner=xxx
>> >>> is
>> >>> easier than a json version). If the value doesnt start with a [ we can
>> >>> use
>> >>> the tokenizer else jackson, wdyt?
>> >>>
>> >>>
>> >>> Le 8 janv. 2018 22:59, "Lukasz Cwik" <lc...@google.com> a écrit :
>> >>>
>> >>> Now that I think about this more. I looked at some of the examples in
>> the
>> >>> pom.xml and they don't seem to be tricky to write in JSON.
>> >>> I also looked at the Jenkins job configurations (specifically the
>> >>> performance tests) and they pass around maps which they convert to the
>> >>> required format without needing a user to write it themselves.
>> >>> Using gradle, we will be able to trivially to do the same thing
>> (convert
>> >>> maps to json without needing the person to write it by hand) like
>> groovy
>> >>> does.
>> >>> Since we are migrating away from Maven it doesn't seem worthwhile to
>> >>> spend
>> >>> time on to make it easier to write the args in the Maven poms.
>> >>>
>> >>> Is there another use case that is being missed?
>> >>>
>> >>> On Mon, Jan 8, 2018 at 1:38 PM, Romain Manni-Bucau
>> >>> <rmannibu...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Good point for \t and ,.
>> >>>>
>> >>>> Any objection to use jackson as a fallback for that purpose - for
>> >>>> backwqrd compat only - and make it optional then? Will create the
>> ticket
>> >>>> if
>> >>>> not.
>> >>>>
>> >>>> Le 8 janv. 2018 20:32, "Robert Bradshaw" <rober...@google.com> a
>> écrit :
>> >>>>>
>> >>>>> Part of the motivation to use JSON for more complex options was that
>> >>>>> it avoids having to define (and document, test, have users learn,
>> ...)
>> >>>>> yet another format for expressing lists, maps, etc.
>> >>>>>
>> >>>>> On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik <lc...@google.com>
>> wrote:
>> >>>>> > Ken, this is specifically about running integration tests and not
>> a
>> >>>>> > users
>> >>>>> > main().
>> >>>>> >
>> >>>>> > Note, that PipelineOptions JSON format was used because it was a
>> >>>>> > convenient
>> >>>>> > serialized form that is easy to explain to people what is
>> required.
>> >>>>> > Using a different string tokenizer and calling
>> >>>>> > PipelineOptionsFactory.fromArgs() with the parsed strings seems
>> like
>> >>>>> > it
>> >>>>> > would be better.
>> >>>>> >
>> >>>>> > These are the supported formats for fromArgs():
>> >>>>> >    *   --project=MyProject (simple property, will set the
>> "project"
>> >>>>> > property
>> >>>>> > to "MyProject")
>> >>>>> >    *   --readOnly=true (for boolean properties, will set the
>> >>>>> > "readOnly"
>> >>>>> > property to "true")
>> >>>>> >    *   --readOnly (shorthand for boolean properties, will set the
>> >>>>> > "readOnly"
>> >>>>> > property to "true")
>> >>>>> >    *   --x=1 --x=2 --x=3 (list style simple property, will set the
>> >>>>> > "x"
>> >>>>> > property to [1, 2, 3])
>> >>>>> >    *   --x=1,2,3 (shorthand list style simple property, will set
>> the
>> >>>>> > "x"
>> >>>>> > property to [1, 2, 3])
>> >>>>> >    *   --complexObject='{"key1":"value1",...} (JSON format for
>> all
>> >>>>> > other
>> >>>>> > complex types)
>> >>>>> >
>> >>>>> > Using a string tokenizer that minimizes the number of required
>> escape
>> >>>>> > characters would be good so we could use newline characters as our
>> >>>>> > only
>> >>>>> > token. I would avoid ',\t ' as tokens since they are more likely
>> to
>> >>>>> > appear.
>> >>>>> >
>> >>>>> > On Mon, Jan 8, 2018 at 10:33 AM, Kenneth Knowles <k...@google.com>
>> >>>>> > wrote:
>> >>>>> >>
>> >>>>> >> We do have a plain command line syntax, and whoever writes the
>> >>>>> >> main(String[]) function is responsible for invoking the parser.
>> It
>> >>>>> >> isn't
>> >>>>> >> quite as nice as standard arg parse libraries, but it isn't too
>> bad.
>> >>>>> >> It
>> >>>>> >> would be great to improve, though.
>> >>>>> >>
>> >>>>> >> Jackson is for machine-to-machine communication or other
>> situations
>> >>>>> >> where
>> >>>>> >> command line parsing doesn't work so well.
>> >>>>> >>
>> >>>>> >> Are we using these some other way?
>> >>>>> >>
>> >>>>> >> Kenn
>> >>>>> >>
>> >>>>> >> On Sun, Jan 7, 2018 at 7:21 AM, Romain Manni-Bucau
>> >>>>> >> <rmannibu...@gmail.com>
>> >>>>> >> wrote:
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>> Le 7 janv. 2018 12:53, "Jean-Baptiste Onofré" <j...@nanthrax.net>
>> a
>> >>>>> >>> écrit :
>> >>>>> >>>
>> >>>>> >>> Hi Romain,
>> >>>>> >>>
>> >>>>> >>> I guess you are assuming that pipeline options are flat command
>> >>>>> >>> line
>> >>>>> >>> like
>> >>>>> >>> argument, right ?
>> >>>>> >>>
>> >>>>> >>> Actually, theoretically,  pipeline options can be represented as
>> >>>>> >>> json,
>> >>>>> >>> that's why we use jackson.
>> >>>>> >>> The pipeline options can be serialized/deserialized as json.
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>> Yes but if users (or dev ;)) start to use that then it is
>> trivial
>> >>>>> >>> to
>> >>>>> >>> break the cli handling and fromArgs parsing or, if not, break
>> the
>> >>>>> >>> user
>> >>>>> >>> experience. So at the end it is a kind of "yes but no", right?
>> >>>>> >>>
>> >>>>> >>> PS: already see some advanced users having a headache trying to
>> >>>>> >>> pass
>> >>>>> >>> pipeline options in json so using the plain command line syntax
>> can
>> >>>>> >>> be more
>> >>>>> >>> friendly too.
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>> The purpose is to remove the jackson dependencies ?
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>> Later yes, seeing core dep tree I identify a lot of dep which
>> can
>> >>>>> >>> conflict in some env and are not really needed or bring much
>> being
>> >>>>> >>> in the
>> >>>>> >>> core - like avro as mentionned in another thread. Can need a
>> >>>>> >>> sanitizing
>> >>>>> >>> round. Short term it was really a "why is it that complicated"
>> ;).
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>> Regards
>> >>>>> >>> JB
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>> On 01/07/2018 11:38 AM, Romain Manni-Bucau wrote:
>> >>>>> >>>>
>> >>>>> >>>> Hi guys,
>> >>>>> >>>>
>> >>>>> >>>> not sure i fully get why jackson is used to parse pipeline
>> options
>> >>>>> >>>> in
>> >>>>> >>>> the testing integration
>> >>>>> >>>>
>> >>>>> >>>> why not using a token parser like [1] which allows to map 1-1
>> with
>> >>>>> >>>> the
>> >>>>> >>>> user interface (command line) the options?
>> >>>>> >>>>
>> >>>>> >>>> [1]
>> >>>>> >>>>
>> >>>>> >>>>
>> >>>>> >>>> https://github.com/Talend/component-runtime/blob/master/comp
>> onent-server/src/main/java/org/talend/sdk/component/serve
>> r/lang/StringPropertiesTokenizer.java
>> >>>>> >>>>
>> >>>>> >>>> Romain Manni-Bucau
>> >>>>> >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog
>> >>>>> >>>> <https://rmannibucau.metawerx.net/> | Old Blog
>> >>>>> >>>> <http://rmannibucau.wordpress.com> | Github
>> >>>>> >>>> <https://github.com/rmannibucau>
>> >>>>> >>>> | LinkedIn <https://www.linkedin.com/in/rmannibucau>
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>> --
>> >>>>> >>> Jean-Baptiste Onofré
>> >>>>> >>> jbono...@apache.org
>> >>>>> >>> http://blog.nanthrax.net
>> >>>>> >>> Talend - http://www.talend.com
>> >>>>> >>>
>> >>>>> >>>
>> >>>>> >>
>> >>>>> >
>> >>>
>> >>>
>> >>>
>> >>
>> >
>> >
>>
>

Reply via email to