Romain, how has Jackson been a classpath breaker?
On Tue, Jan 9, 2018 at 1:20 PM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote: > Hmm, beam already owns the cli parsing - that is what I meant - it only > misses the arg delimiter (ie quoting) and adding it is easy no? > > Le 9 janv. 2018 21:19, "Robert Bradshaw" <rober...@google.com> a écrit : > >> On Tue, Jan 9, 2018 at 11:48 AM, Romain Manni-Bucau >> <rmannibu...@gmail.com> wrote: >> > >> > Le 9 janv. 2018 19:50, "Robert Bradshaw" <rober...@google.com> a écrit >> : >> > >> > From what I understand: >> > >> > 1) The command line argument values (both simple and more complex) are >> > all JSON-representable. >> > >> > And must be all CLI representable >> >> Sorry, I should have said "all pipeline options." In any case, one can >> always do JSON -> string and the resulting string is usable as a >> command line argument. >> >> > 2) The command line is a mapping of keys to these values. >> > >> > Which makes your 1 not true since json supports more ;) >> > >> > As such, it seems quite natural to represent the whole set as a single >> > JSON map, rather than using a different, custom encoding for the top >> > level (whose custom escaping would have to be carried into the inner >> > JSON values). Note that JSON has the advantage that one never needs to >> > explain or define it, and parsers/serializers already exists for all >> > languages (e.g. if one has a driver script in another language for >> > launching a java pipeline, it's easy to communicate all the args). >> > >> > Same reasonning applies to CLI AFAIK >> >> The spec of what a valid command line argument list is is surprisingly >> inconsistent across platforms, languages, and programs. And your >> proposal seems to be getting into what the delimiter is, and how to >> escape it, and possibly then how to escape the escape character. All >> of this is sidestepped by pointing at an existing spec. >> >> > We can't get rid of Jackson in the core because of (1) so there's >> > little value in adding complexity to remove it from (2). The fact that >> > Java doesn't ship anything in its expansive standard library for this >> > is unfortuante, so we have to take a dependency on something. >> > >> > We actually can as shown before >> >> How, if JSON is integral to the parsing of the argument values >> themselves? (Or is the argument that it's not, but each extender of >> PipelineOptions is responsible for choosing a JSON parser and doing it >> themselves.) >> >> > and jackson is not a simple negligible lib >> > but a classpath breaker :( >> >> Perhaps there are better libraries we could use instead? Is the >> ecosystem so bad that it encourages projects to roll their own of >> everything rather than share code? >> >> > On Tue, Jan 9, 2018 at 10:00 AM, Lukasz Cwik <lc...@google.com> wrote: >> >> Ah, you don't want JSON within JSON. I see, if thats the case just >> migrate >> >> all of them to string tokenization and drop the Jackson usage for >> string >> >> -> >> >> string[] conversion. >> >> >> >> On Mon, Jan 8, 2018 at 10:06 PM, Romain Manni-Bucau >> >> <rmannibu...@gmail.com> >> >> wrote: >> >>> >> >>> Lukasz, the use case is to znsure the config used can still map the >> CLI >> >>> and that options dont start to abuse json so it is exactly the >> opposite >> >>> of >> >>> "we can be fancy" and is closer to "we can be robust". Also the >> default >> >>> should be easy and not json (i just want to set the runner, >> --runner=xxx >> >>> is >> >>> easier than a json version). If the value doesnt start with a [ we can >> >>> use >> >>> the tokenizer else jackson, wdyt? >> >>> >> >>> >> >>> Le 8 janv. 2018 22:59, "Lukasz Cwik" <lc...@google.com> a écrit : >> >>> >> >>> Now that I think about this more. I looked at some of the examples in >> the >> >>> pom.xml and they don't seem to be tricky to write in JSON. >> >>> I also looked at the Jenkins job configurations (specifically the >> >>> performance tests) and they pass around maps which they convert to the >> >>> required format without needing a user to write it themselves. >> >>> Using gradle, we will be able to trivially to do the same thing >> (convert >> >>> maps to json without needing the person to write it by hand) like >> groovy >> >>> does. >> >>> Since we are migrating away from Maven it doesn't seem worthwhile to >> >>> spend >> >>> time on to make it easier to write the args in the Maven poms. >> >>> >> >>> Is there another use case that is being missed? >> >>> >> >>> On Mon, Jan 8, 2018 at 1:38 PM, Romain Manni-Bucau >> >>> <rmannibu...@gmail.com> >> >>> wrote: >> >>>> >> >>>> Good point for \t and ,. >> >>>> >> >>>> Any objection to use jackson as a fallback for that purpose - for >> >>>> backwqrd compat only - and make it optional then? Will create the >> ticket >> >>>> if >> >>>> not. >> >>>> >> >>>> Le 8 janv. 2018 20:32, "Robert Bradshaw" <rober...@google.com> a >> écrit : >> >>>>> >> >>>>> Part of the motivation to use JSON for more complex options was that >> >>>>> it avoids having to define (and document, test, have users learn, >> ...) >> >>>>> yet another format for expressing lists, maps, etc. >> >>>>> >> >>>>> On Mon, Jan 8, 2018 at 11:19 AM, Lukasz Cwik <lc...@google.com> >> wrote: >> >>>>> > Ken, this is specifically about running integration tests and not >> a >> >>>>> > users >> >>>>> > main(). >> >>>>> > >> >>>>> > Note, that PipelineOptions JSON format was used because it was a >> >>>>> > convenient >> >>>>> > serialized form that is easy to explain to people what is >> required. >> >>>>> > Using a different string tokenizer and calling >> >>>>> > PipelineOptionsFactory.fromArgs() with the parsed strings seems >> like >> >>>>> > it >> >>>>> > would be better. >> >>>>> > >> >>>>> > These are the supported formats for fromArgs(): >> >>>>> > * --project=MyProject (simple property, will set the >> "project" >> >>>>> > property >> >>>>> > to "MyProject") >> >>>>> > * --readOnly=true (for boolean properties, will set the >> >>>>> > "readOnly" >> >>>>> > property to "true") >> >>>>> > * --readOnly (shorthand for boolean properties, will set the >> >>>>> > "readOnly" >> >>>>> > property to "true") >> >>>>> > * --x=1 --x=2 --x=3 (list style simple property, will set the >> >>>>> > "x" >> >>>>> > property to [1, 2, 3]) >> >>>>> > * --x=1,2,3 (shorthand list style simple property, will set >> the >> >>>>> > "x" >> >>>>> > property to [1, 2, 3]) >> >>>>> > * --complexObject='{"key1":"value1",...} (JSON format for >> all >> >>>>> > other >> >>>>> > complex types) >> >>>>> > >> >>>>> > Using a string tokenizer that minimizes the number of required >> escape >> >>>>> > characters would be good so we could use newline characters as our >> >>>>> > only >> >>>>> > token. I would avoid ',\t ' as tokens since they are more likely >> to >> >>>>> > appear. >> >>>>> > >> >>>>> > On Mon, Jan 8, 2018 at 10:33 AM, Kenneth Knowles <k...@google.com> >> >>>>> > wrote: >> >>>>> >> >> >>>>> >> We do have a plain command line syntax, and whoever writes the >> >>>>> >> main(String[]) function is responsible for invoking the parser. >> It >> >>>>> >> isn't >> >>>>> >> quite as nice as standard arg parse libraries, but it isn't too >> bad. >> >>>>> >> It >> >>>>> >> would be great to improve, though. >> >>>>> >> >> >>>>> >> Jackson is for machine-to-machine communication or other >> situations >> >>>>> >> where >> >>>>> >> command line parsing doesn't work so well. >> >>>>> >> >> >>>>> >> Are we using these some other way? >> >>>>> >> >> >>>>> >> Kenn >> >>>>> >> >> >>>>> >> On Sun, Jan 7, 2018 at 7:21 AM, Romain Manni-Bucau >> >>>>> >> <rmannibu...@gmail.com> >> >>>>> >> wrote: >> >>>>> >>> >> >>>>> >>> >> >>>>> >>> >> >>>>> >>> Le 7 janv. 2018 12:53, "Jean-Baptiste Onofré" <j...@nanthrax.net> >> a >> >>>>> >>> écrit : >> >>>>> >>> >> >>>>> >>> Hi Romain, >> >>>>> >>> >> >>>>> >>> I guess you are assuming that pipeline options are flat command >> >>>>> >>> line >> >>>>> >>> like >> >>>>> >>> argument, right ? >> >>>>> >>> >> >>>>> >>> Actually, theoretically, pipeline options can be represented as >> >>>>> >>> json, >> >>>>> >>> that's why we use jackson. >> >>>>> >>> The pipeline options can be serialized/deserialized as json. >> >>>>> >>> >> >>>>> >>> >> >>>>> >>> Yes but if users (or dev ;)) start to use that then it is >> trivial >> >>>>> >>> to >> >>>>> >>> break the cli handling and fromArgs parsing or, if not, break >> the >> >>>>> >>> user >> >>>>> >>> experience. So at the end it is a kind of "yes but no", right? >> >>>>> >>> >> >>>>> >>> PS: already see some advanced users having a headache trying to >> >>>>> >>> pass >> >>>>> >>> pipeline options in json so using the plain command line syntax >> can >> >>>>> >>> be more >> >>>>> >>> friendly too. >> >>>>> >>> >> >>>>> >>> >> >>>>> >>> The purpose is to remove the jackson dependencies ? >> >>>>> >>> >> >>>>> >>> >> >>>>> >>> Later yes, seeing core dep tree I identify a lot of dep which >> can >> >>>>> >>> conflict in some env and are not really needed or bring much >> being >> >>>>> >>> in the >> >>>>> >>> core - like avro as mentionned in another thread. Can need a >> >>>>> >>> sanitizing >> >>>>> >>> round. Short term it was really a "why is it that complicated" >> ;). >> >>>>> >>> >> >>>>> >>> >> >>>>> >>> >> >>>>> >>> Regards >> >>>>> >>> JB >> >>>>> >>> >> >>>>> >>> >> >>>>> >>> On 01/07/2018 11:38 AM, Romain Manni-Bucau wrote: >> >>>>> >>>> >> >>>>> >>>> Hi guys, >> >>>>> >>>> >> >>>>> >>>> not sure i fully get why jackson is used to parse pipeline >> options >> >>>>> >>>> in >> >>>>> >>>> the testing integration >> >>>>> >>>> >> >>>>> >>>> why not using a token parser like [1] which allows to map 1-1 >> with >> >>>>> >>>> the >> >>>>> >>>> user interface (command line) the options? >> >>>>> >>>> >> >>>>> >>>> [1] >> >>>>> >>>> >> >>>>> >>>> >> >>>>> >>>> https://github.com/Talend/component-runtime/blob/master/comp >> onent-server/src/main/java/org/talend/sdk/component/serve >> r/lang/StringPropertiesTokenizer.java >> >>>>> >>>> >> >>>>> >>>> Romain Manni-Bucau >> >>>>> >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >> >>>>> >>>> <https://rmannibucau.metawerx.net/> | Old Blog >> >>>>> >>>> <http://rmannibucau.wordpress.com> | Github >> >>>>> >>>> <https://github.com/rmannibucau> >> >>>>> >>>> | LinkedIn <https://www.linkedin.com/in/rmannibucau> >> >>>>> >>> >> >>>>> >>> >> >>>>> >>> -- >> >>>>> >>> Jean-Baptiste Onofré >> >>>>> >>> jbono...@apache.org >> >>>>> >>> http://blog.nanthrax.net >> >>>>> >>> Talend - http://www.talend.com >> >>>>> >>> >> >>>>> >>> >> >>>>> >> >> >>>>> > >> >>> >> >>> >> >>> >> >> >> > >> > >> >