Hi Dawid, I just want to mention one of your response,
> What you described with > 'format' = 'csv', > 'csv.allow-comments' = 'true', > 'csv.ignore-parse-errors' = 'true' > would not work though as the `format` prefix is mandatory in the sources as only the properties with format > will be passed to the format factory in majority of cases. We already have some implicit contracts. IIUC, in FLIP-95 and FLIP-122, the property key style are totally decided by connectors, not the framework. So I custom connector can define above properties, and extract the value of 'format', i.e. 'csv', to find the format factory. And extract the properties with `csv.` prefix and remove the prefix, and pass the properties (e.g. 'allow-comments' = 'true') into the format factory to create format. So there is no a strict guarantee to have a "nested JSON style" properties. Users can still develop a custom connector with this un-hierarchy properties and works well. 'format' = 'json', 'format.fail-on-missing-field' = 'false' Best, Jark On Thu, 30 Apr 2020 at 14:29, Dawid Wysakowicz <dwysakow...@apache.org> wrote: > Hi all, > > I'd like to start with a comment that I am ok with the current state of > the FLIP-122 if there is a strong preference for it. Nevertheless I still > like the idea of adding `type` to the `format` to have it as `format.type` > = `json`. > > I wanted to clarify a few things though: > > @Jingsong As far as I see it most of the users copy/paste the properties > from the documentation to the SQL, so I don't think additional four > characters are too cumbersome. Plus if you force the additional suffix onto > all the options of a format you introduce way more boilerplate than if we > added the `type/kind/name` > > @Kurt I agree that we cannot force it, but I think it is more of a > question to set standards/implicit contracts on the properties. What you > described with > 'format' = 'csv', > 'csv.allow-comments' = 'true', > 'csv.ignore-parse-errors' = 'true' > > would not work though as the `format` prefix is mandatory in the sources > as only the properties with format will be passed to the format factory in > majority of cases. We already have some implicit contracts. > > @Forward I did not necessarily get the example. Aren't json and bson two > separate formats? Do you mean you can have those two at the same time? Why > do you need to differentiate the options for each? The way I see it is: > > ‘format(.name)' = 'json', > ‘format.fail-on-missing-field' = 'false' > > or > > ‘format(.name)' = 'bson', > ‘format.fail-on-missing-field' = 'false' > > @Benchao I'd be fine with any of name, kind, type(this we already had in > the past) > > Best, > Dawid > > On 30/04/2020 04:17, Forward Xu wrote: > > Here I have a little doubt. At present, our json only supports the > conventional json format. If we need to implement json with bson, json with > avro, etc., how should we express it? > Do you need like the following: > > ‘format.name' = 'json', > > ‘format.json.fail-on-missing-field' = 'false' > > > ‘format.name' = 'bson', > > ‘format.bson.fail-on-missing-field' = ‘false' > > > Best, > > Forward > > Benchao Li <libenc...@gmail.com> <libenc...@gmail.com> 于2020年4月30日周四 上午9:58写道: > > > Thanks Timo for staring the discussion. > > Generally I like the idea to keep the config align with a standard like > json/yaml. > > From the user's perspective, I don't use table configs from a config file > like yaml or json for now, > And it's ok to change it to yaml like style. Actually we didn't know that > this could be a yaml like > configuration hierarchy. If it has a hierarchy, we maybe consider that in > the future to load the > config from a yaml/json file. > > Regarding the name, > 'format.kind' looks fine to me. However there is another name from the top > of my head: > 'format.name', WDYT? > > Dawid Wysakowicz <dwysakow...@apache.org> <dwysakow...@apache.org> > 于2020年4月29日周三 下午11:56写道: > > > Hi all, > > I also wanted to share my opinion. > > When talking about a ConfigOption hierarchy we use for configuring Flink > cluster I would be a strong advocate for keeping a yaml/hocon/json/... > compatible style. Those options are primarily read from a file and thus > should at least try to follow common practices for nested formats if we > ever decide to switch to one. > > Here the question is about the properties we use in SQL statements. The > origin/destination of these usually will be external catalog, usually in > > a > > flattened(key/value) representation so I agree it is not as important as > > in > > the aforementioned case. Nevertheless having a yaml based catalog or > > being > > able to have e.g. yaml based snapshots of a catalog in my opinion is > appealing. At the same time cost of being able to have a nice > yaml/hocon/json representation is just adding a single suffix to a > single(at most 2 key + value) property. The question is between `format` > > = > > `json` vs `format.kind` = `json`. That said I'd be slighty in favor of > doing it. > > Just to have a full picture. Both cases can be represented in yaml, but > the difference is significant: > format: 'json' > format.option: 'value' > > vs > format: > kind: 'json' > > option: 'value' > > Best, > Dawid > > On 29/04/2020 17:13, Flavio Pompermaier wrote: > > Personally I don't have any preference here. Compliance wih standard > > YAML > > parser is probably more important > > On Wed, Apr 29, 2020 at 5:10 PM Jark Wu <imj...@gmail.com> <imj...@gmail.com> > wrote: > > > From a user's perspective, I prefer the shorter one "format=json", > > because > > it's more concise and straightforward. The "kind" is redundant for > > users. > > Is there a real case requires to represent the configuration in JSON > style? > As far as I can see, I don't see such requirement, and everything works > fine by now. > > So I'm in favor of "format=json". But if the community insist to follow > code style on this, I'm also fine with the longer one. > > Btw, I also CC user mailing list to listen more user's feedback. > > Because I > > think this is relative to usability. > > Best, > Jark > > On Wed, 29 Apr 2020 at 22:09, Chesnay Schepler <ches...@apache.org> > <ches...@apache.org> > wrote: > > > > Therefore, should we advocate instead: > > > > 'format.kind' = 'json', > > 'format.fail-on-missing-field' = 'false' > > Yes. That's pretty much it. > > This is reasonable important to nail down as with such violations I > believe we could not actually switch to a standard YAML parser. > > On 29/04/2020 16:05, Timo Walther wrote: > > Hi everyone, > > discussions around ConfigOption seem to be very popular recently. > > So I > > would also like to get some opinions on a different topic. > > How do we represent hierarchies in ConfigOption? In FLIP-122, we > agreed on the following DDL syntax: > > CREATE TABLE fs_table ( > ... > ) WITH ( > 'connector' = 'filesystem', > 'path' = 'file:///path/to/whatever', > 'format' = 'csv', > 'format.allow-comments' = 'true', > 'format.ignore-parse-errors' = 'true' > ); > > Of course this is slightly different from regular Flink core > configuration but a connector still needs to be configured based on > these options. > > However, I think this FLIP violates our code style guidelines > > because > > 'format' = 'json', > 'format.fail-on-missing-field' = 'false' > > is an invalid hierarchy. `format` cannot be a string and a top-level > object at the same time. > > We have similar problems in our runtime configuration: > > state.backend= > state.backend.incremental= > restart-strategy= > restart-strategy.fixed-delay.delay= > high-availability= > high-availability.cluster-id= > > The code style guide states "Think of the configuration as nested > objects (JSON style)". So such hierarchies cannot be represented in > > a > > nested JSON style. > > Therefore, should we advocate instead: > > 'format.kind' = 'json', > 'format.fail-on-missing-field' = 'false' > > What do you think? > > Thanks, > Timo > > [1] > > > https://flink.apache.org/contributing/code-style-and-quality-components.html#configuration-changes > > -- > > Benchao Li > School of Electronics Engineering and Computer Science, Peking > UniversityTel:+86-15650713730 > Email: libenc...@gmail.com; libenc...@pku.edu.cn > >