Hi, Guodong

> I am wondering if Flink SQL can/will support the flexible schema in the
> future,

It's an interesting topic. This feature is closer to the scope of schema
inference, which should arrive in the next few releases.

Best,
Leonard Xu

> for example, register the table without defining a specific schema for each
> field, to let the user define a generic map or array for one field, where the
> values of the map/array can be any object. Then the type conversion cost
> might be saved.
>
> Guodong
>
>
> On Thu, May 28, 2020 at 7:43 PM Benchao Li <libenc...@gmail.com> wrote:
> Hi Guodong,
>
> I think you've almost got the answer:
> 1. Map type: it does not work with the current implementation. For example,
> with map<varchar, varchar>, if the value is a non-string JSON object, then
> `JsonNode.asText()` may not work as you wish.
> 2. List all the fields you care about. IMO, this can fit your scenario. And
> you can set format.fail-on-missing-field = false, so that missing fields are
> set to null rather than causing a failure.
>
> For 1, I think maybe we can support it in the future, and I've created a
> Jira issue [1] to track this.
>
> [1] https://issues.apache.org/jira/browse/FLINK-18002
>
> Guodong Wang <wangg...@gmail.com> wrote on Thu, May 28, 2020 at 6:32 PM:
> Hi!
>
> I want to use Flink SQL to process some JSON events. It is quite challenging
> to define a schema for the Flink SQL table.
>
> My data source's format is JSON like this:
> {
>   "top_level_key1": "some value",
>   "nested_object": {
>     "nested_key1": "abc",
>     "nested_key2": 123,
>     "nested_key3": ["element1", "element2", "element3"]
>   }
> }
>
> The big challenges for me in defining a schema for the data source are:
> 1. The keys in nested_object are flexible; there might be 3 unique keys or
> more. If I enumerate all the keys in the schema, I think my code is fragile:
> how do I handle an event that contains more nested_keys in nested_object?
> 2. I know the Table API supports the Map type, but I am not sure if I can put
> a generic object as the value of the map, because the values in
> nested_object are of different types: some of them are int, some are string
> or array.
>
> So, how do I expose this kind of JSON data as a table in Flink SQL without
> enumerating all the nested_keys?
>
> Thanks.
>
> Guodong
>
>
> --
>
> Best,
> Benchao Li
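
For what it's worth, one possible workaround for the map<varchar, varchar> limitation discussed above is to normalize the events before they reach Flink, so that every value under nested_object is already a string. The sketch below is only an illustration in plain Python, not a Flink API: the function name `stringify_nested_values` is hypothetical, and it assumes you control a preprocessing step in front of the source. Non-string values are serialized to JSON text, so they can still be parsed back later if needed.

```python
import json


def stringify_nested_values(event: dict, field: str = "nested_object") -> dict:
    """Return a copy of `event` in which every value under `field` is a string.

    Hypothetical helper for illustration only: strings are kept as-is, and
    any other value (number, list, object, ...) is serialized to JSON text.
    """
    out = dict(event)  # shallow copy, the input event is not modified
    nested = event.get(field) or {}
    out[field] = {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in nested.items()
    }
    return out


# The sample event from the original question.
event = {
    "top_level_key1": "some value",
    "nested_object": {
        "nested_key1": "abc",
        "nested_key2": 123,
        "nested_key3": ["element1", "element2", "element3"],
    },
}

flat = stringify_nested_values(event)
print(json.dumps(flat, indent=2))
```

After this step, every entry in nested_object is a string-to-string pair, so new nested keys no longer require a schema change. The trade-off is that numeric and array values need to be parsed again downstream.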