Re: How to create schema for flexible json data in Flink SQL

2020-06-01 Thread Guodong Wang
Hi Jark, You totally got my point. Actually, the perfect solution in my opinion is to support schema evolution in one query. Although classic SQL needs to know the schema before doing any computation, when integrating a NoSQL data source into a Flink DataStream, if schema evolution is possible, it will s

Re: How to create schema for flexible json data in Flink SQL

2020-05-31 Thread Jark Wu
Hi all, This is an interesting topic. Schema inference is the next big feature, planned for the next release. I added a link to this thread to FLINK-16420. I think Guodong's case is schema evolution, which I think has something to do with schema inference. I don't have a clear idea for

Re: How to create schema for flexible json data in Flink SQL

2020-05-29 Thread Guodong Wang
Benchao, Thank you for your detailed explanation. Schema inference can solve my problem partially. For example, from some point in time, all subsequent JSON will contain a new field. I think schema inference will help in this case. But if I need to handle all the JSON events with different s

Re: How to create schema for flexible json data in Flink SQL

2020-05-28 Thread Benchao Li
Hi Guodong, After an offline discussion with Leonard, I think you have the right understanding of schema inference. But there are two cases here: 1. The schema of the data is fixed; schema inference can save you the effort of writing the schema explicitly. 2. The schema of the data is dynamic; in this case the sch

Re: How to create schema for flexible json data in Flink SQL

2020-05-28 Thread Guodong Wang
Yes. Setting the value type as raw is one possible approach. And I would like to vote for schema inference as well. Correct me if I am wrong: IMO, schema inference means I can provide a method in the table source to infer the data schema based on runtime computation. Just like some Calcite adapt
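The idea of inferring a schema from the data itself can be sketched in a few lines. This is plain Python using only the standard `json` module, not Flink or Calcite code; the function names, the type labels, and the rule that conflicting types fall back to STRING are all assumptions made for illustration.

```python
import json


def type_name(value):
    """Map a decoded JSON value to a coarse, SQL-like type label (labels are illustrative)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    if isinstance(value, list):
        return "ARRAY"
    if isinstance(value, dict):
        return "ROW"
    return "NULL"


def infer_schema(lines):
    """Union the top-level fields seen across sample events.

    Assumption: a field observed with conflicting types falls back to STRING.
    """
    schema = {}
    for line in lines:
        event = json.loads(line)
        for field, value in event.items():
            seen = type_name(value)
            previous = schema.get(field)
            if seen == "NULL":
                # A null value tells us the field exists but not its type.
                schema.setdefault(field, "NULL")
            elif previous is None or previous == "NULL":
                schema[field] = seen
            elif previous != seen:
                schema[field] = "STRING"
    return schema


samples = [
    '{"id": 1, "name": "a"}',
    '{"id": 2, "name": "b", "score": 0.5}',
    '{"id": "x3", "tags": ["t"]}',
]
print(infer_schema(samples))
```

A sketch like this only covers the case where the schema can be settled from samples up front; events that introduce new fields later would still need the inference to be rerun.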

Re: How to create schema for flexible json data in Flink SQL

2020-05-28 Thread Benchao Li
Hi Guodong, Does the RAW type meet your requirements? For example, you can specify a map type, where the value of the map is the raw JsonNode parsed by Jackson. This is not supported yet; however, IMO this could be supported. Guodong Wang wrote on Thu, May 28, 2020 at 9:43 PM: > Benchao, > > Thank you for your

Re: How to create schema for flexible json data in Flink SQL

2020-05-28 Thread Leonard Xu
Hi Guodong, > I am wondering if Flink SQL can/will support the flexible schema in the > future, It’s an interesting topic; this feature is closer to the scope of schema inference. Schema inference should come in the next few releases. Best, Leonard Xu > for example, register the tab

Re: How to create schema for flexible json data in Flink SQL

2020-05-28 Thread Guodong Wang
Benchao, Thank you for your quick reply. As you mentioned, for the current scenario, approach 2 should work for me. But it is a little bit annoying that I have to modify the schema to add new field types when the upstream app changes the JSON format or adds new fields. Otherwise, my user cannot refer to the fi

Re: How to create schema for flexible json data in Flink SQL

2020-05-28 Thread Benchao Li
Hi Guodong, I think you almost have the answer. 1. Map type: this does not work with the current implementation. For example, with a map, if the value is a non-string JSON object, then `JsonNode.asText()` may not work as you wish. 2. List all the fields you care about. IMO, this can fit your scenario. And you can set f
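The two approaches discussed above can be illustrated with a small sketch. It is plain Python with the standard `json` module, an analogy rather than Flink or Jackson code; the field names and both helper functions are hypothetical. The first helper mirrors "list all the fields you care about", with absent fields becoming null; the second shows one way a string-valued map could keep nested objects, by re-serializing them as JSON text.

```python
import json

# Hypothetical field names standing in for the columns declared in a table schema.
DECLARED_FIELDS = ["id", "name", "score"]


def project(line, fields=DECLARED_FIELDS):
    """Keep only the declared top-level fields; fields absent from the event become None."""
    event = json.loads(line)
    return {field: event.get(field) for field in fields}


def as_string_map(line):
    """Turn every top-level value into a string.

    Strings are kept as-is; any other value (number, object, array, null)
    is re-serialized as JSON text so that nested content is not lost.
    """
    event = json.loads(line)
    return {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in event.items()
    }


raw = '{"id": 1, "name": "a", "extra": {"k": 2}}'
print(project(raw))
print(as_string_map(raw))
```

With the projection, an undeclared field such as `extra` is dropped, which is why the schema has to be edited whenever users need a newly added field.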