Hi David,

Thanks for the feedback!

1. I think both approaches can express the same semantics. I am just
following the API design of `RowData`, where one method checks for null
and the getters return primitive types.

2. It is indeed confusing, as the words Object and Map are used
interchangeably in the FLIP. An object-typed Variant is the same as a Map
from key to Variant. Because SQL types have no notion of an object, MAP
makes more sense for describing the Variant type.

3, 4. I am not sure I understand your question. What do you mean by
"json_object would be returned"? I don't think we have a json_object type.
If I understand correctly, JSON_OBJECT just returns a JSON string. It
wouldn't make much sense for PARSE_JSON to accept a JSON string and return
the same JSON string.

Best,
Xuannan

On Fri, Apr 25, 2025 at 10:14 PM David Radley <david_rad...@uk.ibm.com> wrote:
>
> Hi Xuannan,
> This looks like a good addition.
>
>
> 1. I was wondering whether it is possible to have a type, but the value be
> null – for example a null value in a Float type and tolerate nulls being
> returned for float getFloat(). If so then maybe we should return an object
> Float instead.
> 2. You mention maps in the FLIP text but do not have it as a type. I
> wondered what your thinking is.
> 3. In the new functions PARSE_JSON and TRY_PARSE_JSON, the text says they
> parse to a variant. As we support JSON_OBJECT as well, there could be an
> expectation that json_object would be the expected return type. Maybe we
> could allow the user to choose what gets returned?
> 4. Can variants be turned into json_objects and vice versa?
>
> Kind regards, David.
>
> From: Xuannan Su <suxuanna...@gmail.com>
> Date: Friday, 25 April 2025 at 12:47
> To: dev@flink.apache.org <dev@flink.apache.org>
> Subject: [EXTERNAL] Re: [DISCUSS] FLIP-521: Integrating Variant Type into
> Flink: Enabling Efficient Semi-Structured Data Processing
> Hi everyone,
>
> Thank you for all the comments!
> If there are no further comments, I'd like to close the discussion and
> start the voting next Monday.
>
> Best,
> Xuannan
>
> On Fri, Apr 25, 2025 at 7:41 PM Lincoln Lee <lincoln.8...@gmail.com> wrote:
> >
> > +1 for this FLIP. VARIANT type support will be a great addition to SQL.
> > Look forward to the detailed design of the subsequent shredding
> > optimizations.
> >
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Timo Walther <twal...@apache.org> wrote on Tue, Apr 22, 2025 at 16:51:
> >
> > > +1 for this feature. Having a VARIANT type makes a lot of sense and
> > > together with an OBJECT type will make semi-structured data processing
> > > in Flink easier.
> > >
> > > Currently, I'm catching up with notifications after the Easter holidays,
> > > but happy to give some feedback by tomorrow or Thursday as well.
> > >
> > > Thanks,
> > > Timo
> > >
> > > On 22.04.25 10:40, Jingsong Li wrote:
> > > > Thanks Xuannan for driving this discussion.
> > > >
> > > > At present, communities such as Apache Iceberg, Delta, Spark, Parquet,
> > > > etc. are all designing and developing around Variant, and our Flink
> > > > support for Variant is very valuable.
> > > >
> > > > After a rough look at the design, there is no overall problem. It is
> > > > designed around Parquet's Variant standard, which is similar to the
> > > > overall design of Spark SQL.
> > > >
> > > > +1 for this.
> > > >
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Mon, Apr 14, 2025 at 6:12 PM Xuannan Su <suxuanna...@gmail.com> wrote:
> > > >>
> > > >> Hi devs,
> > > >>
> > > >> I’d like to start a discussion around FLIP-521: Integrating Variant
> > > >> Type into Flink: Enabling Efficient Semi-Structured Data
> > > >> Processing[1]. Working with semi-structured data has long been a
> > > >> foundational scenario of the Lakehouse. While JSON has traditionally
> > > >> served as the primary storage format for such data, its implementation
> > > >> as serialized strings introduces significant inefficiencies.
> > > >>
> > > >> In this FLIP, we integrate the Variant encoding, which is a compact
> > > >> binary representation of semi-structured data[2], to improve the
> > > >> performance of processing semi-structured data. As Paimon has
> > > >> supported the Variant type recently[3], this FLIP would allow Flink to
> > > >> further leverage Paimon's storage-layer optimizations, improving
> > > >> performance and resource utilization for semi-structured data
> > > >> pipelines.
> > > >>
> > > >> Best,
> > > >> Xuannan
> > > >>
> > > >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-521%3A+Integrating+Variant+Type+into+Flink%3A+Enabling+Efficient+Semi-Structured+Data+Processing
> > > >> [2] https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
> > > >> [3] https://github.com/apache/paimon/issues/4471
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: Building C, IBM Hursley Office, Hursley Park Road,
> Winchester, Hampshire SO21 2JN