I think Arvid has a good point. Why not define the object type without a class and, when you read it in Table API, try to cast it to some class? I found https://docs.oracle.com/javase/1.5.0/docs/guide/jdbc/getstart/mapping.html. In the `JAVA_OBJECT` type section, they have:
```
ResultSet rs = stmt.executeQuery("SELECT ENGINEERS FROM PERSONNEL");
while (rs.next()) {
    Engineer eng = (Engineer) rs.getObject("ENGINEERS");
    System.out.println(eng.lastName + ", " + eng.firstName);
}
```

For us, how about adding a `getFieldAs(int pos, Class<T> clazz)` method to the Row type? Your example:

```
TableEnvironment env = ...
Table t = env.sqlQuery("SELECT OBJECT_OF('com.example.User', 'name', 'Bob', 'age', 42)");

// Tries to resolve `com.example.User` in the classpath, if not present returns `Row`
t.execute().collect();
```

would become:

```
TableEnvironment env = ...
Table t = env.sqlQuery("SELECT OBJECT_OF('name', 'Bob', 'age', 42)");

// No class name in SQL; try to cast the field to `User` when reading it in Table API
t.execute().collect().forEachRemaining(row -> {
    User user = row.getFieldAs(0, User.class);
});
```

For Arvid's question: "However, at that point, why do we actually need anything beyond ROW?" Maybe the difference is that the Row type shouldn't support being cast to a user-defined class, but `StructuredType` can.

Thanks,
Hao

On Wed, Apr 23, 2025 at 2:04 AM Arvid Heise <ahe...@confluent.io.invalid> wrote:

> Hi Timo,
>
> thanks for addressing my points. I'm not set on using STRUCT et al. but
> wanted to point out the alternatives.
>
> Regarding the attached class name, I have similar confusion to Hao. I
> wonder if structured types shouldn't be anonymous by default in the sense
> that initially we don't attach a class name to it. As you pointed out, it
> has no real semantics in SQL and we can't validate it.
> Another thing to consider is that if one user creates a table through some
> means and another user wants to consume it, the second user may not have
> access to the class as is. But the user could easily create a compatible
> class on its own.
>
> Consequently, I'm thinking about getting rid of the type at all.
> Only on the edges, we can use conversion to the user types when users
> actually access the ROW:
> * Any table API access that wants to collect results (in your last example
> what is t.execute().collect(); returning? How does that work in the
> multi-user setup sketched above? Wouldn't it be easier that the consumer
> explicitly gives us the POJO type that it expects?)
> * Any DataStream conversion
> * Any UDF
>
> However, at that point, why do we actually need anything beyond ROW?
>
> Best,
>
> Arvid
>
> On Wed, Apr 23, 2025 at 8:52 AM Timo Walther <twal...@apache.org> wrote:
>
> > Hi Hao,
> >
> > 1. Can `StructuredType` be nested?
> >
> > Yes this is supported.
> >
> > 2. What's the main reason the class won't be enforced in SQL?
> >
> > SQL should not care about classes. Within the SQL ecosystem, the SQL
> > engine controls the data serialization and protocols. The SQL engine
> > will not load the class. Classes are a concept of a JVM or Python API
> > endpoint. This is also the reason why a SQL ARRAY<BIGINT> can be
> > represented as List<Long>, long[], Long[]. The latter are only concepts
> > in the target programming language and might look different in Python.
> >
> > Regards,
> > Timo
> >
> >
> > On 22.04.25 23:54, Hao Li wrote:
> > > Hi Timo,
> > >
> > > Thanks for the FLIP. +1 with a few questions:
> > >
> > > 1. Can `StructuredType` be nested? e.g. `STRUCTURED<'com.example.User',
> > > name STRING, age INT NOT NULL, address STRUCTURED<'com.example.address',
> > > street STRING, zip STRING>>`
> > >
> > > 2. What's the main reason the class won't be enforced in SQL? Since tables
> > > created in SQL can also be used in Table API, will it come as a surprise if
> > > it's working in SQL and then failing in Table API? What if
> > > `com.example.User` was not validated in SQL when creating table, then the
> > > class was created for something else with different fields and then in
> > > Table api, it's not compatible.
> > >
> > > Hao
> > >
> > > On Tue, Apr 22, 2025 at 9:39 AM Timo Walther <twal...@apache.org> wrote:
> > >
> > >> Hi Arvid, Hi Sergey,
> > >>
> > >> thanks for your feedback. I updated the FLIP accordingly but let me
> > >> answer your questions here as well:
> > >>
> > >> > Are we going to enforce that the name is a valid class name? What is
> > >> > happening if it's not a correct name?
> > >> > What are the implications of using a class that is not in the
> > >> > classpath in Table API? It looks to me that the name is metadata-only
> > >> > until we try to access the objects directly in Table/DataStream API.
> > >>
> > >> Names are not enforced or validated. They are pure metadata as mentioned
> > >> in Section 2.1. We fall back to Row as the conversion class if the name
> > >> cannot be resolved in the current classpath. So when staying in the SQL
> > >> ecosystem (i.e. not switching to Table API, DataStream API, or UDFs),
> > >> the class does not need to be present.
> > >>
> > >> > Should Expressions.objectOf(String, Object... kv); also have an
> > >> > overload where you can put in the StructuredType in case where
> > >> > the class is not in the CP?
> > >>
> > >> That makes a lot of sense. I added a DataTypes.STRUCTURED(String,
> > >> Field...) method and an Expressions.objectOf(String, Object...).
> > >>
> > >> > What is the expected outcome of supplying fewer keys than defined
> > >> > in the structured type? Are we going to make use of nullability here?
> > >> > If so, *_INSERT and *_REMOVE may have some use.
> > >>
> > >> Currently, we go with the most conservative approach, which means that
> > >> all keys need to be present. Maybe we can reserve this feature for future
> > >> work and make the logic more lenient.
> > >>
> > >> > Talking about nullability: Is there some option to make the declared
> > >> > fields NOT NULL? If so, could you amend one example to show that?
> > >> > (Grammar?
> > >> > implies that it's not possible)
> > >>
> > >> NOT NULL is supported similar to ROW<i INT NOT NULL>. I adjusted one of
> > >> the examples.
> > >>
> > >> > One bigger concern is around the naming. For me, OBJECT is used for
> > >> > semi-structured types that are open. Your FLIP implies a closed design
> > >> > and that you want to add an open OBJECT later. I asked ChatGPT about
> > >> > other DB implementations and it seems like STRUCT is used more often
> > >> > (see below). So, I'd propose to call it STRUCT<...>, STRUCT_OF,
> > >> > structOf, UPDATE_STRUCT, and updateStruct respectively.
> > >>
> > >> Naming is hard. I was also torn between STRUCT, STRUCTURED, or OBJECT.
> > >> In Flink, the ROW type is rather our STRUCT type, because it works fully
> > >> position based. Structured types might be name-based in the future for
> > >> better schema evolution, so they rather model an OBJECT type. This was
> > >> my reason for choosing OBJECT_OF (typed to class name and fixed fields)
> > >> vs. OBJECT (semi-structured without fixed fields). Snowflake also uses
> > >> OBJECT(i INT) (for structured types) and OBJECT (for semi-structured
> > >> types).
> > >>
> > >> Also, both structured and semi-structured types can then share functions
> > >> such as UPDATE_OBJECT().
> > >>
> > >> What do others think?
> > >>
> > >> Thanks,
> > >> Timo
> > >>
> > >> On 22.04.25 12:08, Sergey Nuyanzin wrote:
> > >>> Thanks for driving this Timo
> > >>>
> > >>> The FLIP seems reasonable to me
> > >>>
> > >>> I have one minor question/clarification
> > >>> do I understand it correctly that after this FLIP we can execute
> > >>> `typeof` against result of `OBJECT_OF`
> > >>> for instance
> > >>> SELECT typeof(OBJECT_OF(
> > >>> 'com.example.User',
> > >>> 'name', 'Bob',
> > >>> 'age', 42
> > >>> ));
> > >>>
> > >>> should return `STRUCTURED<'com.example.User', name STRING, age INT>`
> > >>> ?
> > >>>
> > >>> On Tue, Apr 22, 2025 at 10:57 AM Timo Walther <twal...@apache.org> wrote:
> > >>>>
> > >>>> Hi everyone,
> > >>>>
> > >>>> I would like to ask again for feedback on this FLIP. It is a rather
> > >>>> small change but with big impact on usability for structured data.
> > >>>>
> > >>>> Are there any objections? Otherwise I would like to continue with voting
> > >>>> soon.
> > >>>>
> > >>>> Thanks,
> > >>>> Timo
> > >>>>
> > >>>> On 10.04.25 07:54, Timo Walther wrote:
> > >>>>> Hi everyone,
> > >>>>>
> > >>>>> I would like to start a discussion about FLIP-520: Simplify
> > >>>>> StructuredType handling [1].
> > >>>>>
> > >>>>> Flink SQL already supports structured types in the engine, serializers,
> > >>>>> UDFs, and connector interfaces. However, currently only Table API was
> > >>>>> able to make use of them. While UDFs can take objects as input and
> > >>>>> return types, it is actually quite inconvenient to use them in
> > >>>>> transformations.
> > >>>>>
> > >>>>> This FLIP fixes some immediate blockers in the use of structured types.
> > >>>>>
> > >>>>> Looking forward to feedback.
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Timo
> > >>>>>
> > >>>>>
> > >>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-520%3A+Simplify+StructuredType+handling
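
P.S. To make the `getFieldAs` idea from the top of this thread a bit more concrete, here is a rough, self-contained sketch of how a class-aware accessor could behave. Everything in it is an assumption for illustration only: `SketchRow` is a hypothetical stand-in for Flink's `Row` (not the real class), an anonymous structured value is modelled as a plain `Map` of field names to values, and the reflection-based mapping is just one possible conversion strategy, not what the FLIP specifies.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a Row; NOT Flink's org.apache.flink.types.Row.
final class SketchRow {
    private final Object[] fields;

    SketchRow(Object... fields) {
        this.fields = fields;
    }

    // Proposed accessor shape: the caller names the target class at read time.
    <T> T getFieldAs(int pos, Class<T> clazz) {
        Object value = fields[pos];
        if (value == null || clazz.isInstance(value)) {
            return clazz.cast(value); // plain cast when the value already fits
        }
        if (value instanceof Map) {
            // Assumption: an anonymous structured value is a name -> value map.
            return fromNamedFields((Map<?, ?>) value, clazz);
        }
        throw new ClassCastException(
                "Field " + pos + " cannot be converted to " + clazz.getName());
    }

    // Copies each named value into the same-named field of a new instance.
    private static <T> T fromNamedFields(Map<?, ?> named, Class<T> clazz) {
        try {
            Constructor<T> ctor = clazz.getDeclaredConstructor();
            ctor.setAccessible(true);
            T target = ctor.newInstance();
            for (Map.Entry<?, ?> entry : named.entrySet()) {
                Field field = clazz.getDeclaredField(String.valueOf(entry.getKey()));
                field.setAccessible(true);
                field.set(target, entry.getValue());
            }
            return target;
        } catch (ReflectiveOperationException e) {
            // Missing field or constructor: the class is not compatible.
            throw new IllegalArgumentException(
                    "Fields are not compatible with " + clazz.getName(), e);
        }
    }
}

// Hypothetical user class; any consumer could define a compatible one.
class User {
    String name;
    int age;
}

class GetFieldAsSketch {
    public static void main(String[] args) {
        Map<String, Object> structured = new LinkedHashMap<>();
        structured.put("name", "Bob");
        structured.put("age", 42);

        SketchRow row = new SketchRow(structured);
        User user = row.getFieldAs(0, User.class);
        System.out.println(user.name + ", " + user.age); // prints "Bob, 42"
    }
}
```

The point of the sketch is only that compatibility would be checked where the consumer supplies the class, which matches Arvid's multi-user scenario: a second user without access to the original class can pass their own compatible one.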