I think Arvid has a good point. Why not define the object type without a
class and, when you get it in the Table API, try to cast it to some class? I found
https://docs.oracle.com/javase/1.5.0/docs/guide/jdbc/getstart/mapping.html.
Under the `JAVA_OBJECT` type section, they have:

```
ResultSet rs = stmt.executeQuery("SELECT ENGINEERS FROM PERSONNEL");
while (rs.next()) {
    Engineer eng = (Engineer) rs.getObject("ENGINEERS");
    System.out.println(eng.lastName + ", " + eng.firstName);
}
```

For us, how about adding a `getFieldAs(int pos, Class<T> clazz)` method to
the Row type? Your example:

```
TableEnvironment env = ...

Table t = env.sqlQuery(
    "SELECT OBJECT_OF('com.example.User', 'name', 'Bob', 'age', 42)");

// Tries to resolve `com.example.User` in the classpath, if not present
// returns `Row`
t.execute().collect();
```

would become:
```
TableEnvironment env = ...

Table t = env.sqlQuery("SELECT OBJECT_OF('name', 'Bob', 'age', 42)");

// No class name is attached in SQL; the caller supplies the target class.
// collect() returns an iterator of `Row`, so use forEachRemaining.
t.execute().collect().forEachRemaining(row -> {
    User user = row.getFieldAs(0, User.class);
});
```
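
To make the idea a bit more concrete, here is a minimal sketch of the kind of
conversion such a method could perform. Everything in it is an assumption for
illustration: `StructuredValue`, its `as(...)` method, and the `User` POJO are
hypothetical stand-ins and not Flink classes, and I'm assuming the target class
has a no-arg constructor and public fields whose names match the field names.

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a structured value: field names mapped to values.
// This is NOT Flink's Row; it only illustrates the conversion step.
class StructuredValue {
    final Map<String, Object> fields = new LinkedHashMap<>();

    StructuredValue with(String name, Object value) {
        fields.put(name, value);
        return this;
    }

    // Sketch of "cast on access": the consumer supplies the class it expects.
    <T> T as(Class<T> clazz) {
        try {
            T target = clazz.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, Object> e : fields.entrySet()) {
                // Requires a public field with the same name on the target class.
                Field f = clazz.getField(e.getKey());
                // Field.set unwraps boxed values (e.g. Integer to int) as needed.
                f.set(target, e.getValue());
            }
            return target;
        } catch (ReflectiveOperationException | IllegalArgumentException ex) {
            throw new IllegalArgumentException(
                "Class " + clazz.getName()
                    + " is not compatible with fields " + fields.keySet(), ex);
        }
    }
}

// Consumer-defined POJO; any class with matching public fields would do.
class User {
    public String name;
    public int age;
}

class CastOnAccessDemo {
    public static void main(String[] args) {
        StructuredValue value =
            new StructuredValue().with("name", "Bob").with("age", 42);
        User user = value.as(User.class);
        System.out.println(user.name + ", " + user.age);
    }
}
```

With this shape, a second user who does not have the original class could
define their own compatible class and pass that instead, and an incompatible
class fails at the point of access rather than at table creation.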

For Arvid's question: "However, at that point, why do we actually need
anything beyond ROW?"

Maybe the difference is that the Row type shouldn't support being cast to a
user-defined class, but `StructuredType` can.

Thanks,
Hao

On Wed, Apr 23, 2025 at 2:04 AM Arvid Heise <ahe...@confluent.io.invalid>
wrote:

> Hi Timo,
>
> thanks for addressing my points. I'm not set on using STRUCT et al. but
> wanted to point out the alternatives.
>
> Regarding the attached class name, I have similar confusion to Hao. I
> wonder if Structured types shouldn't be anonymous by default in the sense
> that initially we don't attach a class name to them. As you pointed out, it
> has no real semantics in SQL and we can't validate it.
> Another thing to consider is that if one user creates a table through some
> means and another user wants to consume it, the second user may not have
> access to the class as is. But the user could easily create a compatible
> class on their own.
>
> Consequently, I'm thinking about getting rid of the type altogether. Only on
> the edges, we can use conversion to the user types when users actually
> access the ROW:
> * Any table API access that wants to collect results (in your last example
> what is t.execute().collect(); returning? How does that work in the
> multi-user setup sketched above? Wouldn't it be easier that the consumer
> explicitly gives us the POJO type that it expects?)
> * Any DataStream conversion
> * Any UDF
>
> However, at that point, why do we actually need anything beyond ROW?
>
> Best,
>
> Arvid
>
> On Wed, Apr 23, 2025 at 8:52 AM Timo Walther <twal...@apache.org> wrote:
>
> > Hi Hao,
> >
> > 1. Can `StructuredType` be nested?
> >
> > Yes this is supported.
> >
> > 2. What's the main reason the class won't be enforced in SQL?
> >
> > SQL should not care about classes. Within the SQL ecosystem, the SQL
> > engine controls the data serialization and protocols. The SQL engine
> > will not load the class. Classes are a concept of a JVM or Python API
> > endpoint. This is also the reason why a SQL ARRAY<BIGINT> can be
> > represented as List<Long>, long[], Long[]. The latter are only concepts
> > in the target programming language and might look different in Python.
> >
> > Regards,
> > Timo
> >
> >
> > On 22.04.25 23:54, Hao Li wrote:
> > > Hi Timo,
> > >
> > > Thanks for the FLIP. +1 with a few questions:
> > >
> > > 1. Can `StructuredType` be nested? e.g. `STRUCTURED<'com.example.User',
> > > name STRING, age INT NOT NULL, address STRUCTURED<'com.example.address',
> > > street STRING, zip STRING>>`
> > >
> > > 2. What's the main reason the class won't be enforced in SQL? Since tables
> > > created in SQL can also be used in Table API, will it come as a surprise if
> > > it's working in SQL and then failing in Table API? What if
> > > `com.example.User` was not validated in SQL when creating the table, then the
> > > class was created for something else with different fields, and then in
> > > Table API it's not compatible?
> > >
> > > Hao
> > >
> > > On Tue, Apr 22, 2025 at 9:39 AM Timo Walther <twal...@apache.org> wrote:
> > >
> > >> Hi Arvid, Hi Sergey,
> > >>
> > >> thanks for your feedback. I updated the FLIP accordingly but let me
> > >> answer your questions here as well:
> > >>
> > >>   > Are we going to enforce that the name is a valid class name? What is
> > >>   > happening if it's not a correct name?
> > >>   > What are the implications of using a class that is not in the
> > >>   > classpath in Table API? It looks to me that the name is metadata-only
> > >>   > until we try to access the objects directly in Table/DataStream API.
> > >>
> > >> Names are not enforced or validated. They are pure metadata as mentioned
> > >> in Section 2.1. We fall back to Row as the conversion class if the name
> > >> cannot be resolved in the current classpath. So when staying in the SQL
> > >> ecosystem (i.e. not switching to Table API, DataStream API, or UDFs),
> > >> the class does not need to be present.
> > >>
> > >>   > Should Expressions.objectOf(String, Object... kv); also have an
> > >>   > overload where you can put in the StructuredType in case where
> > >>   > the class is not in the CP?
> > >>
> > >> That makes a lot of sense. I added a DataTypes.STRUCTURED(String,
> > >> Field...) method and an Expressions.objectOf(String, Object...).
> > >>
> > >>   > What is the expected outcome of supplying fewer keys than defined
> > >>   > in the structured type? Are we going to make use of nullability here?
> > >>   > If so, *_INSERT and *_REMOVE may have some use.
> > >>
> > >> Currently, we go with the most conservative approach, which means that
> > >> all keys need to be present. Maybe we can reserve this feature for future
> > >> work and make the logic more lenient.
> > >>
> > >>   > Talking about nullability: Is there some option to make the declared
> > >>   > fields NOT NULL? If so, could you amend one example to show that?
> > >>   > (Grammar? implies that it's not possible)
> > >>
> > >> NOT NULL is supported similar to ROW<i INT NOT NULL>. I adjusted one of
> > >> the examples.
> > >>
> > >>   > One bigger concern is around the naming. For me, OBJECT is used for
> > >>   > semi-structured types that are open. Your FLIP implies a closed design
> > >>   > and that you want to add an open OBJECT later. I asked ChatGPT about
> > >>   > other DB implementations and it seems like STRUCT is used more often
> > >>   > (see below). So, I'd propose to call it STRUCT<...>, STRUCT_OF,
> > >>   > structOf, UPDATE_STRUCT, and updateStruct respectively.
> > >>
> > >> Naming is hard. I was also torn between STRUCT, STRUCTURED, or OBJECT.
> > >> In Flink, the ROW type is rather our STRUCT type, because it works fully
> > >> position-based. Structured types might be name-based in the future for
> > >> better schema evolution, so they rather model an OBJECT type. This was
> > >> my reason for choosing OBJECT_OF (typed to class name and fixed fields)
> > >> vs. OBJECT (semi-structured without fixed fields). Snowflake also uses
> > >> OBJECT(i INT) (for structured types) and OBJECT (for semi-structured
> > >> types).
> > >>
> > >> Also, both structured and semi-structured types can then share functions
> > >> such as UPDATE_OBJECT().
> > >>
> > >> What do others think?
> > >>
> > >> Thanks,
> > >> Timo
> > >>
> > >> On 22.04.25 12:08, Sergey Nuyanzin wrote:
> > >>> Thanks for driving this Timo
> > >>>
> > >>> The FLIP seems reasonable to me
> > >>>
> > >>> I have one minor question/clarification
> > >>> do I understand correctly that after this FLIP we can execute
> > >>> `typeof` against the result of `OBJECT_OF`,
> > >>> for instance
> > >>> SELECT typeof(OBJECT_OF(
> > >>>     'com.example.User',
> > >>>     'name', 'Bob',
> > >>>     'age', 42
> > >>> ));
> > >>>
> > >>> should return `STRUCTURED<'com.example.User', name STRING, age INT>`
> > >>> ?
> > >>>
> > >>> On Tue, Apr 22, 2025 at 10:57 AM Timo Walther <twal...@apache.org> wrote:
> > >>>>
> > >>>> Hi everyone,
> > >>>>
> > >>>> I would like to ask again for feedback on this FLIP. It is a rather
> > >>>> small change but with big impact on usability for structured data.
> > >>>>
> > >>>> Are there any objections? Otherwise I would like to continue with voting
> > >>>> soon.
> > >>>>
> > >>>> Thanks,
> > >>>> Timo
> > >>>>
> > >>>> On 10.04.25 07:54, Timo Walther wrote:
> > >>>>> Hi everyone,
> > >>>>>
> > >>>>> I would like to start a discussion about FLIP-520: Simplify
> > >>>>> StructuredType handling [1].
> > >>>>>
> > >>>>> Flink SQL already supports structured types in the engine, serializers,
> > >>>>> UDFs, and connector interfaces. However, currently only Table API is
> > >>>>> able to make use of them. While UDFs can take objects as input and
> > >>>>> return types, it is actually quite inconvenient to use them in
> > >>>>> transformations.
> > >>>>>
> > >>>>> This FLIP fixes some immediate blockers in the use of structured types.
> > >>>>>
> > >>>>> Looking forward to feedback.
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Timo
> > >>>>>
> > >>>>>
> > >>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-520%3A+Simplify+StructuredType+handling
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>
> > >>
> > >
> >
> >
>
