[
https://issues.apache.org/jira/browse/FLINK-29267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603589#comment-17603589
]
Timo Walther commented on FLINK-29267:
--------------------------------------
From Henrik Feldt
(https://apache-flink.slack.com/archives/C03G7LJTS2G/p1662912163987969):
{code}
Remember to spec for complex types, too, like geometry, just so there's an
escape hatch.
Maybe instead of "external", name the value "connector native data type".
And for pg you could also consider what pg extensions need loading before
these types are available.
And instead of "flink field name", consider naming it "flink table catalog
column name" or whichever is more specific.
You also have this: https://www.postgresql.org/docs/current/sql-createtype.html
Finally, it might be useful to make the serialiser/deserialiser work on byte
arrays instead of strings; otherwise you need to bring in localisation and
string representation for certain types.
I ended up creating a pg view as a workaround.
Also, for JDBC you don't need to go as far as types. You can just let the user
cast the column in pg-specific code:
.column("app_id", DataTypes.STRING().notNull(), "CAST(id AS text)")
as the third parameter; then you can just push that through the config.
Another variant for JDBC: just open up an escape hatch:
https://stackoverflow.com/questions/56265904/reading-uuid-from-result-set-in-postgres-jdbc
https://crafted-software.blogspot.com/2013/03/uuid-values-from-jdbc-to-postgres.html
So basically, let the user type a column as DataTypes.OBJECT<UUID>() with a
converter like x -> (UUID) x.getObject("column_name").
{code}
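For context, the escape hatch from the two links above looks roughly like this
on plain JDBC (a minimal sketch; the connection URL, table, and column names
are made up, but the Postgres driver does return java.util.UUID from
getObject for uuid columns):
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.UUID;

public class UuidEscapeHatch {
    public static void main(String[] args) throws Exception {
        try (Connection conn =
                     DriverManager.getConnection("jdbc:postgresql://localhost:5432/mydb");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id FROM my_table")) {
            while (rs.next()) {
                // Postgres uuid columns materialize as java.util.UUID, so a
                // getObject + cast avoids any string round trip.
                UUID id = (UUID) rs.getObject("id");
                System.out.println(id);
            }
        }
    }
}
{code}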
> Support external type systems in DDL
> ------------------------------------
>
> Key: FLINK-29267
> URL: https://issues.apache.org/jira/browse/FLINK-29267
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / JDBC, Formats (JSON, Avro, Parquet, ORC,
> SequenceFile), Table SQL / Ecosystem
> Reporter: Timo Walther
> Assignee: Timo Walther
> Priority: Major
>
> Many connectors and formats require supporting external data types. Postgres
> users request UUID support, Avro users require enum support, etc.
> FLINK-19869 implemented support for Postgres UUIDs poorly and even impacts
> the performance of regular strings.
> The long-term solution should be user-defined types in Flink. This is,
> however, a bigger effort that requires a FLIP and a significant amount of
> resources.
> As a mid-term solution, we should offer a consistent approach based on DDL
> options that allows defining a mapping from the Flink type system to the
> external type system. I suggest the following:
> {code}
> CREATE TABLE MyTable (
>   ...
> ) WITH (
>   'mapping.data-types' = '<Flink field name>: <External field data type>'
> )
> {code}
> The option maps each Flink column to its data type in the external system.
> The external data type should be expressible as a string, which works for
> most connectors and formats (e.g. an Avro schema string).
> Examples:
> {code}
> CREATE TABLE MyTable (
>   regular_col STRING,
>   uuid_col STRING,
>   point_col ARRAY<DOUBLE>,
>   box_col ARRAY<ARRAY<DOUBLE>>
> ) WITH (
>   'mapping.data-types' = 'uuid_col: uuid, point_col: point, box_col: box'
> )
> {code}
> We provide a table of supported mapping data types. E.g. the {{point}} type
> is always mapped to {{ARRAY<DOUBLE>}}. In general, we choose the Flink data
> type that comes closest to the required functionality.
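> A sketch of what such a conversion could look like inside the JDBC connector
> (the {{org.postgresql.geometric.PGpoint}} class and its public {{x}}/{{y}}
> fields come from the Postgres driver; the converter itself is hypothetical):
> {code}
> import org.postgresql.geometric.PGpoint;
>
> // Hypothetical converter applied by the JDBC connector when
> // 'mapping.data-types' declares 'point_col: point'.
> static double[] pointToFlinkArray(Object jdbcValue) {
>     // Postgres 'point' columns come back as PGpoint via ResultSet#getObject.
>     PGpoint p = (PGpoint) jdbcValue;
>     return new double[] {p.x, p.y}; // corresponds to Flink ARRAY<DOUBLE>
> }
> {code}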
> Future work:
> In theory, we can also offer a mapping of field names. It might be a
> requirement that a Flink column name differs from the name in the external
> system.
> {code}
> CREATE TABLE MyTable (
>   ...
> ) WITH (
>   'mapping.names' = '<Flink field name>: <External field name>'
> )
> {code}
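> Both options share the same {{'<key>: <value>, ...'}} syntax, so a connector
> could consume them with a single helper (a minimal sketch; the parsing rules
> are only implied by the examples above, and the helper is hypothetical):
> {code}
> import java.util.LinkedHashMap;
> import java.util.Map;
>
> // Parses e.g. "uuid_col: uuid, point_col: point" into an ordered map
> // from Flink column name to external type (or external name).
> static Map<String, String> parseMapping(String option) {
>     Map<String, String> mapping = new LinkedHashMap<>();
>     for (String entry : option.split(",")) {
>         String[] kv = entry.split(":", 2);
>         mapping.put(kv[0].trim(), kv[1].trim());
>     }
>     return mapping;
> }
> {code}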