[
https://issues.apache.org/jira/browse/FLINK-18096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126844#comment-17126844
]
Seth Wiesman edited comment on FLINK-18096 at 6/5/20, 2:57 PM:
---------------------------------------------------------------
[~jark] Assume you are creating a table that writes to a Kafka topic. Flink
currently generates an Avro schema with no namespace and the name row_0. It
would be nice if the outermost record type could have a user-specified name
and namespace. One issue is that you cannot generate a specific record class
from this schema that is usable from Java, because the class would land in the
default package, which cannot be imported from a named package. Users may also
have internal requirements for schemas to use specific namespaces and names,
such as including team names. So the DDL statement:
{code:sql}
CREATE TABLE orders (
  order_id STRING,
  amount DOUBLE
) WITH (
  'format' = 'avro',
  'avro-name' = 'orders',
  'avro-namespace' = 'org.mycompany'
);
{code}
would generate this Avro schema:
{code:json}
{
  "type": "record",
  "name": "orders",
  "namespace": "org.mycompany",
  "fields": [
    { "name": "order_id", "type": "string" },
    { "name": "amount", "type": "double" }
  ]
}
{code}
instead of the schema Flink generates today:
{code:json}
{
  "type": "record",
  "name": "row_0",
  "fields": [
    { "name": "order_id", "type": "string" },
    { "name": "amount", "type": "double" }
  ]
}
{code}
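To make the proposed behavior concrete, here is a minimal sketch of the derivation in Python. It is not Flink's implementation: the function derive_avro_schema and the SQL_TO_AVRO mapping are hypothetical, and 'avro-name' and 'avro-namespace' are the option keys proposed in this comment, not existing Flink options. When the options are absent, the sketch falls back to the current row_0 name with no namespace.

```python
import json

# Hypothetical mapping covering only the two SQL types used in the example.
SQL_TO_AVRO = {"STRING": "string", "DOUBLE": "double"}


def derive_avro_schema(columns, options):
    """Build an Avro record schema (as a dict) from (name, sql_type) columns.

    `options` mimics the WITH clause of the DDL. The keys 'avro-name' and
    'avro-namespace' are the ones proposed in this comment (assumption).
    """
    # Fall back to the name that is generated today when no name is given.
    schema = {"type": "record", "name": options.get("avro-name", "row_0")}

    # Only emit a namespace when the user asked for one.
    namespace = options.get("avro-namespace")
    if namespace:
        schema["namespace"] = namespace

    schema["fields"] = [
        {"name": name, "type": SQL_TO_AVRO[sql_type]}
        for name, sql_type in columns
    ]
    return schema


columns = [("order_id", "STRING"), ("amount", "DOUBLE")]

with_options = derive_avro_schema(
    columns,
    {
        "format": "avro",
        "avro-name": "orders",
        "avro-namespace": "org.mycompany",
    },
)
without_options = derive_avro_schema(columns, {"format": "avro"})

print(json.dumps(with_options, indent=2))
print(json.dumps(without_options, indent=2))
```

The first call yields the named, namespaced record shown above; the second reproduces the current row_0 schema, so existing tables that do not set the options would be unaffected.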
> Generated avro formats should support user specified name and namespace
> -----------------------------------------------------------------------
>
> Key: FLINK-18096
> URL: https://issues.apache.org/jira/browse/FLINK-18096
> Project: Flink
> Issue Type: Improvement
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Reporter: Seth Wiesman
> Priority: Major
>
> When an Avro schema is derived automatically, it should still be possible to
> specify its name and namespace.