[ 
https://issues.apache.org/jira/browse/FLINK-25962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Skraba updated FLINK-25962:
--------------------------------
    Description: 
Flink currently generates Avro schemas as records with the top-level name 
{{"record"}}.

Unfortunately, there is some inconsistency between Avro implementations in 
different languages that may prevent this record from being read, notably 
Python, which generates the error:
*avro.schema.SchemaParseException: record is a reserved type name*
(See this comment for the full stack trace).

The Java SDK accepts this name, and there's an [ongoing 
discussion|https://lists.apache.org/thread/0wmgyx6z69gy07lvj9ndko75752b8cn2] 
about what the expected behaviour should be.  This should be clarified and 
fixed in Avro, of course.

Regardless of the resolution, the best practice (which is used almost 
everywhere else in the Flink codebase) is to explicitly specify a top-level 
namespace for an Avro record. We should use a default like: 
{{org.apache.flink.avro.generated}}.
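
To illustrate why a namespace sidesteps the problem, here is a minimal sketch using only the Python standard library. It is not the Avro Python library's code: {{AVRO_TYPE_NAMES}}, {{full_name}} and {{collides_with_type_name}} are hypothetical names, the empty field list is a placeholder, and the sketch assumes the reserved-name check is applied to the namespace-qualified full name rather than to the bare name.

```python
import json

# Type names defined by the Avro specification (primitive and complex).
# Assumption: a parser that rejects "record" compares against a set like this.
AVRO_TYPE_NAMES = {
    "null", "boolean", "int", "long", "float", "double", "bytes", "string",
    "record", "enum", "array", "map", "fixed",
}


def full_name(schema: dict) -> str:
    """Return the namespace-qualified name of a named Avro schema."""
    name = schema["name"]
    namespace = schema.get("namespace", "")
    # A dotted name already carries its namespace, so the attribute is ignored.
    if "." in name or not namespace:
        return name
    return f"{namespace}.{name}"


def collides_with_type_name(schema: dict) -> bool:
    """True if the schema's full name is identical to an Avro type name."""
    return full_name(schema) in AVRO_TYPE_NAMES


# Schema shaped like the one described above: top-level name "record", no namespace.
bare = json.loads('{"type": "record", "name": "record", "fields": []}')

# The same schema with the proposed default namespace added.
qualified = dict(bare, namespace="org.apache.flink.avro.generated")

print(full_name(bare), collides_with_type_name(bare))
print(full_name(qualified), collides_with_type_name(qualified))
```

Under that assumption, only the schema without a namespace has a full name equal to a type name, so a default namespace would avoid the collision whichever way the Avro discussion is resolved.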

  was:
Flink currently generates Avro schemas as records with the top-level name 
{{"record"}}

Unfortunately, there is some inconsistency between Avro implementations in 
different languages that may prevent this record from being read, notably 
Python, which generates the error:
avro.schema.SchemaParseException: record is a reserved type name
(See this comment for the full stack trace).

The Java SDK accepts this name, and there's an [ongoing 
discussion|https://lists.apache.org/thread/0wmgyx6z69gy07lvj9ndko75752b8cn2] 
about what the expected behaviour should be.  This should be clarified and 
fixed in Avro, of course.

Regardless of the resolution, the best practice (which is used almost 
everywhere else in the Flink codebase) is to explicitly specify a top-level 
namespace for an Avro record.   We should use a default like: 
{{{}org.apache.flink.avro.generated{}}}.


> Flink generated Avro schemas can't be parsed using Python
> ---------------------------------------------------------
>
>                 Key: FLINK-25962
>                 URL: https://issues.apache.org/jira/browse/FLINK-25962
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.14.3
>            Reporter: Ryan Skraba
>            Priority: Major
>
> Flink currently generates Avro schemas as records with the top-level name 
> {{"record"}}.
> Unfortunately, there is some inconsistency between Avro implementations in 
> different languages that may prevent this record from being read, notably 
> Python, which generates the error:
> *avro.schema.SchemaParseException: record is a reserved type name*
> (See this comment for the full stack trace).
> The Java SDK accepts this name, and there's an [ongoing 
> discussion|https://lists.apache.org/thread/0wmgyx6z69gy07lvj9ndko75752b8cn2] 
> about what the expected behaviour should be.  This should be clarified and 
> fixed in Avro, of course.
> Regardless of the resolution, the best practice (which is used almost 
> everywhere else in the Flink codebase) is to explicitly specify a top-level 
> namespace for an Avro record. We should use a default like: 
> {{org.apache.flink.avro.generated}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
