[
https://issues.apache.org/jira/browse/AVRO-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089535#comment-17089535
]
Elliot West commented on AVRO-248:
----------------------------------
I see this issue is quite old, but I am wondering if there would be any
interest in adding this to the specification and implementing it? Specifically,
I'm thinking about this kind of construct as described previously by [~cutting]:
{code:java}
{
"type": "union",
"name": "Foo",
"branches": [
"string",
"Bar",
...
]
}{code}
The reason I ask is that I believe that there are new use-cases that could
greatly benefit from this feature, specifically those that currently require
[multi-typed streams in
Kafka|https://www.confluent.io/blog/put-several-event-types-kafka-topic/] or
indeed any streaming platform. There is already [an alternative implementation
of this
functionality|https://github.com/confluentinc/schema-registry/pull/680#issuecomment-511796090],
but it sits outside of Avro and is, in my opinion, a sub-optimal workaround
with [a number of significant
issues|https://github.com/confluentinc/schema-registry/pull/680#issuecomment-511796090].
I would suggest that by implementing this feature in Avro, we can fully satisfy
multi-typed stream use-cases in a clean, simple, and elegant manner, without
needing to build out external implementations that attempt to work around this
absent Avro feature.
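To make the idea concrete, here is a minimal hand-written sketch of the kind of class that code generation might produce for a named union Foo with branches string and int, following the shape given in the issue description. Everything in it is an assumption for illustration: the factory methods ofString and ofInt, the helper class FooDemo, and the choice of IllegalStateException are hypothetical and are not part of any Avro API.

```java
// Hypothetical equivalent of a generated class for a named union "Foo"
// with branches string and int. Illustration only, not an Avro API.
final class Foo {
    // One constant per union branch, so callers can switch on the branch.
    enum Type { STRING, INT }

    private final Type type;
    private final Object datum;

    private Foo(Type type, Object datum) {
        this.type = type;
        this.datum = datum;
    }

    // Assumed factory methods; the issue description uses setters instead.
    static Foo ofString(String s) { return new Foo(Type.STRING, s); }
    static Foo ofInt(int i) { return new Foo(Type.INT, i); }

    Type getType() { return type; }

    String getString() {
        if (type != Type.STRING) {
            throw new IllegalStateException("branch is " + type);
        }
        return (String) datum;
    }

    int getInt() {
        if (type != Type.INT) {
            throw new IllegalStateException("branch is " + type);
        }
        return (Integer) datum;
    }
}

final class FooDemo {
    // Processes a union value with a switch rather than instanceof checks.
    static String describe(Foo foo) {
        switch (foo.getType()) {
            case STRING: return "string:" + foo.getString();
            case INT: return "int:" + foo.getInt();
            default: throw new AssertionError(foo.getType());
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Foo.ofString("hello")));
        System.out.println(describe(Foo.ofInt(42)));
    }
}
```

The sketch uses immutable values with factory methods rather than the setters shown in the issue description; either form would give consumers of a multi-typed stream a single named type to switch on.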
> make unions a named type
> ------------------------
>
> Key: AVRO-248
> URL: https://issues.apache.org/jira/browse/AVRO-248
> Project: Apache Avro
> Issue Type: New Feature
> Components: spec
> Reporter: Doug Cutting
> Priority: Major
>
> Unions are currently anonymous. However it might be convenient if they were
> named. In particular:
> - when code is generated for a union, a class could be generated that
> includes an enum indicating which branch of the union is taken, e.g., a union
> of string and int named Foo might cause a Java class like {code}
> public class Foo {
> public static enum Type {STRING, INT};
> private Type type;
> private Object datum;
> public Type getType() { return type; }
> public String getString() { if (type==Type.STRING) return (String)datum; else
> throw ... }
> public void setString(String s) { type = Type.STRING; datum = s; }
> ....
> }
> {code} Then Java applications can easily use a switch statement to process
> union values rather than using instanceof.
> - when using reflection, an abstract class with a set of concrete
> implementations can be represented as a union (AVRO-241). However, if one
> wishes to create an array one must know the name of the base class, which is
> not represented in the Avro schema. One approach would be to add an
> annotation to the reflected array schema (AVRO-242) noting the base class.
> But if the union itself were named, that could name the base class. This
> would also make reflected protocol interfaces more concise, since the base
> class name could be used in parameters, return types, and fields.
> - Generalizing the above: Avro lacks class inheritance. Unions are a way to
> model inheritance, and this model is more useful if the union is named.
> This would be an incompatible change to schemas. If we go this way, we
> should probably rename 1.3 to 2.0. Note that AVRO-160 proposes an
> incompatible change to data file formats, which may also force a major
> release.