[
https://issues.apache.org/jira/browse/AVRO-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239667#comment-17239667
]
Ryan Skraba commented on AVRO-2380:
-----------------------------------
Hello! In my opinion, [~wmaiouiru]'s interpretation is entirely correct, and
in line with the spec!
A logical type isn't a new type, it's a different way of looking at (or
interpreting) an existing type, and any language or SDK could potentially
ignore one or all logical types and still interoperate. A good example is the
"duration" logical type -- if I remember correctly, the annotation is entirely
ignored in the Java world.
This is good news because it lowers the bar to creating a new SDK or adding a
new logical type. To guarantee interoperability with almost every existing
version and language, you only need to support the known primitives.
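To make that concrete, here is a rough sketch in plain Java (no Avro dependency) of why the union in this issue is rejected. It assumes the rule from the spec that a union may contain only one branch of each type, except for named types, which are told apart by name, and that a logical type annotation does not change the underlying type. The class UnionModel and its Branch helper are made up for illustration; they are not the Avro API and not how the Java SDK is implemented.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model, not the Avro API.
final class UnionModel {

    static final class Branch {
        final String type;        // underlying Avro type, e.g. "long" or "record"
        final String fullName;    // set only for named types (record, enum, fixed)
        final String logicalType; // pure annotation, may be null

        Branch(String type, String fullName, String logicalType) {
            this.type = type;
            this.fullName = fullName;
            this.logicalType = logicalType;
        }

        // Key used for the uniqueness check. The logical type is deliberately
        // ignored: it annotates the branch but does not make it a new type.
        String unionKey() {
            return fullName != null ? fullName : type;
        }
    }

    // Collects each key that repeats an earlier branch of the union.
    static List<String> duplicates(List<Branch> union) {
        Set<String> seen = new HashSet<>();
        List<String> dups = new ArrayList<>();
        for (Branch b : union) {
            if (!seen.add(b.unionKey())) {
                dups.add(b.unionKey());
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        List<Branch> rejected = List.of(
            new Branch("long", null, null),
            new Branch("long", null, "timestamp-micros"));
        List<Branch> accepted = List.of(
            new Branch("long", null, null),
            new Branch("record", "TimestampMicros", null));

        System.out.println("bare long + annotated long: " + duplicates(rejected));
        System.out.println("bare long + wrapper record: " + duplicates(accepted));
    }
}
```

In this model the annotated long collides with the bare long, which matches the "Duplicate in union:long" message in the report, while a record with its own name does not collide with anything.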
I guess the workaround would be to force the union to "disambiguate" between
the types by wrapping the logical type in a named record, something like this:
{code}
{
  "type" : "record",
  "name" : "Avro2380",
  "fields" : [ {
    "name" : "data",
    "type" : {
      "type" : "map",
      "values" : [ "boolean", "int", "long", "float", {
        "type" : "record",
        "name" : "TimestampMicros",
        "fields" : [ {
          "name" : "ts",
          "type" : {
            "type" : "long",
            "logicalType" : "timestamp-micros"
          }
        } ]
      } ]
    }
  } ]
}
{code}
You can define the TimestampMicros record separately and reuse it if you want.
As long as the Schema.Parser already knows about that record, you can create
unions like {{["boolean", "int", "long", "float", "TimestampMicros"]}}.
The obvious tradeoff is that you now need a record to hold that one datum. On
the positive side, the wrapper adds zero bytes to the binary representation,
since a record is encoded as just its fields, one after the other.
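Here is a small sketch of the size claim, again in plain Java with no Avro dependency. It assumes the binary encoding described in the spec: a long is written as a zig-zag varint, a record is just its fields in order, and a union value is the zero-based branch index (written as a long) followed by the value. The class and method names are hypothetical, and the timestamp is an arbitrary value chosen for the demo.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of the encoding rules, not the Avro encoder.
final class RecordOverheadSketch {

    // Writes n as a zig-zag encoded, variable-length integer.
    static void writeLong(long n, ByteArrayOutputStream out) {
        long z = (n << 1) ^ (n >> 63); // zig-zag: small magnitudes stay small
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);
    }

    static byte[] bareLong(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(value, out);
        return out.toByteArray();
    }

    // A one-field record: no marker, name, or length is written,
    // only the encoding of the single "ts" field.
    static byte[] singleFieldRecord(long ts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(ts, out);
        return out.toByteArray();
    }

    // A union value: branch index first, then the encoded branch value.
    static byte[] unionValue(int branchIndex, byte[] encodedValue) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(branchIndex, out);
        out.write(encodedValue, 0, encodedValue.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        long ts = 1_606_478_400_000_000L; // arbitrary microsecond value
        // Branch positions in ["boolean", "int", "long", "float", TimestampMicros]
        byte[] asLong = unionValue(2, bareLong(ts));
        byte[] asRecord = unionValue(4, singleFieldRecord(ts));
        System.out.println("long branch:   " + asLong.length + " bytes");
        System.out.println("record branch: " + asRecord.length + " bytes");
    }
}
```

Both branches come out the same length here. Only the leading branch index differs, and for a union this small it fits in one byte either way.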
> Logical types are not supported when the actual datatype is also present in
> Union
> ---------------------------------------------------------------------------------
>
> Key: AVRO-2380
> URL: https://issues.apache.org/jira/browse/AVRO-2380
> Project: Apache Avro
> Issue Type: Bug
> Affects Versions: 1.8.2
> Reporter: Harshvardhan Agrawal
> Priority: Major
>
> {code:java}
> {
>   "name" : "data",
>   "type" : {
>     "type" : "map",
>     "values" : {
>       "type" : "array",
>       "items" : {
>         "type" : "map",
>         "values" : [ "string", "long", "null", "double", "float", "int", "boolean" ]
>       }
>     }
>   }
> }
> {code}
> The above schema represents a Map<String, List<Map<String, Object>>> where
> the values of the inner map can be any of the basic datatypes supported by
> Avro.
> If we want the inner map to also accept a logical type such as a timestamp,
> for example:
> {code:java}
> {
>   "name" : "data",
>   "type" : {
>     "type" : "map",
>     "values" : {
>       "type" : "array",
>       "items" : {
>         "type" : "map",
>         "values" : [ "string", "long", "null", "double", "float", "int", "boolean",
>           { "type" : "long", "logicalType" : "timestamp-micros" } ]
>       }
>     }
>   }
> }
> {code}
> The schema parser fails with the following error:
>
> {code:java}
> Exception in thread "main" org.apache.avro.AvroRuntimeException: Duplicate in union:long
> at org.apache.avro.Schema$UnionSchema.<init>(Schema.java:854)
> at org.apache.avro.Schema.parse(Schema.java:1341)
> at org.apache.avro.Schema.parse(Schema.java:1311)
> at org.apache.avro.Schema.parse(Schema.java:1306)
> at org.apache.avro.Schema.parse(Schema.java:1311)
> at org.apache.avro.Schema.parse(Schema.java:1269)
> at org.apache.avro.Schema$Parser.parse(Schema.java:1032)
> at org.apache.avro.Schema$Parser.parse(Schema.java:1020)
> {code}
>
>