[ 
https://issues.apache.org/jira/browse/AVRO-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17442534#comment-17442534
 ] 

Lu Litao edited comment on AVRO-3248 at 11/12/21, 2:00 AM:
-----------------------------------------------------------

Sorry, I missed your message; somehow the notification email was never sent 
:P

I checked, and your solution seems to give the wrong result if we have a union like this:

["A", "B"]

where A and B are defined like this:

```
        // A and B are the same except the name.
        let schema_str_1 = r#"{
            "name": "A",
            "type": "record",
            "fields": [
                {"name": "field_one", "type": "float"}
            ]
        }"#;

        let schema_str_2 = r#"{
            "name": "B",
            "type": "record",
            "fields": [
                {"name": "field_one", "type": "float"}
            ]
        }"#;
```

In this case, the schema found for a value will always be "A"'s schema, because a value of record B is structurally identical to one of record A and matches A first.
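
A minimal sketch of that ambiguity, assuming the lookup picks the first union variant a value validates against. The `Schema` and `Value` enums and the `validates`, `find_variant` and `record` helpers are simplified, hypothetical stand-ins, not the crate's actual types or functions:

```rust
// Simplified stand-ins for illustration only; not the real Avro crate types.
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    Float,
    Record { name: String, fields: Vec<(String, Schema)> },
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Float(f32),
    // A record value carries field names and values, but no record name.
    Record(Vec<(String, Value)>),
}

// Structural check: only field names and field types are compared.
fn validates(value: &Value, schema: &Schema) -> bool {
    match (value, schema) {
        (Value::Float(_), Schema::Float) => true,
        (Value::Record(vals), Schema::Record { fields, .. }) => {
            vals.len() == fields.len()
                && vals
                    .iter()
                    .zip(fields)
                    .all(|((vname, v), (fname, s))| vname == fname && validates(v, s))
        }
        _ => false,
    }
}

// First-match lookup: index of the first variant the value validates against.
fn find_variant(variants: &[Schema], value: &Value) -> Option<usize> {
    variants.iter().position(|s| validates(value, s))
}

// Builds a record schema with a single float field, like A and B above.
fn record(name: &str) -> Schema {
    Schema::Record {
        name: name.to_string(),
        fields: vec![("field_one".to_string(), Schema::Float)],
    }
}
```

With a union of `record("A")` and `record("B")`, a value intended for B still resolves to index 0, since nothing in the value distinguishes the two records.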

This also applies when A and B are "fixed" types, which seems plausible in real life 
(for example `A = {"name": "fieldA", "type": "fixed", "size": 10}, B = {"name": 
"fieldB", "type": "fixed", "size": 10}`).

Here is my PR for this scenario: https://github.com/apache/avro/pull/1396

To solve that, we can change `Value::Union` to hold `(index in the type list, value
it holds)`, similar to `Value::Enum`.
This allows us to get the union's inner schema for named types
directly from the index, without validating the value against each schema.
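
A rough sketch of the index-carrying union, under the same caveat: the enums and the `inner_schema` helper below are hypothetical simplifications for illustration, not the crate's definitions or the exact change in the PR:

```rust
// Simplified stand-ins; hypothetical shapes, not the crate's actual definitions.
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    Record { name: String },
    Union(Vec<Schema>),
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Record(Vec<(String, f32)>),
    // Proposed shape: (index in the type list, value it holds).
    Union(usize, Box<Value>),
}

// Looks up the variant schema by the stored index; no validation pass is needed.
// Returns None for a non-union schema/value pair or an out-of-range index.
fn inner_schema<'a>(schema: &'a Schema, value: &Value) -> Option<&'a Schema> {
    match (schema, value) {
        (Schema::Union(variants), Value::Union(index, _)) => variants.get(*index),
        _ => None,
    }
}
```

Because the writer records which variant it meant, a value tagged with index 1 resolves to B even though A has the same structure.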





> Rust: Support named types in UnionSchema
> ----------------------------------------
>
>                 Key: AVRO-3248
>                 URL: https://issues.apache.org/jira/browse/AVRO-3248
>             Project: Apache Avro
>          Issue Type: Improvement
>            Reporter: Lu Litao
>            Assignee: Martin Tzvetanov Grigorov
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently there's no support for named types in Avro's Union type in Rust,
> as stated in this comment on the UnionSchema struct:
> {quote}// Used to ensure uniqueness of schema inputs, and provide constant time finding of the
>  // schema index given a value.
>  // **NOTE** that this approach does not work for named types, and will have to be modified
>  // to support that. A simple solution is to also keep a mapping of the names used.
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
