[ 
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611026#comment-17611026
 ] 

Rik Heijdens edited comment on AVRO-3631 at 9/29/22 1:02 PM:
-------------------------------------------------------------

Okay, so I'm starting to understand the issue a bit more, and I added a few 
more test-cases to the branch that I [linked 
earlier|https://github.com/apache/avro/compare/master...privacy-com:avro:avro-3631/fix-fixed-serialization?expand=1].

Unlike what I initially thought, the compatibility problems with `Value::Fixed` 
do not appear to be isolated to serialization. They also affect deserialization 
of a `Value::Record` wrapping a `Value::Fixed` into a Rust struct. I added a 
test-case in 
[12ef14b|https://github.com/apache/avro/commit/12ef14b6a5cc102bcc0317251cd37471148d4926]
 which illustrates this.

I did note, however, that this is consistent with the serialization 
implementation: a Rust `[u8; 6]` is serialized into a 
`Value::Array<Value::Int>`, as illustrated by 
[a31fcfc|https://github.com/apache/avro/commit/a31fcfc96493e3180490dec5622cab485bf7cd79].

However, I am unsure how we should move forward with this: at 
serialization time the Schema information is not available to the Serializer, 
so it cannot know whether we expect it to produce a 
`Value::Array<Value::Int>` or a `Value::Fixed`. Arguably, both could be okay 
depending on which Schema is later used to convert the `Value` into bytes.
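
To make the ambiguity concrete, here is a rough sketch of a schema-directed conversion step. It assumes simplified stand-ins for `Value` and `Schema` (these are not the real `apache_avro` types), and `coerce` is a hypothetical helper, not an existing function in the crate. The same array of ints is left alone under an array schema and turned into a fixed under a fixed schema:

```rust
// Simplified stand-ins for apache_avro's `Value` and `Schema`.
// These are assumptions for illustration, not the crate's real definitions.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    Array(Vec<Value>),
    Fixed(usize, Vec<u8>),
}

#[derive(Debug, Clone, PartialEq)]
enum Schema {
    Int,
    Array(Box<Schema>),
    Fixed { size: usize },
}

/// Hypothetical helper: reinterpret a schema-less `Value` once the target
/// schema is known. Only the Array-of-Int to Fixed case is handled; every
/// other combination is returned unchanged.
fn coerce(value: Value, schema: &Schema) -> Result<Value, String> {
    match (value, schema) {
        (Value::Array(items), Schema::Fixed { size }) => {
            if items.len() != *size {
                return Err(format!("expected {} bytes, got {}", size, items.len()));
            }
            let mut bytes = Vec::with_capacity(*size);
            for item in items {
                match item {
                    // Each element must fit into a single byte.
                    Value::Int(i) if (0..=255).contains(&i) => bytes.push(i as u8),
                    other => return Err(format!("not a valid byte: {:?}", other)),
                }
            }
            Ok(Value::Fixed(*size, bytes))
        }
        // Anything else (including Array under an Array schema) is kept as-is.
        (other, _) => Ok(other),
    }
}
```

Whether a step like this belongs in validation, in encoding, or in a separate resolution pass is an open question; the sketch only shows that the decision needs the Schema.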

I'll ponder on this for a bit, but would appreciate suggestions if you have any 
on how we can move forward with this [~mgrigorov]


> Fix serialization of structs containing Fixed fields
> ----------------------------------------------------
>
>                 Key: AVRO-3631
>                 URL: https://issues.apache.org/jira/browse/AVRO-3631
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: rust
>            Reporter: Rik Heijdens
>            Priority: Major
>
> Consider the following minimal Avro Schema:
> {noformat}
> {
>     "type": "record",
>     "name": "TestStructFixedField",
>     "fields": [
>         {
>             "name": "field",
>             "type": {
>                 "name": "field",
>                 "type": "fixed",
>                 "size": 6
>             }
>         }
>     ]
> }
> {noformat}
> In Rust, I might represent this schema with the following struct:
> {noformat}
> #[derive(Debug, Serialize, Deserialize)]
> struct TestStructFixedField {
>     field: [u8; 6]
> }
> {noformat}
> I would then expect to be able to use `apache_avro::to_avro_datum()` to 
> convert an instance of `TestStructFixedField` into a `Vec<u8>` using an 
> instance of `Schema` initialized from the schema listed above.
> However, this fails because the `Value` produced by `apache_avro::to_value()` 
> represents `field` as a `Value::Array<Value::Int>` rather than a 
> `Value::Fixed<6, Vec<u8>>`, and the array representation does not pass 
> schema validation.
> I believe that there are two options to fix this:
> 1. Allow `Value::Array<Vec<Value::Int>>` to pass validation if the array has 
> the expected length, and none of the contents of the array are out-of-range 
> for `u8`. If we go down this route, the implementation of `to_avro_datum()` 
> will have to take care of converting `Value::Int` to `u8` when converting into 
> bytes.
> 2. Update `apache_avro::to_value()` such that fixed length arrays are 
> converted into `Value::Fixed<N, Vec<u8>>` rather than `Value::Array`.
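
A minimal sketch of the relaxed validation described in option 1, assuming a simplified stand-in for `Value` (not the real `apache_avro` type). The function name `validates_as_fixed` is hypothetical and only illustrates the length and range checks:

```rust
// Simplified stand-in for apache_avro's `Value`; an assumption for
// illustration, not the crate's real definition.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    Array(Vec<Value>),
    Fixed(usize, Vec<u8>),
}

/// Hypothetical check: does `value` satisfy a fixed schema of `size` bytes?
/// Accepts a real `Fixed` of the right size, or an array of exactly `size`
/// ints that each fit into a `u8`.
fn validates_as_fixed(value: &Value, size: usize) -> bool {
    match value {
        Value::Fixed(n, bytes) => *n == size && bytes.len() == size,
        Value::Array(items) => {
            items.len() == size
                && items
                    .iter()
                    .all(|item| matches!(item, Value::Int(i) if (0..=255).contains(i)))
        }
        _ => false,
    }
}
```

With a check like this in place, the encoder would still need to narrow each `Value::Int` to a byte when writing the datum, as noted in option 1.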



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
