[
https://issues.apache.org/jira/browse/AVRO-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611026#comment-17611026
]
Rik Heijdens edited comment on AVRO-3631 at 9/29/22 1:02 PM:
-------------------------------------------------------------
Okay, so I'm starting to understand the issue a bit better, and I added a few
more test cases to the branch that I [linked
earlier|https://github.com/apache/avro/compare/master...privacy-com:avro:avro-3631/fix-fixed-serialization?expand=1].
Contrary to what I initially thought, the compatibility problems with
`Value::Fixed` are not isolated to serialization. They also affect
deserialization of a `Value::Record` wrapping a `Value::Fixed` into a Rust
struct. I added a test case in
[12ef14b|https://github.com/apache/avro/commit/12ef14b6a5cc102bcc0317251cd37471148d4926]
which illustrates this.
I did note, however, that this is consistent with the serialization
implementation: a Rust `[u8; 6]` is serialized into a
`Value::Array<Value::Int>`, as illustrated by
[a31fcfc|https://github.com/apache/avro/commit/a31fcfc96493e3180490dec5622cab485bf7cd79].
I am unsure how we should move forward with this, though: at serialization
time the Schema is not available to the Serializer, so it cannot know whether
it is expected to produce a `Value::Array<Value::Int>` or a `Value::Fixed`.
Arguably, either could be correct, depending on which Schema is later used to
convert the `Value` into bytes.
I'll ponder this for a bit, but would appreciate any suggestions on how we can
move forward with this, [~mgrigorov].
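To make the ambiguity concrete, here is a minimal sketch of what a schema-directed fix-up could look like once the Schema is known. It is only an illustration: the `Value` and `Schema` enums below are simplified stand-ins rather than the crate's real types, and `resolve_with_schema` is a hypothetical helper, not an existing `apache_avro` function.

```rust
// Simplified stand-ins for the crate's `Value` and `Schema` types. Only the
// variants needed for this sketch are modelled; this is not the crate's API.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i32),
    Array(Vec<Value>),
    Fixed(usize, Vec<u8>),
}

#[derive(Debug)]
enum Schema {
    Int,
    Array(Box<Schema>),
    Fixed { size: usize },
}

// Hypothetical helper: reinterpret a schema-less `Value` according to
// `schema`. Returns `None` when the value cannot satisfy the schema.
fn resolve_with_schema(value: &Value, schema: &Schema) -> Option<Value> {
    match (value, schema) {
        (Value::Int(_), Schema::Int) => Some(value.clone()),
        // An array schema keeps the array, resolving each element in turn.
        (Value::Array(items), Schema::Array(inner)) => items
            .iter()
            .map(|item| resolve_with_schema(item, inner))
            .collect::<Option<Vec<_>>>()
            .map(Value::Array),
        // A fixed value is accepted as-is when its size matches the schema.
        (Value::Fixed(n, bytes), Schema::Fixed { size })
            if n == size && bytes.len() == *size =>
        {
            Some(value.clone())
        }
        // An array of ints becomes a fixed value when the length matches
        // and every element fits into a u8.
        (Value::Array(items), Schema::Fixed { size }) if items.len() == *size => items
            .iter()
            .map(|item| match item {
                Value::Int(i) if (0..=255).contains(i) => Some(*i as u8),
                _ => None,
            })
            .collect::<Option<Vec<u8>>>()
            .map(|bytes| Value::Fixed(*size, bytes)),
        _ => None,
    }
}
```

With this shape, the same array of six ints resolves to a `Value::Fixed` under a fixed schema of size 6 and stays a `Value::Array` under an array-of-int schema, which is why the Serializer alone cannot pick one.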
> Fix serialization of structs containing Fixed fields
> ----------------------------------------------------
>
> Key: AVRO-3631
> URL: https://issues.apache.org/jira/browse/AVRO-3631
> Project: Apache Avro
> Issue Type: Bug
> Components: rust
> Reporter: Rik Heijdens
> Priority: Major
>
> Consider the following minimal Avro Schema:
> {noformat}
> {
> "type": "record",
> "name": "TestStructFixedField",
> "fields": [
> {
> "name": "field",
> "type": {
> "name": "field",
> "type": "fixed",
> "size": 6
> }
> }
> ]
> }
> {noformat}
> In Rust, I might represent this schema with the following struct:
> {noformat}
> #[derive(Debug, Serialize, Deserialize)]
> struct TestStructFixedField {
> field: [u8; 6]
> }
> {noformat}
> I would then expect to be able to use `apache_avro::to_avro_datum()` to
> convert an instance of `TestStructFixedField` into a `Vec<u8>` using an
> instance of `Schema` initialized from the schema listed above.
> However, this fails because the `Value` produced by `apache_avro::to_value()`
> represents `field` as a `Value::Array<Value::Int>` rather than a
> `Value::Fixed<6, Vec<u8>>`, which does not pass schema validation.
> I believe that there are two options to fix this:
> 1. Allow `Value::Array<Vec<Value::Int>>` to pass validation if the array has
> the expected length and none of its elements are out of range for u8. If we
> go down this route, the implementation of `to_avro_datum()` will have to
> take care of converting each `Value::Int` to u8 when converting into
> bytes.
> 2. Update `apache_avro::to_value()` such that fixed-length arrays are
> converted into `Value::Fixed<N, Vec<u8>>` rather than `Value::Array`.