PookieBuns opened a new issue, #529: URL: https://github.com/apache/avro-rs/issues/529
I noticed another inconsistent behavior btw while working through schema aware serialization, related to #512 Is it intended to supporting serializing non-union like structs using a union schema (if they are a variant of the union) by doing lookup in the union for the correct branch? ``` // Licensed to the Apache Software Foundation (ASF) under one // or more contributor license agreements. See the NOTICE file // distributed with this work for additional information // regarding copyright ownership. The ASF licenses this file // to you under the Apache License, Version 2.0 (the // "License"); you may not use this file except in compliance // with the License. You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, // software distributed under the License is distributed on an // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY // KIND, either express or implied. See the License for the // specific language governing permissions and limitations // under the License. use apache_avro::Schema; use apache_avro::writer::datum::GenericDatumWriter; use apache_avro_test_helper::TestResult; use pretty_assertions::assert_eq; use serde::Serialize; fn serialize<T: Serialize>(value: &T, schema: &Schema) -> Result<Vec<u8>, apache_avro::Error> { GenericDatumWriter::builder(schema) .build()? .write_ser_to_vec(value) } fn union_schema() -> Schema { Schema::parse_str( r#" [ "int", { "type": "record", "name": "TestRecord", "fields": [ {"name": "a", "type": "int"} ] }, { "type": "enum", "name": "TestEnum", "symbols": ["A", "B"] } ] "#, ) .unwrap() } #[derive(Debug, Serialize)] struct TestRecord { a: i32, } #[allow(dead_code)] #[derive(Debug, Serialize)] enum TestEnum { A, B, } #[test] fn primitive_can_serialize_with_union_schema() -> TestResult { let serialized = serialize(&42_i32, &union_schema())?; assert_eq!(serialized, vec![0, 84]); Ok(()) } #[test] fn struct_can_serialize_with_union_schema() -> TestResult { let serialized = serialize(&TestRecord { a: 27 }, &union_schema())?; assert_eq!(serialized, vec![2, 54]); Ok(()) } #[test] #[should_panic(expected = "Expected Schema::Null | Schema::Record(name: A, fields: []) at index 0 in the union")] fn enum_cannot_serialize_with_union_schema() { let serialized= serialize(&TestEnum::A, &union_schema()).unwrap(); assert_eq!(serialized, vec![4, 0]); } ``` In the example, `i32` and `TestRecord` can be serialized using a `UnionSchema` because its one of the variants of the union https://github.com/apache/avro-rs/blob/016d0f6101ec1326e69345e679d1eadd75735b69/avro/src/serde/ser_schema/mod.rs#L627 If this is intended, then that means theres a bug for enum serialization https://github.com/apache/avro-rs/blob/016d0f6101ec1326e69345e679d1eadd75735b69/avro/src/serde/ser_schema/mod.rs#L429 In `serialize_newtype_variant`. It assumes the (Rust Enum) being serialized is a union holding different branches, which causes it to assume that a unit variant should either be a null or record. To fix this, there would need to be some edge case handling, specifically by using the name to lookup and check if we are serializing a (rust enum thats representing an avro enum), or a (rust enum thats representing an avro union). (Its finicky because rust doesn't different these at the type level). My preference would just to disallow serializing Non union rust representations using a union schema that contains that representation, but that's also a breaking change because I believe previously it was supported, although also sometimes unreliable). If we want to support it, I can create a separate PR to try to support this, although I'm assuming it would create some more complex edge case handling (to get around the fact that rust doesn't differentiate enums and unions) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
