jklamer opened a new pull request #1631:
URL: https://github.com/apache/avro/pull/1631
### Description
This PR is meant to start what should, and will, be a long review process.
There is outstanding work to be done, but I believe all the major features
needed to ship are included.
Outstanding work:
- [ ] Confirm naming of companion trait
- [ ] Restructure the crate to be a feature on top of the normal crate as
started in #1579
- [ ] Extensive README.md and module documentation that looks appropriate
in the docs.crates.io location
- [ ] Unit test inner functions used within Macro
- [ ] Use schema equality assertions in the tests module to harden types and
help prevent accidental breaking changes
##### New Features
The two traits defined within schema.rs
```
pub trait AvroSchema {
fn get_schema() -> Schema;
}
pub trait AvroSchemaWithResolved {
fn get_schema_with_resolved(resolved_schemas: &mut Names) -> Schema;
}
```
and a proc macro invoked as either
```
#[derive(AvroSchema)]
struct Test {}
// or
#[derive(AvroSchema)]
#[namespace = "com.testing.namespace"]
struct Test {}
```
##### Reasoning/Desires
The best would be to have `fn get_schema() -> &'static Schema`, but I was
unable to figure out how to do that without global state. This solution is
easy to work with, and it is easy to avoid repeated calls to `get_schema()`,
each of which creates the same valid schema. `AvroSchemaWithResolved` is a
companion trait needed to help resolve cyclic schema dependencies and the reuse
of named types within the same struct. Deriving `AvroSchema` actually derives
an implementation of `AvroSchemaWithResolved`, which is converted to a
standalone `AvroSchema` implementation by a blanket implementation in which the
schema being returned is a root schema.
```
impl<T> AvroSchema for T
where
T: AvroSchemaWithResolved,
{
fn get_schema() -> Schema {
T::get_schema_with_resolved(&mut HashMap::default())
}
}
```
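To illustrate why the companion trait threads a `Names` map through the call, here is a minimal, self-contained sketch of a hand-written implementation for a recursive struct. It does not use the real crate: the `Schema` enum is a toy stand-in with only a few variants, `Names` is assumed to be a map from a name string to a schema, and the `Node` struct and the `i64`/`Option`/`Box` implementations are hypothetical. The code the macro actually generates may differ.

```rust
use std::collections::HashMap;

// Toy stand-in for the crate's `Schema` (assumption: only what this sketch needs).
#[derive(Clone, Debug, PartialEq)]
pub enum Schema {
    Null,
    Long,
    Union(Vec<Schema>),
    Record { name: String, fields: Vec<(String, Schema)> },
    // Reference to a named type that was already defined elsewhere.
    Ref { name: String },
}

// Assumed shape of `Names`: full name -> schema.
pub type Names = HashMap<String, Schema>;

pub trait AvroSchema {
    fn get_schema() -> Schema;
}

pub trait AvroSchemaWithResolved {
    fn get_schema_with_resolved(resolved_schemas: &mut Names) -> Schema;
}

// Blanket implementation from the PR: a root schema starts with no resolved names.
impl<T: AvroSchemaWithResolved> AvroSchema for T {
    fn get_schema() -> Schema {
        T::get_schema_with_resolved(&mut HashMap::default())
    }
}

impl AvroSchemaWithResolved for i64 {
    fn get_schema_with_resolved(_: &mut Names) -> Schema {
        Schema::Long
    }
}

impl<T: AvroSchemaWithResolved> AvroSchemaWithResolved for Option<T> {
    fn get_schema_with_resolved(names: &mut Names) -> Schema {
        Schema::Union(vec![Schema::Null, T::get_schema_with_resolved(names)])
    }
}

impl<T: AvroSchemaWithResolved> AvroSchemaWithResolved for Box<T> {
    fn get_schema_with_resolved(names: &mut Names) -> Schema {
        T::get_schema_with_resolved(names)
    }
}

// Hypothetical recursive type: a singly linked list node.
#[allow(dead_code)]
pub struct Node {
    value: i64,
    next: Option<Box<Node>>,
}

impl AvroSchemaWithResolved for Node {
    fn get_schema_with_resolved(names: &mut Names) -> Schema {
        let name = "Node".to_string();
        // Already being defined further up the call chain: emit a reference.
        if names.contains_key(&name) {
            return Schema::Ref { name };
        }
        // Register the name before visiting fields so the recursive field sees it.
        names.insert(name.clone(), Schema::Ref { name: name.clone() });
        let fields = vec![
            ("value".to_string(), i64::get_schema_with_resolved(names)),
            ("next".to_string(), <Option<Box<Node>>>::get_schema_with_resolved(names)),
        ];
        Schema::Record { name, fields }
    }
}
```

In this sketch, `Node::get_schema()` returns a record whose `next` field is a union of null and a reference to `Node`, rather than recursing forever.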
##### Desired user workflow
Anything that can be serialized "the serde way" should be able to be
serialized/deserialized without further configuration. Anything that can
`#[derive(Serialize, Deserialize)]` should also be able to use
`#[derive(Serialize, Deserialize, AvroSchema)]`.
##### Caveats
This means that we are not handling special cases that would make sense for
Avro, because they will not work when integrated with the serde code. The
biggest example is that `Vec<u8>` is not derived as `Schema::Bytes`, because it
is serialized as `Schema::Array`. This might be something we can tightly couple
with serde attributes later.
Types that cannot be both serialized and deserialized accurately are
currently not covered, namely `char`, `u32`, and `u64`. The one exception to
this rule is references with non-static lifetimes (they can be serialized but
not deserialized).
##### Current Flow
```
use apache_avro::Schema;
let raw_schema = r#"
{
"type": "record",
"name": "test",
"fields": [
{"name": "a", "type": "long", "default": 42},
{"name": "b", "type": "string"}
]
}
"#;
use apache_avro::Writer;
use serde::Serialize;
#[derive(Debug, Serialize)]
struct Test {
a: i64,
b: String,
}
// if the schema is not valid, this function will return an error
let schema = Schema::parse_str(raw_schema).unwrap();
let mut writer = Writer::new(&schema, Vec::new());
let test = Test {
a: 27,
b: "foo".to_owned(),
};
writer.append_ser(test).unwrap();
let encoded = writer.into_inner();
```
##### New Flow
```
use apache_avro::Writer;
#[derive(Debug, Serialize, AvroSchema)]
struct Test {
a: i64,
b: String,
}
// derived schema, always valid or code fails to compile with a descriptive
// message
let schema = Test::get_schema();
let mut writer = Writer::new(&schema, Vec::new());
let test = Test {
a: 27,
b: "foo".to_owned(),
};
writer.append_ser(test).unwrap();
let encoded = writer.into_inner();
```
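Because `get_schema()` builds a fresh schema on every call, a caller who needs it in many places may prefer to build it once and share a reference. The following is a minimal sketch of one way to do that on the caller side with `std::sync::OnceLock`. It uses a toy `Schema` enum and a hand-written `AvroSchema` implementation as stand-ins for the real crate and the derive, and the `test_schema` helper is hypothetical. Note that this does rely on a static, which is the kind of global state the trait itself avoids.

```rust
use std::sync::OnceLock;

// Toy stand-in for the crate's `Schema` (assumption: only what this sketch needs).
#[derive(Debug, PartialEq)]
pub enum Schema {
    Long,
    String,
    Record { name: String, fields: Vec<(String, Schema)> },
}

pub trait AvroSchema {
    fn get_schema() -> Schema;
}

pub struct Test;

// Hand-written here; in the PR this would come from `#[derive(AvroSchema)]`.
impl AvroSchema for Test {
    fn get_schema() -> Schema {
        Schema::Record {
            name: "Test".to_string(),
            fields: vec![
                ("a".to_string(), Schema::Long),
                ("b".to_string(), Schema::String),
            ],
        }
    }
}

/// Hypothetical helper: builds the schema on the first call and returns the
/// same `'static` reference on every later call.
pub fn test_schema() -> &'static Schema {
    static CACHE: OnceLock<Schema> = OnceLock::new();
    CACHE.get_or_init(Test::get_schema)
}
```

With the real crate, the returned reference could presumably be passed wherever a `&Schema` is expected, such as `Writer::new`.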
##### crate import
This functionality comes as an optional feature (modeled on serde). In
Cargo.toml:
```
apache-avro = { version = "X.Y.Z", features = ["derive"] }
```
### Jira
https://issues.apache.org/jira/browse/AVRO-3479
### Tests
My PR includes many tests intended to challenge the macro in common and
uncommon use cases.
### Documentation
- [ ] WIP see above