martin-g commented on code in PR #3095:
URL: https://github.com/apache/avro/pull/3095#discussion_r1718351789
##########
lang/rust/avro/src/codec.rs:
##########
@@ -49,7 +49,13 @@ pub enum Codec {
/// CRC32 checksum of the uncompressed data in the block.
Snappy,
#[cfg(feature = "zstandard")]
+ /// The `Zstandard` codec uses Facebook's [Zstandard](https://facebook.github.io/zstd/) with the
+ /// default compression level.
Zstandard,
+ #[cfg(feature = "zstandard")]
+ /// This codec is the same as `Zstandard` but allows specifying the compression level.
+ #[strum(disabled)]
+ ZstandardWithLevel(ZstandardLevel),
Review Comment:
I think I hit a problem!
Per the Avro spec, when the binary data (.avro) is written, the codec is stored
in the header's metadata as a string, e.g. "zstandard". But the spec says
nothing about storing the codec's settings!
So, if you do something like:
```
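use apache_avro::{types::Record, Reader, Schema, Writer};

// `codec` is the codec under test, e.g. the `ZstandardWithLevel` variant added in this PR.
let schema = Schema::parse_str(
    r#"
    {
        "type": "record",
        "name": "Test",
        "fields": [
            {"name": "f1", "type": "int"},
            {"name": "f2", "type": "string"}
        ]
    }"#,
)?;
// Write one record into an in-memory buffer using the chosen codec.
let mut writer = Writer::with_codec(&schema, Vec::new(), codec);
let mut record = Record::new(writer.schema()).unwrap();
record.put("f1", 27_i32);
record.put("f2", "foo");
writer.append(record)?;
let input = writer.into_inner()?;
// The reader only sees the codec name stored in the file header.
let mut reader = Reader::new(&input[..])?;
...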
```
Then the Reader will know nothing about the codec settings the writer used!
The Reader will use the codec with its default settings!
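To illustrate the round trip, here is a minimal std-only sketch. The `Codec` enum below is a hypothetical stand-in, not the real `apache_avro` type, and the level is modeled as a plain `u8` rather than the PR's `ZstandardLevel`. It assumes both Zstandard variants are written under the same `avro.codec` name:

```rust
// Hypothetical stand-in for the codec enum in this PR; not the real apache_avro type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Codec {
    Null,
    Zstandard,
    ZstandardWithLevel(u8),
}

impl Codec {
    /// Name written under the `avro.codec` key of the file header.
    fn header_name(self) -> &'static str {
        match self {
            Codec::Null => "null",
            // Assumption: both variants share one name, so the level is dropped here.
            Codec::Zstandard | Codec::ZstandardWithLevel(_) => "zstandard",
        }
    }

    /// What a reader can reconstruct from the header name alone.
    fn from_header_name(name: &str) -> Option<Codec> {
        match name {
            "null" => Some(Codec::Null),
            "zstandard" => Some(Codec::Zstandard),
            _ => None,
        }
    }
}

/// Round-trips a codec through its header name, as writer -> file -> reader would.
fn round_trip(codec: Codec) -> Option<Codec> {
    Codec::from_header_name(codec.header_name())
}
```

Under these assumptions, `round_trip(Codec::ZstandardWithLevel(19))` comes back as the plain `Codec::Zstandard`, because the name is the only thing that survives the trip.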
We could store the settings in the header's `user_metadata` (see the `Block`
struct in reader.rs), but that will take some more time!
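A rough std-only sketch of that idea, not the actual reader.rs code: the header metadata is modeled as a map from string keys to byte values, and the key name `example.zstandard.level` and both helper functions are hypothetical. The spec reserves keys starting with `avro.`, so a custom key would need a different prefix:

```rust
use std::collections::HashMap;

// Hypothetical key name; the Avro spec reserves the "avro." prefix.
const LEVEL_KEY: &str = "example.zstandard.level";

/// Header metadata modeled as string keys with byte values.
type Metadata = HashMap<String, Vec<u8>>;

/// Builds the header a writer would emit, optionally recording the level.
fn write_header(level: Option<u8>) -> Metadata {
    let mut meta = Metadata::new();
    meta.insert("avro.codec".to_string(), b"zstandard".to_vec());
    if let Some(level) = level {
        // Store the level as decimal text so it stays human-readable.
        meta.insert(LEVEL_KEY.to_string(), level.to_string().into_bytes());
    }
    meta
}

/// Recovers the level on the reader side; `None` means "use the default".
fn read_level(meta: &Metadata) -> Option<u8> {
    let raw = meta.get(LEVEL_KEY)?;
    std::str::from_utf8(raw).ok()?.parse().ok()
}
```

With this layout, files written without the extra key still read fine: `read_level` returns `None` and the reader falls back to the default settings.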