[
https://issues.apache.org/jira/browse/AVRO-3460?focusedWorklogId=747630&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-747630
]
ASF GitHub Bot logged work on AVRO-3460:
----------------------------------------
Author: ASF GitHub Bot
Created on: 25/Mar/22 06:57
Start Date: 25/Mar/22 06:57
Worklog Time Spent: 10m
Work Description: martin-g commented on a change in pull request #1620:
URL: https://github.com/apache/avro/pull/1620#discussion_r834999139
##########
File path: lang/rust/avro/src/types.rs
##########
@@ -333,8 +333,15 @@ impl Value {
/// See the [Avro
specification](https://avro.apache.org/docs/current/spec.html)
/// for the full set of rules of schema validation.
pub fn validate(&self, schema: &Schema) -> bool {
+ let rs = ResolvedSchema::try_from(schema).expect("Schema didn't
successfully parse");
+ self.validate_internal(schema, rs.get_names())
+ }
+
+ fn validate_internal(&self, schema: &Schema, names: &NamesRef) -> bool {
Review comment:
I think this would be definitely an improvement!
Few more ideas:
1) the Java SDK recently gained support for `avropath` in the error messages
([AVRO-3375](https://issues.apache.org/jira/browse/AVRO-3375)
2) collect as much errors as possible while validating, not just one at a
time [AVRO-3429](https://issues.apache.org/jira/browse/AVRO-3429)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 747630)
Time Spent: 1h (was: 50m)
> [rust] Value::validate does not validate against Schema Refs
> ------------------------------------------------------------
>
> Key: AVRO-3460
> URL: https://issues.apache.org/jira/browse/AVRO-3460
> Project: Apache Avro
> Issue Type: Bug
> Components: rust
> Reporter: Jack Klamer
> Assignee: Martin Tzvetanov Grigorov
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.11.1, 1.12.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> The Value::validate() method is a preprocessing step before encoding the
> value into the buffer. The encode function assumes the value to be valid. The
> validate method recurses as necessary to validate against the schema
> provided. The function does not recurse to validate against schema refs found
> in the schema provided meaning that all values are validated to true against
> all schema Ref's which is incorrect.
> {code}
> pub fn validate(&self, schema: &Schema) -> bool {
> match (self, schema) {
> (_, &Schema::Ref { name: _ }) => true,
> ....{code}
> In practice this failure requires a user to forget to change their Schema
> when changing their struct and that struct change happens to be the second
> use of a struct within the enclosing struct. (Or fixed or enum).
> Because encode assumes the values passed to it are valid, it is possible for
> the encoding to complete without error in some situations.
>
>
> Two tests that should pass and currently fail, useful for anyone interested
> {code}
> #[test]
> fn test_validation_with_refs() {
> let schema = Schema::parse_str(
> r#"
> {
> "type":"record",
> "name":"TestStruct",
> "fields": [
> {
> "name":"a",
> "type":{
> "type":"record",
> "name": "Inner",
> "fields": [ {
> "name":"z",
> "type":"int"
> }]
> }
> },
> {
> "name":"b",
> "type":"Inner"
> }
> ]
> }"#,
> )
> .unwrap();
> let inner_value_right = Value::Record(vec![("z".into(),
> Value::Int(3))]);
> let inner_value_wrong1 = Value::Record(vec![("z".into(),
> Value::Null)]);
> let inner_value_wrong2 = Value::Record(vec![("a".into(),
> Value::String("testing".into()))]);
> let outer1 = Value::Record(vec![
> ("a".into(), inner_value_right.clone()),
> ("b".into(), inner_value_wrong1),
> ]);
> let outer2 = Value::Record(vec![
> ("a".into(), inner_value_right),
> ("b".into(), inner_value_wrong2),
> ]);
> assert!(!outer1.validate(&schema), "field b record is invalid against
> the schema"); // this should pass, but doesn't
> assert!(!outer2.validate(&schema), "field b record is invalid against
> the schema"); // this should pass, but doesn't
> }
> #[derive(Serialize, Clone)]
> struct TestInner {
> z: i32
> }
> #[derive(Serialize)]
> struct TestRefSchemaStruct1 {
> a: TestInner,
> b: String // could be literally anything
> }
> #[derive(Serialize)]
> struct TestRefSchemaStruct2 {
> a: TestInner,
> b: i32 // could be literally anything
> }
> #[derive(Serialize)]
> struct TestRefSchemaStruct3 {
> a: TestInner,
> b: Option<TestInner> // could be literally anything
> }
> #[test]
> fn test_validation_with_refs_real_struct() {
> let schema = Schema::parse_str(
> r#"
> {
> "type":"record",
> "name":"TestStruct",
> "fields": [
> {
> "name":"a",
> "type":{
> "type":"record",
> "name": "Inner",
> "fields": [ {
> "name":"z",
> "type":"int"
> }]
> }
> },
> {
> "name":"b",
> "type":"Inner"
> }
> ]
> }"#,
> )
> .unwrap();
> let test_inner = TestInner{z:3};
> let test_outer1 = TestRefSchemaStruct1{a:test_inner.clone(),
> b:"testing".into()};
> let test_outer2 = TestRefSchemaStruct2{a:test_inner.clone(), b:24};
> let test_outer3 = TestRefSchemaStruct3{a:test_inner.clone(), b:None};
>
> use crate::ser::Serializer;
> let mut ser = Serializer::default();
> let test_outer1: Value = test_outer1.serialize(&mut ser ).unwrap();
> let mut ser = Serializer::default();
> let test_outer2: Value = test_outer2.serialize(&mut ser ).unwrap();
> let mut ser = Serializer::default();
> let test_outer3: Value = test_outer3.serialize(&mut ser ).unwrap();
> assert!(!test_outer1.validate(&schema), "field b record is invalid
> against the schema"); // this should pass, but doesn't
> assert!(!test_outer2.validate(&schema), "field b record is invalid
> against the schema"); // this should pass, but doesn't
> assert!(!test_outer3.validate(&schema), "field b record is invalid
> against the schema"); // this should pass, but doesn't
> }
> {code}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)