I want to use Avro to validate data in JSON objects against a schema.
My expectation was that the schema validation process covers the
following scenarios with appropriate error messages:
1. Required field X is missing from the data. The error message should
be something like "field X not found".
2. Field X has the wrong type. The error message should be something
like "field X expected String, found Integer".
3. Field Y is in the data but is not defined in the schema. The error
message should be something like "Unexpected field found: Y".
With the code below, I found that only scenario 1 works as I expected.
Scenario 2 fails, but the error message does not name the offending
field, and scenario 3 does not fail at all.
Is there anything wrong with my approach?
Lukas
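// validation method
// (getDecoder is a helper not shown here; it is assumed to return a JSON
// Decoder for the given schema and input string)
void validate(ObjectNode node) throws IOException {
    Schema schema = SchemaBuilder
        .record("test")
        .fields()
        .requiredString("testField")
        .endRecord();
    String nodeAsString = node.toString();
    // A record schema is read as a GenericRecord, not a String
    DatumReader<GenericRecord> datumReader = new GenericDatumReader<>(schema);
    datumReader.read(null, getDecoder(schema, nodeAsString));
}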
// scenarios
JsonNodeFactory factory = JsonNodeFactory.instance;
// 1. Required field missing
ObjectNode missingField = factory.objectNode();
missingField.put("xyz", "foo");
validate(missingField); // Result: "Expected field name not found: testField"
// 2. Required field has wrong type
ObjectNode wrongType = factory.objectNode();
wrongType.put("testField", 1);
validate(wrongType); // Result: "Expected string. Got VALUE_NUMBER_INT"
// The message does not name the field that has the wrong type, which is
// less helpful if there are multiple fields.
// 3. Extraneous field
ObjectNode extraField = factory.objectNode();
extraField.put("testField", "foo");
extraField.put("xyz", "foo");
validate(extraField); // Result: no error, even though the JSON object
// contains a field that the schema does not define
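
To make the three expectations concrete, here is a minimal sketch in plain Java of the checks and messages I have in mind. It does not use Avro at all: the class name FieldCheckSketch, the check method, and the Map-based stand-in for the schema are hypothetical and only illustrate the desired behaviour for a flat object. It is not a description of how Avro validates data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Avro or Jackson.
final class FieldCheckSketch {

    // Compares a flat JSON-like object against the expected field types and
    // returns one message per problem found (an empty list means valid).
    static List<String> check(Map<String, Class<?>> expected, Map<String, Object> data) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Class<?>> entry : expected.entrySet()) {
            String name = entry.getKey();
            // Scenario 1: required field is missing
            if (!data.containsKey(name)) {
                errors.add("field " + name + " not found");
                continue;
            }
            // Scenario 2: field is present but has the wrong type
            Object value = data.get(name);
            if (!entry.getValue().isInstance(value)) {
                String found = value == null ? "null" : value.getClass().getSimpleName();
                errors.add("field " + name + " expected "
                        + entry.getValue().getSimpleName() + ", found " + found);
            }
        }
        // Scenario 3: field is present in the data but unknown to the schema
        for (String name : data.keySet()) {
            if (!expected.containsKey(name)) {
                errors.add("Unexpected field found: " + name);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> expected = new LinkedHashMap<>();
        expected.put("testField", String.class);

        Map<String, Object> data = new LinkedHashMap<>();
        data.put("testField", 1);   // wrong type
        data.put("xyz", "foo");     // not in the schema

        check(expected, data).forEach(System.out::println);
    }
}
```

Unlike the Avro-based validate method above, this sketch collects all problems instead of stopping at the first one, which is the kind of reporting I was hoping for.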