Mark Payne created NIFI-6986:
--------------------------------
Summary: ValidateRecord should optionally validate that nullable
fields are present
Key: NIFI-6986
URL: https://issues.apache.org/jira/browse/NIFI-6986
Project: Apache NiFi
Issue Type: Improvement
Components: Extensions
Reporter: Mark Payne
Currently, if a field is nullable according to the schema, ValidateRecord
considers the record to be valid, even if the field is missing completely. For
some use cases, this is desirable. For example, it is common to drop fields in
JSON when the field's value is null, because it can drastically reduce the size
of the JSON.
However, in other use cases, this is not desirable. For example, in a CSV file,
we may want to require that each Record contains the appropriate number of
fields. It may be acceptable, for instance, to have a line like "1234, John
Smith, , , ," but not to have a line like "1234, John Smith".
ValidateRecord should be updated with a new Property: "Allow Missing Null
Values". If the value is `true` (the default, to avoid changing behavior
between versions), the Processor should behave as it does now, where the
absence of the field is synonymous with a null value. In this case, a line like
"1234, John Smith" would be valid when the CSV is expecting 6 fields, as long
as the last 4 fields are nullable.
But if the value of this new property is `false`, the Processor should require
that every field be present in the data, even if its value is null. In
this case, a line like "1234, John Smith" would be invalid if the CSV were
expected to contain 6 fields.
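To make the two settings concrete, here is a minimal sketch of the intended behavior for the CSV example. It is not NiFi code: `MissingFieldCheck` and `isValid` are hypothetical names, the line is split naively on commas, and every field is assumed to be nullable. Lines with more fields than expected are outside the scope of this issue, so the sketch does not reject them.

```java
// Hypothetical illustration only; not part of ValidateRecord.
final class MissingFieldCheck {

    /**
     * Returns true if the CSV line is acceptable under the given setting.
     * Assumes every field in the schema is nullable.
     */
    static boolean isValid(String csvLine, int expectedFields, boolean allowMissingNullValues) {
        // A limit of -1 keeps trailing empty values, so "a, b, ," yields 4 values.
        String[] values = csvLine.split(",", -1);

        // All expected fields are present (possibly empty): valid in both modes.
        if (values.length >= expectedFields) {
            return true;
        }

        // Fields are missing: valid only if absence is treated as null.
        return allowMissingNullValues;
    }
}
```

With 6 expected fields, `isValid("1234, John Smith", 6, true)` accepts the line, while `isValid("1234, John Smith", 6, false)` rejects it. The line "1234, John Smith, , , ," is accepted either way.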
The `WriteJsonResult` class has a method: `private boolean
isFieldPresent(RecordField field, Record record)`. This method should really
exist on `Record` itself with a slightly different signature: `boolean
isFieldPresent(RecordField field)`. It should have a default implementation
provided, akin to the implementation in `WriteJsonResult`, and then
`WriteJsonResult` should simply use that method.
`StandardSchemaValidator` should then be updated to use this method to validate
that records have all required fields, as configured. `SchemaValidationContext`
should also be updated to indicate whether or not the presence of null
values should be validated.
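One possible shape for the context flag and the validator check is sketched below. `SketchValidationContext` and `SketchSchemaValidator` are hypothetical stand-ins, not the real `SchemaValidationContext` or `StandardSchemaValidator`. The record is modeled as a plain `Map`, and only nullable fields are considered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the validation context; only the new flag is modeled.
final class SketchValidationContext {
    private final List<String> nullableFieldNames;
    private final boolean allowMissingNullValues;

    SketchValidationContext(List<String> nullableFieldNames, boolean allowMissingNullValues) {
        this.nullableFieldNames = new ArrayList<>(nullableFieldNames);
        this.allowMissingNullValues = allowMissingNullValues;
    }

    List<String> getNullableFieldNames() {
        return nullableFieldNames;
    }

    boolean isAllowMissingNullValues() {
        return allowMissingNullValues;
    }
}

// Hypothetical stand-in for the validator's missing-field check.
final class SketchSchemaValidator {

    // Returns the names of nullable fields whose absence makes the record invalid.
    static List<String> findMissingFields(Map<String, Object> record, SketchValidationContext context) {
        List<String> missing = new ArrayList<>();

        // Current behavior: an absent nullable field is synonymous with null.
        if (context.isAllowMissingNullValues()) {
            return missing;
        }

        // Strict behavior: the field must be present, even if its value is null.
        for (String name : context.getNullableFieldNames()) {
            if (!record.containsKey(name)) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

An empty result means the record passes this particular check. In strict mode, each returned name would correspond to one validation error.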
--
This message was sent by Atlassian Jira
(v8.3.4#803005)