[ 
https://issues.apache.org/jira/browse/NIFI-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shawn Weeks reassigned NIFI-6986:
---------------------------------

    Assignee: Shawn Weeks

> ValidateRecord should optionally validate if nullable fields are present
> ------------------------------------------------------------------------
>
>                 Key: NIFI-6986
>                 URL: https://issues.apache.org/jira/browse/NIFI-6986
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Shawn Weeks
>            Priority: Major
>
> Currently, if a field is nullable according to the schema, ValidateRecord 
> considers the record to be valid, even if the field is missing completely. 
> For some use cases, this is desirable. For example, it is common to drop 
> fields in JSON when the field's value is null, because it can drastically 
> reduce the size of the JSON.
> However, in other use cases, this is not desirable. For example, in a CSV 
> file, we may want to require that each Record contain the appropriate number 
> of fields. It may be acceptable, for instance, to have a line like "1234, 
> John Smith, , , ," but not to have a line like "1234, John Smith".
> ValidateRecord should be updated with a new Property: "Allow Missing Null 
> Values". If the value is `true` (the default, to avoid changing behavior 
> between versions), the Processor should behave as it does now, where the 
> absence of the field is synonymous with a null value. In this case, a line 
> like "1234, John Smith" would be valid when the CSV is expecting 6 fields, as 
> long as the last 4 fields are nullable.
> But if the value of this new property is `false`, the Processor should 
> require that all fields be present in the data, even if the field has a null 
> value. In this case, a line like "1234, John Smith" would be invalid if the 
> CSV were expected to contain 6 fields.
> The `WriteJsonResult` class has a method in it: `private boolean 
> isFieldPresent(RecordField field, Record record)`. This method should really 
> exist on `Record` itself with a slightly different signature: `boolean 
> isFieldPresent(RecordField field)`. It should have a default implementation 
> provided, akin to the implementation in `WriteJsonResult`, and 
> `WriteJsonResult` should then simply call that method.
> `StandardSchemaValidator` should then be updated to use this to validate that 
> records have all required fields, as configured. `SchemaValidationContext` 
> should also be updated to indicate whether or not the presence of null 
> values should be validated.
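
For illustration only, here is a minimal Java sketch of the behavior described in the issue above. It does not use the real NiFi classes: `Field`, `SimpleRecord`, `fromCsvLine`, `validate`, and the column names are hypothetical stand-ins for `RecordField`, `Record`, the CSV reader, and `StandardSchemaValidator`, and the boolean parameter stands in for the proposed "Allow Missing Null Values" property. The point is only to show a default `isFieldPresent` method on the record type and how a validator might treat an absent nullable field differently from a present-but-null one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MissingFieldSketch {

    /** Hypothetical stand-in for RecordField: a name plus a nullable flag. */
    public static final class Field {
        final String name;
        final boolean nullable;

        public Field(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }
    }

    /** Hypothetical stand-in for Record, with isFieldPresent as a default method. */
    public interface SimpleRecord {
        /** Only the fields that actually appeared in the data, possibly mapped to null. */
        Map<String, Object> rawValues();

        /** A field is present if the data contained it, even when its value is null. */
        default boolean isFieldPresent(Field field) {
            return rawValues().containsKey(field.name);
        }
    }

    /** Naive CSV parsing (no quoting support): trailing columns that are absent stay absent. */
    public static SimpleRecord fromCsvLine(String line, List<Field> schema) {
        String[] cells = line.split(",", -1); // -1 keeps trailing empty cells
        Map<String, Object> values = new LinkedHashMap<>();
        for (int i = 0; i < cells.length && i < schema.size(); i++) {
            String cell = cells[i].trim();
            values.put(schema.get(i).name, cell.isEmpty() ? null : cell);
        }
        return () -> values;
    }

    /** Returns one message per violation; an empty list means the record is valid. */
    public static List<String> validate(SimpleRecord record, List<Field> schema,
                                        boolean allowMissingNullValues) {
        List<String> errors = new ArrayList<>();
        for (Field field : schema) {
            boolean present = record.isFieldPresent(field);
            if (!present) {
                // An absent field is tolerated only if it is nullable AND absence is allowed.
                if (!(field.nullable && allowMissingNullValues)) {
                    errors.add(field.name + " is missing");
                }
            } else if (record.rawValues().get(field.name) == null && !field.nullable) {
                errors.add(field.name + " must not be null");
            }
        }
        return errors;
    }

    /** Six-column schema matching the example lines in the issue; column names are invented. */
    public static List<Field> exampleSchema() {
        return Arrays.asList(
                new Field("id", false), new Field("name", false),
                new Field("email", true), new Field("phone", true),
                new Field("city", true), new Field("country", true));
    }

    public static void main(String[] args) {
        List<Field> schema = exampleSchema();
        SimpleRecord full = fromCsvLine("1234, John Smith, , , ,", schema);
        SimpleRecord truncated = fromCsvLine("1234, John Smith", schema);

        System.out.println(validate(full, schema, false));      // all six fields present
        System.out.println(validate(truncated, schema, true));  // absence treated as null
        System.out.println(validate(truncated, schema, false)); // four nullable fields missing
    }
}
```

With the flag set to `true`, both example lines validate, which corresponds to the current behavior. With it set to `false`, only the line that carries all six columns validates, and the truncated line is reported with one error per missing column.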



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
