mosermw commented on code in PR #8362:
URL: https://github.com/apache/nifi/pull/8362#discussion_r1887515269
##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -231,7 +243,11 @@ protected Collection<ValidationResult>
customValidate(ValidationContext context)
}
// If no Expression Language is present, try parsing the schema
try {
- this.parseSchema(schema);
+ if (schema != null) {
+ this.parseSchema(schema);
+ } else if (!headerProp.asBoolean()) {
+ throw(new Exception("Schema cannot be empty if header is
false."));
Review Comment:
To be clear we are talking about the Header property and not the existence
of a CSV header, suggest:
```suggestion
throw(new Exception("Schema cannot be empty if Header
property is false."));
```
##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestValidateCsv.java:
##########
Review Comment:
Please add a unit test where the CSV Source Attribute is defined but is
missing from the flowfile. Also verify that flowfile content is not modified
when validating an attribute.
##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -98,6 +97,7 @@ public class ValidateCsv extends AbstractProcessor {
"Strlen", "StrMinMax", "StrNotNullOrEmpty", "StrRegEx", "Unique",
"UniqueHashCode", "IsIncludedIn"
);
+
Review Comment:
This extra blank line doesn't appear to be necessary.
##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -204,6 +214,7 @@ public class ValidateCsv extends AbstractProcessor {
.description("FlowFiles that are not valid according to the
specified schema are routed to this relationship")
.build();
+
Review Comment:
This extra blank line doesn't appear to be necessary.
##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -449,155 +465,147 @@ public void onTrigger(final ProcessContext context,
final ProcessSession session
final CsvPreference csvPref = getPreference(context, flowFile);
final boolean header = context.getProperty(HEADER).asBoolean();
final ComponentLog logger = getLogger();
- final String schema =
context.getProperty(SCHEMA).evaluateAttributeExpressions(flowFile).getValue();
- final CellProcessor[] cellProcs = this.parseSchema(schema);
- final boolean isWholeFFValidation =
context.getProperty(VALIDATION_STRATEGY).getValue().equals(VALIDATE_WHOLE_FLOWFILE.getValue());
+ String schema =
context.getProperty(SCHEMA).evaluateAttributeExpressions(flowFile).getValue();
+ CellProcessor[] cellProcs = null;
+ if (schema != null) {
+ cellProcs = this.parseSchema(schema);
+ }
+ final String validationStrategy =
context.getProperty(VALIDATION_STRATEGY).getValue();
+ final boolean isWholeFFValidation =
!validationStrategy.equals(VALIDATE_LINES_INDIVIDUALLY.getValue());
final boolean includeAllViolations =
context.getProperty(INCLUDE_ALL_VIOLATIONS).asBoolean();
- final AtomicReference<Boolean> valid = new AtomicReference<>(true);
- final AtomicReference<Boolean> isFirstLineValid = new
AtomicReference<>(true);
- final AtomicReference<Boolean> isFirstLineInvalid = new
AtomicReference<>(true);
- final AtomicReference<Integer> okCount = new AtomicReference<>(0);
- final AtomicReference<Integer> totalCount = new AtomicReference<>(0);
- final AtomicReference<FlowFile> invalidFF = new
AtomicReference<>(null);
- final AtomicReference<FlowFile> validFF = new AtomicReference<>(null);
- final AtomicReference<String> validationError = new
AtomicReference<>(null);
+ boolean valid = true;
+ int okCount = 0;
+ int totalCount = 0;
+ FlowFile invalidFF = null;
+ FlowFile validFF = null;
+ String validationError = null;
+ final AtomicReference<Boolean> isFirstLineValid = new
AtomicReference<Boolean>(true);
+ final AtomicReference<Boolean> isFirstLineInvalid = new
AtomicReference<Boolean>(true);
Review Comment:
A recent commit used diamond notation for the constructor and I suspect your
rebase didn't pick this up. This is especially important since new PMD rules
were recently committed that checks for this.
```suggestion
final AtomicReference<Boolean> isFirstLineValid = new
AtomicReference<>(true);
final AtomicReference<Boolean> isFirstLineInvalid = new
AtomicReference<>(true);
```
##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestValidateCsv.java:
##########
Review Comment:
Please add a unit test to verify flowfile goes to "valid" when validating an
attribute and the attribute value is missing or empty and a schema is defined.
This explicitly defines expected behavior when there is no CSV data to
validate. Also verify that flowfile content is not modified when validating an
attribute.
##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
Review Comment:
I think the new feature to validate an attribute deserves a mention in the
CapabilityDescription:
"Validates the contents of FlowFiles or a FlowFile attribute value against a
user-specified CSV schema."
##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -449,155 +465,147 @@ public void onTrigger(final ProcessContext context,
final ProcessSession session
final CsvPreference csvPref = getPreference(context, flowFile);
final boolean header = context.getProperty(HEADER).asBoolean();
final ComponentLog logger = getLogger();
- final String schema =
context.getProperty(SCHEMA).evaluateAttributeExpressions(flowFile).getValue();
- final CellProcessor[] cellProcs = this.parseSchema(schema);
- final boolean isWholeFFValidation =
context.getProperty(VALIDATION_STRATEGY).getValue().equals(VALIDATE_WHOLE_FLOWFILE.getValue());
+ String schema =
context.getProperty(SCHEMA).evaluateAttributeExpressions(flowFile).getValue();
+ CellProcessor[] cellProcs = null;
+ if (schema != null) {
+ cellProcs = this.parseSchema(schema);
+ }
+ final String validationStrategy =
context.getProperty(VALIDATION_STRATEGY).getValue();
+ final boolean isWholeFFValidation =
!validationStrategy.equals(VALIDATE_LINES_INDIVIDUALLY.getValue());
final boolean includeAllViolations =
context.getProperty(INCLUDE_ALL_VIOLATIONS).asBoolean();
- final AtomicReference<Boolean> valid = new AtomicReference<>(true);
- final AtomicReference<Boolean> isFirstLineValid = new
AtomicReference<>(true);
- final AtomicReference<Boolean> isFirstLineInvalid = new
AtomicReference<>(true);
- final AtomicReference<Integer> okCount = new AtomicReference<>(0);
- final AtomicReference<Integer> totalCount = new AtomicReference<>(0);
- final AtomicReference<FlowFile> invalidFF = new
AtomicReference<>(null);
- final AtomicReference<FlowFile> validFF = new AtomicReference<>(null);
- final AtomicReference<String> validationError = new
AtomicReference<>(null);
+ boolean valid = true;
+ int okCount = 0;
+ int totalCount = 0;
+ FlowFile invalidFF = null;
+ FlowFile validFF = null;
+ String validationError = null;
+ final AtomicReference<Boolean> isFirstLineValid = new
AtomicReference<Boolean>(true);
+ final AtomicReference<Boolean> isFirstLineInvalid = new
AtomicReference<Boolean>(true);
if (!isWholeFFValidation) {
- invalidFF.set(session.create(flowFile));
- validFF.set(session.create(flowFile));
+ invalidFF = session.create(flowFile);
+ validFF = session.create(flowFile);
+ }
+
+ InputStream stream;
+ if (context.getProperty(CSV_SOURCE_ATTRIBUTE).isSet()) {
+ stream = new
ByteArrayInputStream(flowFile.getAttribute(context.getProperty(CSV_SOURCE_ATTRIBUTE).evaluateAttributeExpressions().getValue()).getBytes(StandardCharsets.UTF_8));
Review Comment:
The call to flowFile.getAttribute() can return null if the attribute does
not exist on the flowfile. If that happens, this line will cause a
NullPointerException.
##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestValidateCsv.java:
##########
Review Comment:
Please add a unit test to verify a 0 byte flowfile goes to "valid" when
validating flowfile content and a schema is defined. This explicitly defines
expected behavior when there is no CSV data to validate. This doesn't test any
changes you made in this PR but would really help fill out this unit test.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]