mosermw commented on code in PR #8362:
URL: https://github.com/apache/nifi/pull/8362#discussion_r1887515269


##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -231,7 +243,11 @@ protected Collection<ValidationResult> 
customValidate(ValidationContext context)
         }
         // If no Expression Language is present, try parsing the schema
         try {
-            this.parseSchema(schema);
+            if (schema != null) {
+                this.parseSchema(schema);
+            } else if (!headerProp.asBoolean()) {
+                throw(new Exception("Schema cannot be empty if header is 
false."));

Review Comment:
   To be clear we are talking about the Header property and not the existence 
of a CSV header, suggest:
   ```suggestion
                   throw(new Exception("Schema cannot be empty if Header 
property is false."));
   ```



##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestValidateCsv.java:
##########


Review Comment:
   Please add a unit test where the CSV Source Attribute is defined but is 
missing from the flowfile.  Also verify that flowfile content is not modified 
when validating an attribute.



##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -98,6 +97,7 @@ public class ValidateCsv extends AbstractProcessor {
             "Strlen", "StrMinMax", "StrNotNullOrEmpty", "StrRegEx", "Unique", 
"UniqueHashCode", "IsIncludedIn"
     );
 
+

Review Comment:
   This extra blank line doesn't appear to be necessary.



##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -204,6 +214,7 @@ public class ValidateCsv extends AbstractProcessor {
             .description("FlowFiles that are not valid according to the 
specified schema are routed to this relationship")
             .build();
 
+

Review Comment:
   This extra blank line doesn't appear to be necessary.



##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -449,155 +465,147 @@ public void onTrigger(final ProcessContext context, 
final ProcessSession session
         final CsvPreference csvPref = getPreference(context, flowFile);
         final boolean header = context.getProperty(HEADER).asBoolean();
         final ComponentLog logger = getLogger();
-        final String schema = 
context.getProperty(SCHEMA).evaluateAttributeExpressions(flowFile).getValue();
-        final CellProcessor[] cellProcs = this.parseSchema(schema);
-        final boolean isWholeFFValidation = 
context.getProperty(VALIDATION_STRATEGY).getValue().equals(VALIDATE_WHOLE_FLOWFILE.getValue());
+        String schema = 
context.getProperty(SCHEMA).evaluateAttributeExpressions(flowFile).getValue();
+        CellProcessor[] cellProcs = null;
+        if (schema != null) {
+            cellProcs = this.parseSchema(schema);
+        }
+        final String validationStrategy = 
context.getProperty(VALIDATION_STRATEGY).getValue();
+        final boolean isWholeFFValidation = 
!validationStrategy.equals(VALIDATE_LINES_INDIVIDUALLY.getValue());
         final boolean includeAllViolations = 
context.getProperty(INCLUDE_ALL_VIOLATIONS).asBoolean();
 
-        final AtomicReference<Boolean> valid = new AtomicReference<>(true);
-        final AtomicReference<Boolean> isFirstLineValid = new 
AtomicReference<>(true);
-        final AtomicReference<Boolean> isFirstLineInvalid = new 
AtomicReference<>(true);
-        final AtomicReference<Integer> okCount = new AtomicReference<>(0);
-        final AtomicReference<Integer> totalCount = new AtomicReference<>(0);
-        final AtomicReference<FlowFile> invalidFF = new 
AtomicReference<>(null);
-        final AtomicReference<FlowFile> validFF = new AtomicReference<>(null);
-        final AtomicReference<String> validationError = new 
AtomicReference<>(null);
+        boolean valid = true;
+        int okCount = 0;
+        int totalCount = 0;
+        FlowFile invalidFF = null;
+        FlowFile validFF = null;
+        String validationError = null;
+        final AtomicReference<Boolean> isFirstLineValid = new 
AtomicReference<Boolean>(true);
+        final AtomicReference<Boolean> isFirstLineInvalid = new 
AtomicReference<Boolean>(true);

Review Comment:
   A recent commit used diamond notation for the constructor and I suspect your 
rebase didn't pick this up.  This is especially important since new PMD rules 
were recently committed that checks for this.
   ```suggestion
           final AtomicReference<Boolean> isFirstLineValid = new 
AtomicReference<>(true);
           final AtomicReference<Boolean> isFirstLineInvalid = new 
AtomicReference<>(true);
   ```



##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestValidateCsv.java:
##########


Review Comment:
   Please add a unit test to verify flowfile goes to "valid" when validating an 
attribute and the attribute value is missing or empty and a schema is defined.  
This explicitly defines expected behavior when there is no CSV data to 
validate.  Also verify that flowfile content is not modified when validating an 
attribute.



##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########


Review Comment:
   I think the new feature to validate an attribute deserves a mention in the 
CapabilityDescription:
   "Validates the contents of FlowFiles or a FlowFile attribute value against a 
user-specified CSV schema."



##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateCsv.java:
##########
@@ -449,155 +465,147 @@ public void onTrigger(final ProcessContext context, 
final ProcessSession session
         final CsvPreference csvPref = getPreference(context, flowFile);
         final boolean header = context.getProperty(HEADER).asBoolean();
         final ComponentLog logger = getLogger();
-        final String schema = 
context.getProperty(SCHEMA).evaluateAttributeExpressions(flowFile).getValue();
-        final CellProcessor[] cellProcs = this.parseSchema(schema);
-        final boolean isWholeFFValidation = 
context.getProperty(VALIDATION_STRATEGY).getValue().equals(VALIDATE_WHOLE_FLOWFILE.getValue());
+        String schema = 
context.getProperty(SCHEMA).evaluateAttributeExpressions(flowFile).getValue();
+        CellProcessor[] cellProcs = null;
+        if (schema != null) {
+            cellProcs = this.parseSchema(schema);
+        }
+        final String validationStrategy = 
context.getProperty(VALIDATION_STRATEGY).getValue();
+        final boolean isWholeFFValidation = 
!validationStrategy.equals(VALIDATE_LINES_INDIVIDUALLY.getValue());
         final boolean includeAllViolations = 
context.getProperty(INCLUDE_ALL_VIOLATIONS).asBoolean();
 
-        final AtomicReference<Boolean> valid = new AtomicReference<>(true);
-        final AtomicReference<Boolean> isFirstLineValid = new 
AtomicReference<>(true);
-        final AtomicReference<Boolean> isFirstLineInvalid = new 
AtomicReference<>(true);
-        final AtomicReference<Integer> okCount = new AtomicReference<>(0);
-        final AtomicReference<Integer> totalCount = new AtomicReference<>(0);
-        final AtomicReference<FlowFile> invalidFF = new 
AtomicReference<>(null);
-        final AtomicReference<FlowFile> validFF = new AtomicReference<>(null);
-        final AtomicReference<String> validationError = new 
AtomicReference<>(null);
+        boolean valid = true;
+        int okCount = 0;
+        int totalCount = 0;
+        FlowFile invalidFF = null;
+        FlowFile validFF = null;
+        String validationError = null;
+        final AtomicReference<Boolean> isFirstLineValid = new 
AtomicReference<Boolean>(true);
+        final AtomicReference<Boolean> isFirstLineInvalid = new 
AtomicReference<Boolean>(true);
 
         if (!isWholeFFValidation) {
-            invalidFF.set(session.create(flowFile));
-            validFF.set(session.create(flowFile));
+            invalidFF = session.create(flowFile);
+            validFF = session.create(flowFile);
+        }
+
+        InputStream stream;
+        if (context.getProperty(CSV_SOURCE_ATTRIBUTE).isSet()) {
+            stream = new 
ByteArrayInputStream(flowFile.getAttribute(context.getProperty(CSV_SOURCE_ATTRIBUTE).evaluateAttributeExpressions().getValue()).getBytes(StandardCharsets.UTF_8));

Review Comment:
   The call to flowFile.getAttribute() can return null if the attribute does 
not exist on the flowfile.  If that happens, this line will cause a 
NullPointerException.



##########
nifi-extension-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestValidateCsv.java:
##########


Review Comment:
   Please add a unit test to verify a 0 byte flowfile goes to "valid" when 
validating flowfile content and a schema is defined.  This explicitly defines 
expected behavior when there is no CSV data to validate.  This doesn't test any 
changes you made in this PR but would really help fill out this unit test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to