Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2425#discussion_r164487280
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateRecord.java ---
    @@ -242,11 +279,12 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
             final boolean allowExtraFields = context.getProperty(ALLOW_EXTRA_FIELDS).asBoolean();
             final boolean strictTypeChecking = context.getProperty(STRICT_TYPE_CHECKING).asBoolean();
     
    -        RecordSetWriter validWriter = null;
    -        RecordSetWriter invalidWriter = null;
             FlowFile validFlowFile = null;
             FlowFile invalidFlowFile = null;
     
    +        final List<Record> validRecords = new LinkedList<>();
    --- End diff --
    
    We need to be sure that we are not storing collections of records in heap but rather writing them out in a streaming fashion. One of the goals of the record API is to allow arbitrarily large FlowFiles that are made up of many small records. If we have a 1 GB CSV file, for instance, buffering every record into a List like this would result in OutOfMemoryErrors very quickly.
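    To illustrate the streaming pattern the reviewer is asking for, here is a minimal, self-contained Java sketch. It does not use NiFi's actual `RecordSetWriter` API; the class and method names (`StreamingValidate`, `validateStreaming`, `isValid`) and the line-per-record format are hypothetical stand-ins chosen to show the principle: each record is read, validated, and written immediately, so heap usage stays constant no matter how large the input is.

    ```java
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.Writer;

    // Hypothetical illustration of streaming validation (NOT NiFi's real API):
    // instead of accumulating records in a List, each record is written to the
    // appropriate output as soon as it is validated.
    public class StreamingValidate {

        // Reads records (one per line) from 'in', routing each to 'validOut' or
        // 'invalidOut' immediately. Only one record is held in memory at a time.
        // Returns the number of valid records written.
        public static int validateStreaming(BufferedReader in, Writer validOut,
                                            Writer invalidOut) throws IOException {
            int validCount = 0;
            String record;
            while ((record = in.readLine()) != null) { // one record at a time
                if (isValid(record)) {
                    validOut.write(record);
                    validOut.write('\n');
                    validCount++;
                } else {
                    invalidOut.write(record);
                    invalidOut.write('\n');
                }
            }
            return validCount;
        }

        // Toy validation rule for the sketch: a record is "valid" if it has
        // exactly three comma-separated fields.
        static boolean isValid(String record) {
            return record.split(",", -1).length == 3;
        }
    }
    ```

    The key design point is that the writers are opened before iteration begins and each record flows straight through to its destination, mirroring how the processor should hand records to its valid/invalid writers instead of collecting them in `validRecords`.
    
    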

