Github user ijokarumawak commented on a diff in the pull request:

https://github.com/apache/nifi/pull/2425#discussion_r168057832

--- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateRecord.java ---
@@ -242,11 +279,12 @@ public void onTrigger(final ProcessContext context, final ProcessSession session
         final boolean allowExtraFields = context.getProperty(ALLOW_EXTRA_FIELDS).asBoolean();
         final boolean strictTypeChecking = context.getProperty(STRICT_TYPE_CHECKING).asBoolean();
-        RecordSetWriter validWriter = null;
-        RecordSetWriter invalidWriter = null;
         FlowFile validFlowFile = null;
         FlowFile invalidFlowFile = null;
+        final List<Record> validRecords = new LinkedList<>();
--- End diff --

Hi @martin-mucha

Let me try to answer your question. @markap14 will correct me if I'm wrong :)

The [ValidateRecord.completeFlowFile](https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ValidateRecord.java#L408) method calls `writer.finishRecordSet()`, which lets the writer write the end marker of the record set. Some record formats require this, e.g. a closing '}' in JSON or '</root>' in XML. The actual bytes of the record contents have already been written by that point.

I'd recommend reading [NiFi in depth, Content Repository](https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#content-repository) to see how NiFi reads and writes FlowFile content in a streaming manner, without loading the whole content onto the heap. If you're interested in reading code, [StandardProcessSession.write](https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/StandardProcessSession.java#L2433) might be a good starting point for how a FlowFile and its OutputStream are created.
That OutputStream is then passed to the RecordSetWriter implementation. For example, when a processor writes a record, the record is handed to a method of the configured RecordSetWriter, such as [WriteCSVResult.writeRecord](https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/csv/WriteCSVResult.java#L147). These RecordSetWriters do not hold content on the heap; they write records in a streaming manner. If we create a List and hold `Record` instances in it, we keep the content on the heap as `Record` instances, which can lead to an OOM.

Hope this helps!
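
To make the streaming idea concrete, here is a minimal sketch in plain Java. It does not use the NiFi API: `StreamingWriterSketch`, `SimpleJsonArrayWriter` and `writeAll` are names made up for illustration, the begin/finish markers are an assumed JSON-array framing, and the `ByteArrayOutputStream` only stands in for the stream the session would provide so that the example runs on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch, NOT the NiFi RecordSetWriter API.
class StreamingWriterSketch {

    // Simplified stand-in for a record set writer that emits a JSON array.
    static final class SimpleJsonArrayWriter {
        private final OutputStream out;
        private int recordCount = 0;

        SimpleJsonArrayWriter(OutputStream out) {
            this.out = out;
        }

        void beginRecordSet() throws IOException {
            out.write('[');
        }

        // Each record's bytes go to the stream immediately; the writer keeps
        // only a counter, never the records themselves.
        void write(String jsonRecord) throws IOException {
            if (recordCount++ > 0) {
                out.write(',');
            }
            out.write(jsonRecord.getBytes(StandardCharsets.UTF_8));
        }

        // Only the end marker is written here; record contents are already out.
        int finishRecordSet() throws IOException {
            out.write(']');
            out.flush();
            return recordCount;
        }
    }

    // Consumes records one at a time, so only the current record is referenced.
    static String writeAll(Iterator<String> records) {
        // In-memory sink used only to keep the sketch runnable (assumption);
        // a real processor would write to the stream backing the FlowFile.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            SimpleJsonArrayWriter writer = new SimpleJsonArrayWriter(sink);
            writer.beginRecordSet();
            while (records.hasNext()) {
                writer.write(records.next());
            }
            writer.finishRecordSet();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Iterator<String> records = Arrays.asList("{\"id\":1}", "{\"id\":2}").iterator();
        System.out.println(writeAll(records)); // prints [{"id":1},{"id":2}]
    }
}
```

The contrast with the `List<Record>` in the diff is that the list grows with the number of records in the FlowFile, while the writer above holds only the stream and a counter, no matter how many records pass through it.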
---