[ https://issues.apache.org/jira/browse/NIFI-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16292927#comment-16292927 ]
ASF GitHub Bot commented on NIFI-4496:
--------------------------------------

Github user mattyb149 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2245#discussion_r157261623

    --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/csv/JacksonCSVRecordReader.java ---
    @@ -136,7 +134,7 @@ public Record nextRecord(final boolean coerceTypes, final boolean dropUnknownFie
                 // If the first record is the header names (and we're using them), store those off for use in creating the value map on the next iterations
                 if (rawFieldNames == null) {
    -                if (hasHeader && ignoreHeader) {
    +                if (!hasHeader || ignoreHeader) {
                         rawFieldNames = schema.getFieldNames();
                     } else {
                         rawFieldNames = Arrays.stream(csvRecord).map((a) -> {
    --- End diff --

    Who knows lol. I'll try asList() instead


> Improve performance of CSVReader
> --------------------------------
>
>                 Key: NIFI-4496
>                 URL: https://issues.apache.org/jira/browse/NIFI-4496
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>
> During some throughput testing, it was noted that the CSVReader was not as
> fast as desired, processing less than 50k records per second. A look at [this
> benchmark|https://github.com/uniVocity/csv-parsers-comparison] implies that
> the Apache Commons CSV parser (used by CSVReader) is quite slow compared to
> others.
> From that benchmark it appears that CSVReader could be enhanced by using a
> different CSV parser under the hood. Perhaps Jackson is the best choice, as
> it is fast when values are quoted, and is a mature and maintained codebase.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)