[
https://issues.apache.org/jira/browse/NIFI-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210360#comment-15210360
]
Dmitry Goldenberg commented on NIFI-1280:
-----------------------------------------
Would it make sense to have a SplitCSV processor, with the filter being one of
the options? I think this would be in line with existing processors such
as SplitAvro, SplitJson, SplitText, and SplitXml. It seems easiest to filter out
unwanted columns right when the CSV file is being read. To that end, it might
even be worth considering a GetCSV type of processor, but I'd think
SplitCSV looks like a good start here.
> Create FilterCSVColumns Processor
> ---------------------------------
>
> Key: NIFI-1280
> URL: https://issues.apache.org/jira/browse/NIFI-1280
> Project: Apache NiFi
> Issue Type: Task
> Components: Extensions
> Reporter: Mark Payne
>
> We should have a Processor that allows users to easily filter out specific
> columns from CSV data. For instance, a user would configure two different
> properties: "Columns of Interest" (a comma-separated list of column indexes)
> and "Filtering Strategy" (Keep Only These Columns, Remove Only These Columns).
> We can do this today with ReplaceText, but it is far more difficult than it
> would be with this Processor, as the user has to use Regular Expressions,
> etc. with ReplaceText.
> Eventually a Custom UI could even be built that allows a user to upload a
> sample CSV and choose columns from it by dragging and selecting the desired
> columns, similar to the way that Excel works when importing CSV. That
> would certainly be a larger undertaking and would not need to be done for an
> initial implementation.
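
To make the two properties in the description concrete, here is a minimal sketch in Python of the filtering behaviour. It is not NiFi code and not a proposed implementation: the function name `filter_csv_columns`, the `"keep"`/`"remove"` strategy values, and the choice of zero-based column indexes are all assumptions made for illustration, since the issue does not specify them.

```python
import csv
import io


def filter_csv_columns(text, columns, strategy="keep"):
    """Filter the columns of CSV text by index.

    columns:  iterable of zero-based column indexes (an assumption;
              the issue does not say whether indexes start at 0 or 1).
    strategy: "keep" retains only the listed columns,
              "remove" drops the listed columns and retains the rest.
    """
    if strategy not in ("keep", "remove"):
        raise ValueError("strategy must be 'keep' or 'remove'")

    selected = set(columns)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")

    # Use a real CSV parser so that quoted fields containing commas
    # are treated as a single column.
    for row in csv.reader(io.StringIO(text)):
        if strategy == "keep":
            kept = [v for i, v in enumerate(row) if i in selected]
        else:
            kept = [v for i, v in enumerate(row) if i not in selected]
        writer.writerow(kept)

    return out.getvalue()


sample = 'id,name,"city, state"\n1,Ann,"Austin, TX"\n'
kept_only = filter_csv_columns(sample, [0, 2], "keep")
removed = filter_csv_columns(sample, [0, 2], "remove")
```

Parsing with a CSV reader rather than a regular expression is what keeps a quoted value such as "Austin, TX" in one column, which is the main difficulty with the ReplaceText approach mentioned in the description.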