[ 
https://issues.apache.org/jira/browse/NIFI-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284959#comment-15284959
 ] 

Toivo Adams commented on NIFI-1280:
-----------------------------------

Julian,

“Does Nifi have parsers for more of the basic file types?”

NiFi itself (the core framework) works with FlowFiles, which are content plus 
attributes. The content is just a sequence of bytes.
NiFi usually doesn't care how the bytes are used, so it's up to the processing 
component (called a Processor) to interpret them.
Because you can always write a new processor, the number of formats that can 
be handled is effectively unlimited.
The standard processors (included in the NiFi distribution) usually support 
CSV, JSON, Avro and a few other types. Often there are different processors 
for different types.
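
To illustrate the point that the content is only bytes until a processor 
interprets it, here is a minimal sketch in plain Python. It does not use the 
NiFi API: FlowFileSketch, read_as_csv and read_as_json are hypothetical names 
made up for this example, and the "mime.type" attribute is included only for 
illustration.

```python
import csv
import io
import json
from dataclasses import dataclass, field


@dataclass
class FlowFileSketch:
    """Hypothetical stand-in for a FlowFile: opaque bytes plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def read_as_csv(flow_file):
    """A 'processor' that chooses to interpret the bytes as CSV rows."""
    text = flow_file.content.decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


def read_as_json(flow_file):
    """A different 'processor' that interprets the bytes as a JSON document."""
    return json.loads(flow_file.content.decode("utf-8"))


# The container is the same in both cases; only the interpretation differs.
csv_file = FlowFileSketch(b"id,name\n1,Ann\n", {"mime.type": "text/csv"})
json_file = FlowFileSketch(b'{"id": 1, "name": "Ann"}',
                           {"mime.type": "application/json"})

csv_rows = read_as_csv(csv_file)
json_doc = read_as_json(json_file)
```

The container never inspects its own content. Supporting a new format means 
adding another reader function, which mirrors writing a new processor.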

Personally, I think the new NiFi SQLTransform processor should support only 
the most common formats: CSV, JSON, Avro, maybe XML and maybe a few others. 
Other formats should be handled by converter processors, which would convert 
exotic formats to the common ones.

Thanks
Toivo


> Create FilterCSVColumns Processor
> ---------------------------------
>
>                 Key: NIFI-1280
>                 URL: https://issues.apache.org/jira/browse/NIFI-1280
>             Project: Apache NiFi
>          Issue Type: Task
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Toivo Adams
>
> We should have a Processor that allows users to easily filter out specific 
> columns from CSV data. For instance, a user would configure two different 
> properties: "Columns of Interest" (a comma-separated list of column indexes) 
> and "Filtering Strategy" (Keep Only These Columns, Remove Only These Columns).
> We can do this today with ReplaceText, but it is far more difficult than it 
> would be with this Processor, as the user has to use Regular Expressions, 
> etc. with ReplaceText.
> Eventually a Custom UI could even be built that allows a user to upload a 
> Sample CSV and choose which columns from there, similar to the way that Excel 
> works when importing CSV by dragging and selecting the desired columns? That 
> would certainly be a larger undertaking and would not need to be done for an 
> initial implementation.
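
The column filtering described in the issue could look roughly like the 
sketch below. It is plain Python rather than a NiFi processor, and it rests on 
assumptions: filter_csv_columns is a hypothetical function name, column 
indexes are taken to be zero-based, and the keep flag stands in for the two 
"Filtering Strategy" values.

```python
import csv
import io


def filter_csv_columns(data, columns_of_interest, keep=True):
    """Keep (or remove) the listed columns from CSV text.

    columns_of_interest: comma-separated column indexes, assumed zero-based.
    keep=True  -> "Keep Only These Columns"
    keep=False -> "Remove Only These Columns"
    """
    wanted = {int(index) for index in columns_of_interest.split(",")}
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(data)):
        # A column survives when its membership in 'wanted' matches the strategy.
        writer.writerow(
            [value for i, value in enumerate(row) if (i in wanted) == keep]
        )
    return out.getvalue()


sample = "a,b,c\n1,2,3\n"
kept = filter_csv_columns(sample, "0,2")
removed = filter_csv_columns(sample, "0,2", keep=False)
```

Using a CSV parser instead of regular expressions means quoted fields that 
contain commas are handled correctly, which is the main difficulty with the 
ReplaceText approach mentioned in the issue.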



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
