[ 
https://issues.apache.org/jira/browse/NIFI-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085929#comment-16085929
 ] 

ASF GitHub Bot commented on NIFI-4181:
--------------------------------------

Github user Wesley-Lawrence commented on the issue:

    https://github.com/apache/nifi/pull/2003
  
    @markap14 Good points. I actually considered adding a schema registry at 
first, but thought it might be too much for just making CSV in/out easier.
    
    I think using schemas falls into two categories. Case A schemas are ones 
that someone uses all the time. A schema registry is great for these, because 
each schema is consistently defined in a single place. Case B schemas are 
one-off schemas: the input or output is in some format a person doesn't 
typically use, and they are just converting to or from an often-used Case A 
schema. In this case, the "Schema Text" property becomes really useful. Rather 
than cluttering your registry, you can just define the Case B one-off in a 
processor/service, and never think about it again.
    
    To your point, I've only solved Case B, for CSV in/out. By adding an 
"ExplicitFieldSchemaRegistry" or "ColumnSchemaRegistry" (I think whatever this 
gets named, it's going to be ugly =P) we can tackle Case A for any type.
    
    To do this properly, I think we should solve both Case A and Case B, for 
any type.
    
    So I personally think we should add a new schema registry for Case A, but I 
think we could also add some "Schema Format" property (defaulting to Avro) to 
`SchemaRegistryService` that informs processors how to interpret "Schema Text". 
That way, it's also easy to define Case B one-off schemas of any type.
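
    A rough sketch of the "Schema Format" idea, in Python for brevity rather 
than NiFi's Java. The function name `field_names` and the format values 
"avro" and "columns" are made up for illustration and are not existing NiFi 
properties or API; the point is only that one text property can be 
interpreted in different ways depending on a format selector that defaults 
to Avro.

```python
import json


def field_names(schema_text, schema_format="avro"):
    """Return the field names described by schema_text.

    schema_format is a hypothetical selector: "avro" treats the text as an
    Avro record schema in JSON, "columns" treats it as a comma-separated
    list of column names.
    """
    if schema_format == "avro":
        schema = json.loads(schema_text)
        return [field["name"] for field in schema["fields"]]
    if schema_format == "columns":
        # Ignore surrounding whitespace and empty entries.
        return [c.strip() for c in schema_text.split(",") if c.strip()]
    raise ValueError("unsupported schema format: " + schema_format)
```

    With this shape, a one-off Case B schema can be written as plain column 
names, while existing Avro "Schema Text" values keep working unchanged.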


> CSVReader and CSVRecordSetWriter services should be able to work given an 
> explicit list of columns.
> ---------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-4181
>                 URL: https://issues.apache.org/jira/browse/NIFI-4181
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Wesley L Lawrence
>            Priority: Minor
>         Attachments: NIFI-4181.patch
>
>
> Currently, to read or write a CSV file with *Record processors, the CSVReader 
> and CSVRecordSetWriter services need to be given an Avro schema. For CSV, a 
> simple column definition can also work.
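
To illustrate why a simple column definition can be enough for CSV, here is a 
minimal sketch (Python, not the attached patch) that expands a comma-separated 
column list into an equivalent Avro record schema. The helper name 
`columns_to_avro_schema`, the default record name, and the choice to type 
every column as a string are assumptions made for this example.

```python
import json


def columns_to_avro_schema(columns, record_name="csv_record"):
    """Build an Avro record schema (as a JSON string) from a column list.

    Every column is typed as "string", which is an assumption of this
    sketch; a real implementation might allow per-column types.
    """
    fields = [
        {"name": name.strip(), "type": "string"}
        for name in columns.split(",")
        if name.strip()
    ]
    schema = {"type": "record", "name": record_name, "fields": fields}
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    print(columns_to_avro_schema("id, first_name, last_name"))
```

The column list carries the same information as the generated schema for the 
all-strings case, which is the situation the issue describes.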



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
