[
https://issues.apache.org/jira/browse/NIFI-4181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085929#comment-16085929
]
ASF GitHub Bot commented on NIFI-4181:
--------------------------------------
Github user Wesley-Lawrence commented on the issue:
https://github.com/apache/nifi/pull/2003
@markap14 Good points. I actually considered adding a schema registry at
first, but thought it might be too much for just making CSV in/out easier.
I think using schemas falls into two categories. Case A schemas are ones
that someone uses all the time. Having a schema registry is great then, because
it's consistently defined in a single place. Case B schemas are one-off
schemas. Input or output is in some format a person doesn't typically use, and
someone is just converting to-or-from a Case A often-used schema. In this case,
the "Schema Text" property becomes really useful. Rather than cluttering your
registry, you can just define the Case B one-off in a processor/service, and
never think about it again.
To your point, I've only solved Case B, for CSV in/out. By adding an
"ExplicitFieldSchemaRegistry" or "ColumnSchemaRegistry" (I think whatever this
gets named, it's going to be ugly =P) we can tackle Case A for any type.
I think to do this completely properly, we should solve this for Case A and
B, for any type.
So I personally think we should add a new schema registry for Case A, but I
think we could also add some "Schema Format" property (defaulting to Avro) to
`SchemaRegistryService` that informs processors how to interpret "Schema Text".
That way, it's also easy to define Case B one-off schemas of any type.
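To make the "Schema Format" idea concrete, here is a rough sketch in Python rather than NiFi's Java. Everything in it is hypothetical: the function names, the "columns" format value, and the choice to treat every column as a nullable string are assumptions for illustration, not part of the patch or of any existing NiFi API. It only shows how one "Schema Text" value could be interpreted differently depending on a format property.

```python
import json


def columns_to_avro_schema(column_text, record_name="csv_record"):
    """Turn a comma-separated column list into an Avro-style record schema.

    Assumption: with no type information available, every column is
    modelled as a nullable string.
    """
    names = [c.strip() for c in column_text.split(",") if c.strip()]
    if not names:
        raise ValueError("column list is empty")
    return {
        "type": "record",
        "name": record_name,
        "fields": [{"name": n, "type": ["null", "string"]} for n in names],
    }


def interpret_schema_text(schema_text, schema_format="avro"):
    """Dispatch on a hypothetical "Schema Format" property (default: avro)."""
    if schema_format == "avro":
        # Schema Text already holds an Avro schema as JSON.
        return json.loads(schema_text)
    if schema_format == "columns":
        # Schema Text holds an explicit list of column names.
        return columns_to_avro_schema(schema_text)
    raise ValueError("unsupported schema format: " + schema_format)


if __name__ == "__main__":
    # A Case B one-off: just list the columns instead of writing Avro JSON.
    print(json.dumps(interpret_schema_text("id, name, email", "columns"), indent=2))
```

With something like this, a one-off CSV schema is a single line of text, while anything already written as Avro keeps working through the default format.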
> CSVReader and CSVRecordSetWriter services should be able to work given an
> explicit list of columns.
> ---------------------------------------------------------------------------------------------------
>
> Key: NIFI-4181
> URL: https://issues.apache.org/jira/browse/NIFI-4181
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Wesley L Lawrence
> Priority: Minor
> Attachments: NIFI-4181.patch
>
>
> Currently, to read or write a CSV file with *Record processors, the CSVReader
> and CSVRecordSetWriters need to be given an Avro schema. For CSV, a simple
> column definition can also work.