[ 
https://issues.apache.org/jira/browse/NIFI-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Thomsen resolved NIFI-4109.
--------------------------------
    Resolution: Won't Do

Not sure we need this now.

> Implement an InferRecordSchema processor
> ----------------------------------------
>
>                 Key: NIFI-4109
>                 URL: https://issues.apache.org/jira/browse/NIFI-4109
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Matt Burgess
>            Priority: Major
>
> Currently, a record schema (for use in record-aware processors) must be 
> provided by an attribute or a Schema Registry, or be embedded in the flow 
> file, and thus must be determined ahead of time. For formats that do not 
> carry a schema (e.g., CSV, JSON) and for flows whose files' schemas vary or 
> are otherwise not known a priori, it would be helpful to have a processor 
> that can infer the schema from the content. It could have any or all of the 
> following features:
> - Record-awareness: The existing InferAvroSchema can be used for CSV and JSON 
> with non-record-aware processors/flows, although it does not currently 
> support Avro logical types such as timestamp (see NIFI-3000). The benefit of 
> record-awareness is that better inference can be made by inspecting each 
> record in a flow file.
> - Type inference: Should include the primitive types (numeric, string) as 
> well as the logical types supported by Avro schemas (time, date, timestamp, 
> etc.)
> - Generate Schema in attribute: Recommend "avro.schema" be used as the output 
> attribute, as this is the default for most RecordWriters.
> - Publish Schema to Registry: This is an advanced feature that could be split 
> out into its own Jira due to scope concerns.
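
The record-aware type inference described in the list above could be sketched roughly as follows. This is only an illustration in Python, not NiFi code: the names infer_type, resolve, and infer_schema are hypothetical, the date and timestamp string formats are assumed, and the widening rules (long plus double becomes double, any other conflict falls back to string, missing or null values make the field nullable) are one possible choice rather than anything the issue specifies.

```python
import json
from datetime import datetime

# Avro logical types are annotations on primitive types (date -> int,
# timestamp-millis -> long), per the Avro specification.
LOGICAL = {
    "date": {"type": "int", "logicalType": "date"},
    "timestamp-millis": {"type": "long", "logicalType": "timestamp-millis"},
}

def infer_type(value):
    """Return a type label for a single value (hypothetical helper)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must precede int: bool is an int subclass
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        # Assumed string formats; a real processor would make these configurable.
        for fmt, label in (("%Y-%m-%d", "date"),
                           ("%Y-%m-%dT%H:%M:%S", "timestamp-millis")):
            try:
                datetime.strptime(value, fmt)
                return label
            except ValueError:
                pass
    return "string"  # plain strings and anything unrecognized

def resolve(types):
    """Collapse the set of labels seen for one field into an Avro type."""
    concrete = types - {"null"}
    if len(concrete) == 1:
        base = next(iter(concrete))
    elif concrete == {"long", "double"}:
        base = "double"  # widen integers to floating point
    else:
        base = "string"  # conflicting or all-null: fall back to string
    avro = LOGICAL.get(base, base)
    return ["null", avro] if "null" in types else avro

def infer_schema(records, name="inferred"):
    """Inspect every record and build an Avro-style record schema."""
    records = list(records)
    observed = {}
    for rec in records:  # first pass: collect field names in first-seen order
        for field in rec:
            observed.setdefault(field, set())
    for rec in records:  # second pass: a missing field counts as null
        for field, types in observed.items():
            types.add(infer_type(rec.get(field)))
    fields = [{"name": f, "type": resolve(t)} for f, t in observed.items()]
    return {"type": "record", "name": name, "fields": fields}

if __name__ == "__main__":
    sample = [
        {"id": 1, "price": 2, "day": "2024-01-05", "note": None},
        {"id": 2, "price": 2.5, "day": "2024-01-06", "note": "x"},
    ]
    # The JSON text is what would be written to the "avro.schema" attribute.
    print(json.dumps(infer_schema(sample), indent=2))
```

Inspecting every record rather than only the first is what lets the sketch widen "price" to double and mark "note" as nullable, which is the benefit the Record-awareness item points to.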



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
