[
https://issues.apache.org/jira/browse/NIFI-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mike Thomsen resolved NIFI-4109.
--------------------------------
Resolution: Won't Do
Not sure we need this now.
> Implement an InferRecordSchema processor
> ----------------------------------------
>
> Key: NIFI-4109
> URL: https://issues.apache.org/jira/browse/NIFI-4109
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Extensions
> Reporter: Matt Burgess
> Priority: Major
>
> Currently a record schema (for use in record-aware processors) must be
> provided by an attribute, a Schema Registry, or embedded in the flow file,
> and thus determined ahead of time. For formats that do not carry a schema
> (e.g. CSV, JSON) and for flows whose files' schemas vary or are otherwise
> not known a priori, it would be helpful to have a processor that can
> infer the schema from the content. It could have any/all of the following
> features:
> - Record-awareness: The existing InferAvroSchema can be used for CSV and JSON
> with non-record-aware processors/flows, although it does not currently
> support Avro logical types such as timestamp (see NIFI-3000). The benefit of
> record-awareness is that better inference can be made by inspecting each record
> in a flowfile.
> - Type inference: Should include the primitive types (numeric, string) as
> well as more complex types supported by Avro schemas (time, date, timestamp,
> etc.)
> - Generate Schema in attribute: Recommend "avro.schema" be used as the output
> attribute, as this is the default for most RecordWriters.
> - Publish Schema to Registry: This is an advanced feature that could be split
> out into its own Jira due to scope concerns.
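
For readers following the issue, the record-aware type inference described in the list above could look roughly like the sketch below. It is not NiFi code and not the proposed processor: the function names (infer_type, merge_types, infer_schema), the single ISO-like timestamp pattern, and the widening rules (long plus double becomes double, any other conflict falls back to string, a missing or null value makes the field nullable) are all assumptions made for illustration.

```python
import json
from datetime import datetime

# Hypothetical helper names; none of these exist in NiFi.

def infer_type(value):
    """Map a single value to an Avro type name (or a logical type name)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            # Assumption: only this one timestamp pattern is recognised.
            datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
            return "timestamp-millis"
        except ValueError:
            return "string"
    return "string"

def merge_types(types):
    """Collapse the set of types seen for one field into (type, nullable)."""
    nullable = "null" in types
    concrete = types - {"null"}
    if len(concrete) == 1:
        chosen = next(iter(concrete))
    elif concrete == {"long", "double"}:
        chosen = "double"  # widen integers to floating point
    else:
        chosen = "string"  # all-null or conflicting types: fall back to string
    return chosen, nullable

def infer_schema(records, name="inferred"):
    """Inspect every record and build an Avro record schema as a dict."""
    seen = {}  # field name -> set of observed type names, in first-seen order
    for record in records:
        for key, value in record.items():
            seen.setdefault(key, set()).add(infer_type(value))
    for record in records:
        for key in seen:
            if key not in record:
                seen[key].add("null")  # field absent in some record: nullable
    fields = []
    for key, types in seen.items():
        chosen, nullable = merge_types(types)
        if chosen == "timestamp-millis":
            avro_type = {"type": "long", "logicalType": "timestamp-millis"}
        else:
            avro_type = chosen
        if nullable:
            avro_type = ["null", avro_type]
        fields.append({"name": key, "type": avro_type})
    return {"type": "record", "name": name, "fields": fields}

records = [
    {"id": 1, "price": 2, "ts": "2017-06-22T10:00:00"},
    {"id": 2, "price": 2.5, "ts": "2017-06-23T11:30:00", "note": "x"},
]
schema = infer_schema(records)
avro_schema_attribute = json.dumps(schema)  # text for an "avro.schema" attribute
```

Looking at every record rather than only the first is what lets the sketch widen "price" to double and mark "note" as nullable. A real implementation would also have to convert timestamp strings to epoch milliseconds when writing, which this sketch does not attempt.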
--
This message was sent by Atlassian Jira
(v8.20.1#820001)