Just adding to what Joey said, there was a previous discussion about
something like this:

http://apache-nifi-developer-list.39713.n7.nabble.com/Looking-for-feedback-on-my-WIP-Design-td13097.html

As for the Avro schema, here is an example of how to access the schema
from a stream callback:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-avro-bundle/nifi-avro-processors/src/main/java/org/apache/nifi/processors/avro/ExtractAvroMetadata.java#L174-L177
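
For convenience, the gist of that approach is roughly the following. This
is a minimal sketch, not the exact code at the link: the class and method
names (AvroSchemaPeek, readSchemaJson) are made up for illustration, and
it assumes the content is an Avro data file with the schema embedded in
its header, which is what Joey describes below.

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

// Hypothetical helper for illustration; not part of NiFi or Avro.
public class AvroSchemaPeek {

    // Returns the writer schema embedded in an Avro data file, as JSON.
    public static String readSchemaJson(InputStream in) throws IOException {
        // DataFileStream reads the file header when it is constructed, so
        // the schema is available before any records are consumed.
        // Closing the DataFileStream also closes the wrapped InputStream.
        try (DataFileStream<GenericRecord> stream =
                 new DataFileStream<>(in, new GenericDatumReader<GenericRecord>())) {
            Schema schema = stream.getSchema();
            return schema.toString();
        }
    }
}
```

In a processor you would presumably call this from the InputStreamCallback
passed to session.read(flowFile, ...), and then put the returned string
wherever you need it, for example into a flow file attribute.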



On Thu, Dec 29, 2016 at 10:35 AM, Joey Frazee <[email protected]>
wrote:

> Michael, I think you’re right to call this out. I frequently find myself
> stringing together flows with ExecuteScripts (which you should be able to
> use to pull a schema out by creating an Avro DataFileStream from the
> InputStream and then calling getSchema().toString()) or conversions to/from
> JSON and Avro to handle all the scenarios.
>
> I think the heart of the solution shouldn’t just be the addition of an
> output attribute including the schema, but something generic like you
> mention in your (4), especially considering that there are at least 7
> issues [1-7] open for variations on this. Instead of just a converter
> processor, though, it’d probably be smart to make it some kind of
> controller service so it can be easily exposed to other processors like
> ExecuteSQL and QueryDatabaseTable.
>
> -joey
>
> 1. https://issues.apache.org/jira/browse/NIFI-2743
> 2. https://issues.apache.org/jira/browse/NIFI-1623
> 3. https://issues.apache.org/jira/browse/NIFI-1623
> 4. https://issues.apache.org/jira/browse/NIFI-1702
> 5. https://issues.apache.org/jira/browse/NIFI-1704
> 6. https://issues.apache.org/jira/browse/NIFI-1398
> 7. https://issues.apache.org/jira/browse/NIFI-2725
>
> > On Dec 28, 2016, at 3:18 PM, Knapp, Michael <[email protected]> wrote:
> >
> > NiFi Devs,
> >
> > I noticed you have two processors (ExecuteSQL and QueryDatabaseTable)
> > that perform SQL select statements and put the results into a flow file.
> > While I am not sure what their difference is, I did notice that they both
> > produce Avro, and the schema is inferred from the result set. While the
> > schema is included in the output file’s contents, I am not aware of any
> > easy way to get it from a *StreamCallback. So I am wondering,
> >
> >
> > 1. Could we update the processors to support multiple output formats?
> > I think CSV should definitely be supported. Parquet might also be
> > useful for me. JSON is an option, but since you already have a
> > ConvertAvroToJSON processor that is not a big deal for me.
> >
> > 2. Could we update the processors to include the schema as one of the
> > output flow file attributes?
> >
> > 3. Is there any utility to get an Avro schema from the input stream
> > callback?
> >
> > 4. Has anybody thought about writing a processor to convert Avro to
> > CSV? Or even something more generic than that, a generic format
> > conversion processor? It could support CSV, JSON, Avro, Parquet, XML,
> > and possibly others.
> >
> > Please let me know,
> >
> > Michael Knapp
> > Capital One
