+1 for FlattenRecord as well. In the meantime you can use ExecuteScript or InvokeScriptedProcessor; I have a Groovy script (albeit for a different product) that does the flattening [1].
Regards,
Matt

[1] http://funpdi.blogspot.com/2014/10/flatten-json-to-key-value-pairs-in-pdi.html

On Fri, Sep 15, 2017 at 9:33 AM, Kevin Doran <[email protected]> wrote:

> +1 for adding a FlattenRecord processor. I can think of a few scenarios in
> which it would be quite useful, and it would be convenient if it could be
> accomplished without JOLT.
>
> Thanks,
> Kevin
>
> On 9/15/17, 09:16, "Nicholas Hughes" <[email protected] on behalf of
> [email protected]> wrote:
>
> Mark,
>
> I'm definitely for making the processor as generic as possible. I don't
> mind chaining together a few simple processors to get a job done (such as
> convert JSON to Avro > infer schema > flatten records)... I just don't want
> steps get super complex... and the Jolt Transform processor does seem very
> powerful and very complex.
>
> If there's some support for a "FlattenRecord" processor, I can submit the
> Jira containing the meat of this thread.
>
> -Nick
>
>
> On Fri, Sep 15, 2017 at 9:01 AM, Mark Payne <[email protected]> wrote:
>
> > Nick,
> >
> > I do believe that there's a way to do what you're asking with Jolt,
> > without knowing any kind of schema.
> > That said, Jolt can get complex pretty quickly and I don't know it well
> > :) Personally, I have no problem with having a FlattenRecord processor.
> > I guess the question here, though, is are you using Record-oriented
> > processors, or are you using JSON-specific processors?
> >
> > Personally, I'd like to see a FlattenRecord processor, rather than
> > FlattenJSON, because that would allow the transformation to apply to
> > Avro as well (and as soon as we get an XML reader built, XML also).
> > However, the Record-oriented processors would expect that a schema be
> > given (though it could also be inferred using another existing
> > processor).
> >
> > -Mark
> >
> >
> > > On Sep 15, 2017, at 7:43 AM, Nicholas Hughes <[email protected]> wrote:
> > >
> > > Is there an easy way to "flatten" arbitrary JSON within NiFi?
> > >
> > > For input data like that shown below from Yahoo [1]
> > >
> > > {
> > >   "query": {
> > >     "count": 1,
> > >     "created": "2017-09-15T11:20:26Z",
> > >     "lang": "en-US",
> > >     "results": {
> > >       "channel": {
> > >         "item": {
> > >           "condition": {
> > >             "code": "33",
> > >             "date": "Fri, 15 Sep 2017 06:00 AM EDT",
> > >             "temp": "63",
> > >             "text": "Mostly Clear"
> > >           }
> > >         }
> > >       }
> > >     }
> > >   }
> > > }
> > >
> > >
> > > ...I'd like to end up with output something like this:
> > >
> > > {
> > >   "query.count": 1,
> > >   "query.created": "2017-09-15T11:20:26Z",
> > >   "query.lang": "en-US",
> > >   "query.results.channel.item.condition.code": "33",
> > >   "query.results.channel.item.condition.date": "Fri, 15 Sep 2017 06:00 AM EDT",
> > >   "query.results.channel.item.condition.temp": "63",
> > >   "query.results.channel.item.condition.text": "Mostly Clear"
> > > }
> > >
> > >
> > > I checked out the JoltTransformJSON processor and some examples, such as
> > > the nested data to "prefix soup" demo [2], but it seems as though I need
> > > to enter information about the schema for the incoming data in order to
> > > transform it. Ideally, I'd like to have a processor "just figure it out"
> > > without explicit entry of a schema.
> > >
> > > Is there any way to accomplish this in a generic way with
> > > JoltTransformJSON (or another native processor)?
> > >
> > > If not, would a ticket requesting a "Field Flattener" processor much like
> > > the one included in StreamSets Data Collector [3] be worthwhile?
> > >
> > > Thanks in advance!
> > >
> > > -Nick
> > >
> > >
> > > [1]
> > > https://query.yahooapis.com/v1/public/yql?q=select%20item.condition%20from%20weather.forecast%20where%20woeid%20%3D%202383558&format=json&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys
> > >
> > > [2] http://jolt-demo.appspot.com/#bucketToPrefixSoup
> > >
> > > [3]
> > > https://github.com/streamsets/datacollector/tree/master/basic-lib/src/main/java/com/streamsets/pipeline/stage/processor/fieldflattener
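
For reference, the schema-free flattening asked about in the original question can be sketched in a few lines of Python. This is only a standalone illustration of the idea, not a NiFi processor and not the Groovy script linked above. The function name `flatten`, the `sep` parameter, and the choice to key list elements by their index are all assumptions made for the example; the thread itself does not say how arrays should be handled.

```python
import json


def flatten(value, prefix="", sep="."):
    """Flatten nested dicts into one dict whose keys are dotted paths.

    Assumptions (not specified in the thread): list elements are keyed by
    their position, and empty dicts or lists produce no output keys.
    """
    flat = {}
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = ((str(i), v) for i, v in enumerate(value))
    else:
        # Leaf value: store it under the path built so far.
        flat[prefix] = value
        return flat
    for key, child in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        flat.update(flatten(child, path, sep))
    return flat


# The sample document from the original question.
record = json.loads("""
{
  "query": {
    "count": 1,
    "created": "2017-09-15T11:20:26Z",
    "lang": "en-US",
    "results": {
      "channel": {
        "item": {
          "condition": {
            "code": "33",
            "date": "Fri, 15 Sep 2017 06:00 AM EDT",
            "temp": "63",
            "text": "Mostly Clear"
          }
        }
      }
    }
  }
}
""")

print(json.dumps(flatten(record), indent=2))
```

Because the function walks whatever structure it is given, no schema has to be declared up front, which is the "just figure it out" behavior requested. One caveat of dotted keys is that a source field whose name already contains the separator becomes ambiguous after flattening, so a different `sep` may be preferable for such data.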
