Hi Boris, I had to solve a very similar problem and ended up writing a custom processor. I quickly wrote an onTrigger method that read the contents of the flowfile, parsed the JSON, made the modifications, then transferred the flowfile out with the updated body. I was originally concerned about performance, but that turned out to be an unfounded worry for our use case.
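In case it helps, the reshaping itself is simple once you have the body in hand. Here is a minimal sketch in plain Python (not the NiFi API, and the `wrap_records` name and envelope values are just illustrations based on your example, not code I actually shipped):

```python
import json
from datetime import datetime, timezone

def wrap_records(flowfile_body: str, info: str = "system1") -> str:
    """Wrap each JSON record in a payload/meta envelope and join with '|'.

    The envelope keys ("payload", "meta", "info", "timestamp") follow the
    example in your message; adjust as needed.
    """
    records = json.loads(flowfile_body)  # the [ {...}, {...} ] array
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    wrapped = (
        json.dumps({"payload": rec,
                    "meta": {"info": info, "timestamp": ts}})
        for rec in records
    )
    # Pipe-demarcated output: no surrounding [], one record per segment,
    # ready for PublishKafka's demarcator option.
    return "|".join(wrapped)
```

In an onTrigger you would do the same thing between session.read and session.write, keeping everything as one flowfile.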
Cheers! -Charlie

On Fri, Sep 21, 2018 at 2:25 PM Boris Tyukin <[email protected]> wrote:
> Hey guys,
>
> I have a flow returning thousands of records from an RDBMS. I convert the
> returned Avro to JSON and get something like this:
>
> [
> {"col1":"value11", "col2":"value21", "col3":"value31"},
> {"col1":"value12", "col2":"value22", "col3":"value32"},
> ...
> ]
>
> So it is still a single flowfile. Now I need to wrap every record in an
> envelope like this (an oversimplified example):
>
> [
> {"payload": {"col1":"value11", "col2":"value21", "col3":"value31"},
>  "meta": {"info": "system1", "timestamp":"2010-10-01 12:23:33"}
> }|
> {"payload": {"col1":"value12", "col2":"value22", "col3":"value32"},
>  "meta": {"info": "system1", "timestamp":"2010-10-01 12:23:33"}
> }
> ]
>
> Basically, I want to:
> 1) remove the root-level [] and replace the commas with pipes (see below why)
> 2) keep a single flowfile without splitting, but wrap each source record under
> a payload dictionary and add another dictionary, meta, with some attributes
> 3) avoid defining a schema upfront, because it might change in the future
>
> I use a pipe because I then want to publish these records to Kafka using the
> demarcator option - it works much faster for me than splitting the Avro/JSON
> into individual flowfiles.
>
> Thanks for any ideas,
> Boris
