Hi Boris,

I had to solve a very similar problem and ended up writing a custom
processor. I quickly wrote an onTrigger method that reads the contents of
the flowfile, parses the JSON, makes the modifications, then transfers the
flowfile out with the updated body. I was originally concerned about the
performance, but that turned out to be an unfounded worry for our use
case.
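In case it helps, the core of the transformation you describe (wrap each record under "payload", attach a "meta" dict, join records with a pipe instead of keeping the JSON array) could be sketched roughly like this — plain Python here just to illustrate the logic, not the actual NiFi processor code, and the function/parameter names are made up:

```python
import json
from datetime import datetime, timezone

def wrap_records(flowfile_body: str, system_name: str) -> str:
    """Wrap each record from a JSON array under "payload", attach a
    "meta" dict, and join the records with "|" so PublishKafka can use
    a pipe demarcator instead of splitting into per-record flowfiles."""
    records = json.loads(flowfile_body)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    wrapped = [
        json.dumps({"payload": rec,
                    "meta": {"info": system_name, "timestamp": stamp}})
        for rec in records
    ]
    # Drop the surrounding [] and use "|" instead of commas, so the
    # output is a stream of pipe-delimited JSON objects.
    return "|".join(wrapped)
```

In an onTrigger implementation this would run inside the session's read/write callbacks; no schema is defined up front since the records are treated as opaque dicts.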

Cheers!

-Charlie

On Fri, Sep 21, 2018 at 2:25 PM Boris Tyukin <[email protected]> wrote:

> Hey guys,
>
> I have a flow returning thousands of records from RDBMS and I convert
> returned AVRO to JSON and get something like below:
>
> [
>   {"col1":"value11", "col2":"value21", "col3:"value31"},
>   {"col1":"value12", "col2":"value22", "col3:"value32"},
> ...
> ]
>
> So still a single flowFile. Now I need to wrap every record in array
> around like that (an oversimplified example here):
>
> [
> {"payload":   {"col1":"value11", "col2":"value21", "col3":"value31"},
>   "meta": {"info": "system1", "timestamp":"2010-10-01 12:23:33"}
> }|
> {"payload":    {"col1":"value12", "col2":"value22", "col3":"value32"},
>   "meta": {"info": "system1", "timestamp":"2010-10-01 12:23:33"}
> }
> |
> ]
>
> Basically, I want to
> 1) remove root level [] and replace a comma with a pipe (See below why)
> 2) keep a single flowFile without splitting but wrap source records under
> payload dictionary and adding another dictionary meta with some attributes.
> 3) do not want to define schema upfront because it might change in future
>
> I put pipe because I then want to publish these records to Kafka, using
> demarcation option - it works much faster for me than splitting avro/json
> into individual flowfiles.
>
> Thanks for any ideas,
> Boris