Charlie

You'll absolutely want to look at the Record reader/writer
capabilities.  They will let you convert from CSV (or similar) to
JSON without having to go through attributes at all.
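
To make the idea concrete, here is a rough sketch in plain Python (not
NiFi code) of what a record-oriented conversion does: each delimited
line is read as a record and written straight out as JSON, with no
per-field attribute extraction in between.  The function name and the
sample data are made up for illustration.  In NiFi itself I'd expect
this to be a ConvertRecord processor with a CSVReader and a
JsonRecordSetWriter, but treat that exact configuration as an
assumption until you've tried it against your data.

```python
import csv
import io
import json


def delimited_to_json(text, delimiter=","):
    """Read header-led delimited text as records and return a JSON array.

    Hypothetical helper for illustration only; it mimics the
    reader -> writer idea and is not part of NiFi.
    """
    # The "reader" side: parse each line into a record keyed by the header.
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # The "writer" side: serialize the records directly as JSON.
    return json.dumps([dict(row) for row in reader])


# Made-up sample input; all values stay strings because no schema is applied.
sample = "id,name\n1,alpha\n2,beta\n"
print(delimited_to_json(sample))
```

Because the content is converted in one step, the original delimited
data never has to be squeezed into attributes and rebuilt.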

Take a look here
https://cwiki.apache.org/confluence/display/NIFI/Example+Dataflow+Templates
and you can see the provenance example for configuration.  If you
want to share a sample line of the delimited data and a sample of the
output JSON, I can send you back a template that would help you get
started.

Thanks
Joe

On Tue, Sep 19, 2017 at 11:29 AM, Charlie Frasure
<charliefras...@gmail.com> wrote:
> I have a data flow that takes delimited input using GetFile, extracts some
> of that into attributes, converts the attributes to a JSON object, reformats
> the JSON using the Jolt transformer, and then does additional processing
> before using PutFile to move the original file based on the dataflow result.
> I have to work around NiFi to make the last step happen.
>
> I am setting the AttributesToJSON to replace the flowfile content because
> the Jolt transformer requires the JSON object to be in the flowfile content.
> There is no "original" relationship out of AttributesToJSON, so this data
> would be lost.  I have the "Keep Source File" set to true on the GetFile,
> and then use PutFile with the filename to grab it later.
>
> This works for the most part, but under heavy data loads we have some errors
> trying to process a file more than once.
>
> I think we could resolve this by not keeping the source file, sending a
> duplicate of the content down another path and merging later.  I want to
> explore the possibility of either 1) having an "original" relationship
> whenever the previous flowfile content is being modified or replaced, or 2)
> maintaining an "original" flowfile content alongside the working content so
> that it is easily available once the processing is complete.
>
> Am I missing a more direct way to process this data?  Other thoughts?
>
> Thanks,
> Charlie
