Ha! They are nearly as cool as NiFi reading bedtime stories. You have a good point.
I was all happy we were about to make your flow far better/faster/stronger. Then you threw down with HL7. We really need an HL7RecordReader; then the rest of this would be fast/fun. Any volunteers? (A very rough sketch of what one might look like is at the bottom of this mail, below the quoted thread.)

Thanks

On Tue, Sep 19, 2017 at 2:09 PM, Charlie Frasure <[email protected]> wrote:
> Thanks Joe,
>
> I'm using the HL7 processor to extract HL7v2 data to attributes, then mapping the attributes to expected JSON entries. I am using the Record reader/writers elsewhere; definitely the best thing that has happened to NiFi since bedtime stories [1].
>
> So my current flow is:
>
> GetFile (leave original file) ->
> ExtractHL7Attributes ->
> UpdateAttribute (for light conversions) ->
> AttributesToJSON (as flowfile-content) ->
> JoltTransformJSON (this could probably be replaced by record readers/writers) ->
> InvokeHTTP (call web service) ->
> FetchFile (using filename attribute)
>
> There are some additional exception paths, but this flow works as intended except when the web service can't keep up with new files. I have a delay built in to GetFile to account for this, which mostly works, but sometimes we pull the same file more than once. I suppose I could also move the file to an interim folder to prevent multiple reads.
>
> Thanks,
> Charlie
>
> [1] https://community.hortonworks.com/articles/28380/nifi-ocr-using-apache-nifi-to-read-childrens-books.html
>
> On Tue, Sep 19, 2017 at 11:35 AM, Joe Witt <[email protected]> wrote:
>>
>> Charlie
>>
>> You'll absolutely want to look at the Record reader/writer capabilities. They will help you convert from the CSV (or similar) to JSON without having to go through attributes at all.
>>
>> Take a look here:
>>
>> https://cwiki.apache.org/confluence/display/NIFI/Example+Dataflow+Templates
>>
>> The provenance example shows the configuration. If you want to share a sample line of the delimited data and a sample of the output JSON, I can share back a template that would help you get started.
>>
>> Thanks
>> Joe
>>
>> On Tue, Sep 19, 2017 at 11:29 AM, Charlie Frasure <[email protected]> wrote:
>> > I have a data flow that takes delimited input using GetFile, extracts some of that into attributes, converts the attributes to a JSON object, reformats the JSON using the Jolt transformer, and then does additional processing before using PutFile to move the original file based on the dataflow result. I have to work around NiFi to make the last step happen.
>> >
>> > I am setting AttributesToJSON to replace the flowfile content because the Jolt transformer requires the JSON object to be in the flowfile content. There is no "original" relationship out of AttributesToJSON, so this data would be lost. I have "Keep Source File" set to true on the GetFile, and then use PutFile with the filename to grab it later.
>> >
>> > This works for the most part, but under heavy data loads we have some errors trying to process a file more than once.
>> >
>> > I think we could resolve this by not keeping the source file, sending a duplicate of the content down another path and merging later. I want to explore the possibility of either 1) having an "original" relationship whenever the previous flowfile content is being modified or replaced, or 2) maintaining an "original" flowfile content alongside the working content so that it is easily available once the processing is complete.
>> >
>> > Am I missing a more direct way to process this data? Other thoughts?
>> >
>> > Thanks,
>> > Charlie
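
For anyone who wants to pick that up: here is a very rough, untested sketch of what an HL7RecordReader could look like against the RecordReader interface, using the HAPI PipeParser that ExtractHL7Attributes already depends on. The flat three-field schema and the Terser paths (MSH-9-1, MSH-10, PID-3-1) are placeholder choices, and the RecordReaderFactory / controller service plumbing a real reader needs is left out.

// Rough sketch only -- untested. A real reader would derive its schema from
// the message (or a supplied Avro schema) instead of hard-coding fields, and
// would live behind a RecordReaderFactory controller service.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.apache.nifi.serialization.MalformedRecordException;
import org.apache.nifi.serialization.RecordReader;
import org.apache.nifi.serialization.SimpleRecordSchema;
import org.apache.nifi.serialization.record.MapRecord;
import org.apache.nifi.serialization.record.Record;
import org.apache.nifi.serialization.record.RecordField;
import org.apache.nifi.serialization.record.RecordFieldType;
import org.apache.nifi.serialization.record.RecordSchema;

import ca.uhn.hl7v2.HL7Exception;
import ca.uhn.hl7v2.model.Message;
import ca.uhn.hl7v2.parser.PipeParser;
import ca.uhn.hl7v2.util.Terser;

public class HL7RecordReader implements RecordReader {

    // Placeholder flat schema: a handful of commonly used fields, all strings.
    private static final List<RecordField> FIELDS = Arrays.asList(
            new RecordField("messageType", RecordFieldType.STRING.getDataType()),
            new RecordField("messageControlId", RecordFieldType.STRING.getDataType()),
            new RecordField("patientId", RecordFieldType.STRING.getDataType()));
    private static final RecordSchema SCHEMA = new SimpleRecordSchema(FIELDS);

    private final InputStream in;
    private boolean consumed = false;

    public HL7RecordReader(final InputStream in) {
        this.in = in;
    }

    @Override
    public Record nextRecord(final boolean coerceTypes, final boolean dropUnknownFields)
            throws IOException, MalformedRecordException {
        // This sketch assumes one HL7 message per flowfile and ignores the flags.
        if (consumed) {
            return null;
        }
        consumed = true;

        // HL7 v2 segments are delimited by carriage returns; rejoin whatever
        // line endings the file arrived with.
        final String text;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            text = reader.lines().collect(Collectors.joining("\r"));
        }

        try {
            final Message message = new PipeParser().parse(text);
            final Terser terser = new Terser(message);

            final Map<String, Object> values = new LinkedHashMap<>();
            values.put("messageType", terser.get("MSH-9-1"));
            values.put("messageControlId", terser.get("MSH-10"));
            values.put("patientId", terser.get("PID-3-1"));
            return new MapRecord(SCHEMA, values);
        } catch (final HL7Exception e) {
            throw new MalformedRecordException("Could not parse HL7 message", e);
        }
    }

    @Override
    public RecordSchema getSchema() {
        return SCHEMA;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}

With something like that registered as a record reader, the AttributesToJSON / Jolt steps above could collapse into a single ConvertRecord (HL7 reader in, JSON writer out), which is the "far better/faster/stronger" part.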
