The easiest way to do this would be to create a UNIQUE constraint on the project name and just send one insert at a time. Then each individual failed insert will get routed to failure.
For the sake of safety here, if you have multiple flows that feed into a common SQL ingest point, you might want to put this functional on a separate flow so that you can assume "failure = duplicate according to the UNIQUE constraint." On Wed, Feb 20, 2019 at 7:56 AM Adam Fisher <[email protected]> wrote: > Maybe you could use something like SplitRecord, DetectDuplicate, > MergeRecord to get the file how you want it. This would split it into > smaller FlowFiles, check if the record has been seen before and keep only > unique ones and then merge them back into one file. > > I'm actually collaborating with someone on a DetectDuplicateRecord > processor so eventually you won't need to break your input into separate > files. This is expected to come out in a later release of NiFi. > > > Adam > > On Wed, Feb 20, 2019, 1:08 AM Beutel, Maximilian < > [email protected] wrote: > >> Hello! >> >> >> >> Since I’m new to Nifi I’m still trying to wrap my head around certain >> best practices. >> >> >> >> My use case is as follows: I need to ingest a list of projects via http >> API. The API returns a list in a format like: >> >> >> >> [ >> >> { id: "1", name: "Project A"}, >> >> { id: "2",name: "Project B"} >> >> ] >> >> >> >> Now, the problem with this API is, it always returns the _*full*_ list >> of projects, no delta. So if since the last run some more projects are >> added, it will included project id 1, 2 also, like so: >> >> >> >> [ >> >> { id: "1", name: "Project A"}, >> >> { id: "2",name: "Project B"}, >> >> { id: "3",name: "Project C"} >> >> ] >> >> >> >> >> >> I need to insert this list in a database table, and I thought I use nifi. >> My initial flow looked like: >> >> >> >> InvokeHTTP (fetch JSON HTTP response) >> into JoltTransformJSON (make the returned response a bit nicer) >> >> into ConvertJSONToSQL (insert the whole lot into db) >> >> >> >> The issue I have is the insert statement in ConvertJSONToSQL will fail, >> since the database table has a unique key on the project name field and the >> payload I want to insert will always include all projects, including some >> which are already there. >> >> >> >> My question is how to people usually handle such a use case in Nifi? I >> can either think of filtering the API response against the list of already >> existing project names (not sure how), but I would rather do an insert >> ignore or something that just ignores duplicate record errors - which is >> not supported by ConvertJSONToSQL as far as I’m aware. >> >> >> >> Maybe I’m also approaching this problem from the wrong side, so I would >> be grateful to receive feedback/recommendations. >> >> >> >> Thanks! >> >> Max >> >> >> >> ============================================================================== >> Please access the attached hyperlink for an important electronic >> communications disclaimer: >> http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html >> >> ============================================================================== >> >
