Hello!

Since I'm new to NiFi, I'm still trying to wrap my head around certain best 
practices.

My use case is as follows: I need to ingest a list of projects via an HTTP API. 
The API returns a list in a format like:

[
{ "id": "1", "name": "Project A" },
{ "id": "2", "name": "Project B" }
]

Now, the problem with this API is that it always returns the _full_ list of 
projects, not a delta. So if more projects have been added since the last run, 
the response will still include project ids 1 and 2, like so:

[
{ "id": "1", "name": "Project A" },
{ "id": "2", "name": "Project B" },
{ "id": "3", "name": "Project C" }
]
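
To make the "full list, no delta" behaviour concrete, here is a minimal Python sketch (outside NiFi, purely for illustration) that derives the delta from two consecutive responses. The function name new_projects is made up for this example, and comparing on the id field is an assumption; the name field would work the same way.

```python
import json

def new_projects(previous, current):
    """Return the entries of `current` whose id does not appear in `previous`."""
    seen_ids = {project["id"] for project in previous}
    return [project for project in current if project["id"] not in seen_ids]

# The two sample responses from above, written as valid JSON.
previous = json.loads(
    '[{"id": "1", "name": "Project A"}, {"id": "2", "name": "Project B"}]'
)
current = json.loads(
    '[{"id": "1", "name": "Project A"}, {"id": "2", "name": "Project B"},'
    ' {"id": "3", "name": "Project C"}]'
)

# Only the entry that was added since the previous run remains.
delta = new_projects(previous, current)
print(delta)
```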


I need to insert this list into a database table, and I thought I would use 
NiFi. My initial flow looked like:

InvokeHTTP (fetch JSON HTTP response)
into JoltTransformJSON (make the returned response a bit nicer)
into ConvertJSONToSQL (insert the whole lot into db)

The issue I have is that the insert statement from ConvertJSONToSQL will fail, 
since the database table has a unique key on the project name field and the 
payload I want to insert will always include all projects, including some which 
are already there.

My question is: how do people usually handle such a use case in NiFi? I can 
think of either filtering the API response against the list of already existing 
project names (not sure how), or, as I would prefer, doing an insert ignore or 
something similar that just ignores duplicate record errors, which is not 
supported by ConvertJSONToSQL as far as I'm aware.
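
To illustrate what I mean by "insert ignore", here is a minimal sketch in Python using an in-memory SQLite database. Everything about the schema is an assumption for the example: the table name projects, the column types, and id as the primary key (the only thing given above is the unique key on the project name). The syntax is SQLite's INSERT OR IGNORE; MySQL spells it INSERT IGNORE, and PostgreSQL uses INSERT ... ON CONFLICT DO NOTHING.

```python
import sqlite3

def insert_ignoring_duplicates(conn, projects):
    """Insert projects, silently skipping rows that violate a uniqueness constraint."""
    conn.executemany(
        "INSERT OR IGNORE INTO projects (id, name) VALUES (:id, :name)",
        projects,
    )
    conn.commit()

# Hypothetical schema: only the unique key on name comes from the description above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (id TEXT PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)

first_run = [
    {"id": "1", "name": "Project A"},
    {"id": "2", "name": "Project B"},
]
second_run = first_run + [{"id": "3", "name": "Project C"}]

# The second run repeats projects 1 and 2; they are skipped instead of raising an error.
insert_ignoring_duplicates(conn, first_run)
insert_ignoring_duplicates(conn, second_run)

names = [row[0] for row in conn.execute("SELECT name FROM projects ORDER BY id")]
print(names)
```

The point of the sketch is only that the full list can be replayed on every run without failing on the rows that already exist.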

Maybe I'm also approaching this problem from the wrong side, so I would be 
grateful to receive feedback/recommendations.

Thanks!

Max
