Hello!
Since I'm new to Nifi I'm still trying to wrap my head around certain best
practices.
My use case is as follows: I need to ingest a list of projects via http API.
The API returns a list in a format like:
[
{ id: "1", name: "Project A"},
{ id: "2",name: "Project B"}
]
Now, the problem with this API is, it always returns the _full_ list of
projects, no delta. So if since the last run some more projects are added, it
will included project id 1, 2 also, like so:
[
{ id: "1", name: "Project A"},
{ id: "2",name: "Project B"},
{ id: "3",name: "Project C"}
]
I need to insert this list in a database table, and I thought I use nifi. My
initial flow looked like:
InvokeHTTP (fetch JSON HTTP response)
into JoltTransformJSON (make the returned response a bit nicer)
into ConvertJSONToSQL (insert the whole lot into db)
The issue I have is the insert statement in ConvertJSONToSQL will fail, since
the database table has a unique key on the project name field and the payload I
want to insert will always include all projects, including some which are
already there.
My question is how to people usually handle such a use case in Nifi? I can
either think of filtering the API response against the list of already existing
project names (not sure how), but I would rather do an insert ignore or
something that just ignores duplicate record errors - which is not supported
by ConvertJSONToSQL as far as I'm aware.
Maybe I'm also approaching this problem from the wrong side, so I would be
grateful to receive feedback/recommendations.
Thanks!
Max
===============================================================================
Please access the attached hyperlink for an important electronic communications
disclaimer:
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
===============================================================================