Maybe you could use something like SplitRecord, DetectDuplicate,
MergeRecord to get the file how you want it. This would split it into
smaller FlowFiles, check if the record has been seen before and keep only
unique ones and then merge them back into one file.

I'm actually collaborating with someone on a DetectDuplicateRecord
processor so eventually you won't need to break your input into separate
files. This is expected to come out in a later release of NiFi.


Adam

On Wed, Feb 20, 2019, 1:08 AM Beutel, Maximilian <
[email protected] wrote:

> Hello!
>
>
>
> Since I’m new to Nifi I’m still trying to wrap my head around certain best
> practices.
>
>
>
> My use case is as follows: I need to ingest a list of projects via http
> API. The API returns a list in a format like:
>
>
>
> [
>
> { id: "1", name: "Project A"},
>
> { id: "2",name: "Project B"}
>
> ]
>
>
>
> Now, the problem with this API is, it always returns the _*full*_ list of
> projects, no delta. So if since the last run some more projects are added,
> it will included project id 1, 2 also, like so:
>
>
>
> [
>
> { id: "1", name: "Project A"},
>
> { id: "2",name: "Project B"},
>
> { id: "3",name: "Project C"}
>
> ]
>
>
>
>
>
> I need to insert this list in a database table, and I thought I use nifi.
> My initial flow looked like:
>
>
>
> InvokeHTTP (fetch JSON HTTP response)
> into JoltTransformJSON (make the returned response a bit nicer)
>
> into ConvertJSONToSQL (insert the whole lot into db)
>
>
>
> The issue I have is the insert statement in ConvertJSONToSQL will fail,
> since the database table has a unique key on the project name field and the
> payload I want to insert will always include all projects, including some
> which are already there.
>
>
>
> My question is how to people usually handle such a use case in Nifi? I can
> either think of filtering the API response against the list of already
> existing project names (not sure how), but I would rather do an insert
> ignore or something that just ignores duplicate record errors  - which is
> not supported by ConvertJSONToSQL as far as I’m aware.
>
>
>
> Maybe I’m also approaching this problem from the wrong side, so I would be
> grateful to receive feedback/recommendations.
>
>
>
> Thanks!
>
> Max
>
>
>
> ==============================================================================
> Please access the attached hyperlink for an important electronic
> communications disclaimer:
> http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
>
> ==============================================================================
>

Reply via email to