Hi Everyone,

I am new to NiFi and to the community :)

I am trying to build a NiFi flow that pulls from an Oracle table and loads
into a Postgres table. My select query returns two columns, and I need to
remove duplicates based on the combined values of these two columns. Can I
remove duplicates in NiFi based on the values of two columns? My flow is:
ExecuteSQL -> SplitAvro -> ConvertAvroToJSON -> ConvertJSONToSQL -> PutSQL
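
To make the de-duplication requirement concrete, here is a minimal Python
sketch of the behaviour I am after. This is not NiFi code; the column names
COL_A and COL_B, the sample rows, and the helper dedupe_by_columns are
placeholders made up for illustration. It keeps the first row for each
distinct pair of values and drops the rest.

```python
def dedupe_by_columns(rows, key_columns):
    """Keep the first row seen for each distinct combination of key_columns."""
    seen = set()
    unique = []
    for row in rows:
        # Build a hashable key from the values of the chosen columns.
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data; COL_A and COL_B stand in for the two real columns.
rows = [
    {"COL_A": 1, "COL_B": "x"},
    {"COL_A": 1, "COL_B": "x"},  # duplicate of the first row
    {"COL_A": 1, "COL_B": "y"},  # same COL_A, different COL_B: not a duplicate
]

unique_rows = dedupe_by_columns(rows, ["COL_A", "COL_B"])
```

A row counts as a duplicate only when both column values match an earlier
row, so rows that share just one of the two values are kept.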


PutSQL question: the Oracle table has ~4 million records, and while PutSQL
was running it gave several errors similar to this one:

"Failed to update database due to failed batch update. There were total of
1 FlowFiles that failed, 5 that successful, and 9 that were not execute and
will be routed to retry"

What might be wrong in PutSQL? I have set the PutSQL batch size to 1000 and
don't have any primary key constraint on the Postgres table.
(Should I create a primary key on those two columns so that duplicate
records are rejected during the load? If so, will it reject the complete
batch rather than just the duplicates?)

It would be great if someone could provide some insight into this scenario.

Thanks,
Vikram
