Sergey Shcherbakov created NIFI-6934:
----------------------------------------
Summary: Support Postgres 9.5+ Upsert in the standard
PutDatabaseRecord processor
Key: NIFI-6934
URL: https://issues.apache.org/jira/browse/NIFI-6934
Project: Apache NiFi
Issue Type: Improvement
Components: Extensions
Affects Versions: 1.10.0
Reporter: Sergey Shcherbakov
The PutDatabaseRecord processing is setup for batch record processing for more
performant records processing and to be able to use JDBC batch API.
Unfortunately, in this setup, in case if a single record in the batch fails to
be inserted into the target SQL database, the entire transaction with the full
DML batch gets rolled back.
That is, a single failure (e.g. because of a duplicate existing in the database
with the same unique business key constraint) makes using PutDatabaseRecord
processor unusable in batch mode.
A common workaround for that is inserting a SplitRecord before the
PutDatabaseProcessor. That approach works but obviously has disadvantages (slow
performance, higher resource consumption, downstream logic changes, since the
number of FlowFiles explodes).
The PostgresSQL starting with version 9.5 support special SQL syntax extension
that effectively allows UPSERTS, that is, adding new records in case of no
constraint conflicts and updating existing record (or doing nothing) in case if
there is a conflict:
[http://www.postgresqltutorial.com/postgresql-upsert/]
Such an Upsert would solve the above mentioned problems of the
PutDatabaseRecord processor.
Adding support for such extension looks also fairly trivial in the
PutDatabaseRecord processor.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)