[
https://issues.apache.org/jira/browse/NIFI-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nathan Gough reassigned NIFI-10685:
-----------------------------------
Assignee: Nathan Gough
> PutDatabaseRecord Upsert does not quote columns in conflict clause (postgres)
> -----------------------------------------------------------------------------
>
> Key: NIFI-10685
> URL: https://issues.apache.org/jira/browse/NIFI-10685
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.18.0
> Reporter: mgerbig
> Assignee: Nathan Gough
> Priority: Major
> Attachments: putdatabase-test.avro
>
>
> Nifi's PutDatabaseRecord does not quote columns in the conflict clause when
> using PostgreSQLDatabaseAdapter. This prevents the usage of upsert (and
> possibly insert ignore) statements as soon as at least one column of the
> primary key contains at least one uppercase character.
> h2. Root Cause
> Postgres lowercases all unquoted keys, as described in the manual, which
> causes this bug:
> {quote}Quoting an identifier also makes it case-sensitive, whereas unquoted
> names are always folded to lower case. For example, the identifiers
> {{{}FOO{}}}, {{{}foo{}}}, and {{"foo"}} are considered the same by
> PostgreSQL, but {{"Foo"}} and {{"FOO"}} are different from these three and
> each other. (The folding of unquoted names to lower case in PostgreSQL is
> incompatible with the SQL standard, which says that unquoted names should be
> folded to upper case. Thus, {{foo}} should be equivalent to {{"FOO"}} not
> {{"foo"}} according to the standard. If you want to write portable
> applications you are advised to always quote a particular name or never quote
> it.)
> {quote}
> see
> [https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS]
> h2. Reproduce behaviour
> Create a table in your *PostgreSQL* database
> {code:java}
> CREATE TABLE test_table
> (
> "TIME_PERIOD" date NOT NULL,
> "R0102" numeric,
> "R0203" numeric,
> "R0304" numeric,
> "R0405" numeric,
> "R0506" numeric,
> "R0607" numeric,
> "R0708" numeric,
> "R0809" numeric,
> "R0910" numeric,
> modified_at_utc date DEFAULT timezone('UTC'::text, CURRENT_TIMESTAMP(0)),
> CONSTRAINT test_table PRIMARY KEY ("TIME_PERIOD")
> ) {code}
> Use attached avro to write to the table using a PutDatabaseRecord Processor
> configured with UPSERT.
> Nifi will throw following Exception in the Bulletin Board
> {code:java}
> PutDatabaseRecord[id=...] Failed to put Records to database for
> FlowFile[filename=redacted.avro]. Routing to failure.:
> java.sql.BatchUpdateException: Batch entry 0 INSERT INTO
> "public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304", "R0405",
> "R0506", "R0607", "R0708", "R0809", "R0910") VALUES ('2022-10-21 +00',
> '2.97'::numeric, '3.2'::numeric, '3.19'::numeric, '3.19'::numeric,
> '3.22'::numeric, '3.33'::numeric, '3.33'::numeric, '3.4'::numeric,
> '3.38'::numeric) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD",
> "R0102", "R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809",
> "R0910") = (EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203",
> EXCLUDED."R0304", EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607",
> EXCLUDED."R0708", EXCLUDED."R0809", EXCLUDED."R0910") was aborted: ERROR:
> column "time_period" does not exist
> Position: 347 Call getNextException to see other errors in the batch.
> - Caused by: org.postgresql.util.PSQLException: ERROR: column "time_period"
> does not exist
> Position: 347 {code}
> h3. Generated SQL Statement
> Nifi generates and issues following SQL Statement to the database:
> {code:java}
> INSERT INTO "public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304",
> "R0405", "R0506", "R0607", "R0708", "R0809", "R0910") VALUES (?, ?, ?, ?, ?,
> ?, ?, ?, ?, ?) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD",
> "R0102", "R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809",
> "R0910") = (EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203",
> EXCLUDED."R0304", EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607",
> EXCLUDED."R0708", EXCLUDED."R0809", EXCLUDED."R0910") {code}
> This statement only works with quoted _TIME_PERIOD_ in the _ON CONFLICT_
> clause due to the behaviour of postgres described in Section _Root Cause_
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)