[ 
https://issues.apache.org/jira/browse/NIFI-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Gough reassigned NIFI-10685:
-----------------------------------

    Assignee: Nathan Gough

> PutDatabaseRecord Upsert does not quote columns in conflict clause (postgres)
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-10685
>                 URL: https://issues.apache.org/jira/browse/NIFI-10685
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.18.0
>            Reporter: mgerbig
>            Assignee: Nathan Gough
>            Priority: Major
>         Attachments: putdatabase-test.avro
>
>
> Nifi's PutDatabaseRecord does not quote columns in the conflict clause when 
> using PostgreSQLDatabaseAdapter. This prevents the usage of upsert (and 
> possibly insert ignore) statements as soon as at least one column of the 
> primary key contains at least one uppercase character.
> h2. Root Cause
> Postgres lowercases all unquoted keys, as described in the manual, which 
> causes this bug:
> {quote}Quoting an identifier also makes it case-sensitive, whereas unquoted 
> names are always folded to lower case. For example, the identifiers 
> {{{}FOO{}}}, {{{}foo{}}}, and {{"foo"}} are considered the same by 
> PostgreSQL, but {{"Foo"}} and {{"FOO"}} are different from these three and 
> each other. (The folding of unquoted names to lower case in PostgreSQL is 
> incompatible with the SQL standard, which says that unquoted names should be 
> folded to upper case. Thus, {{foo}} should be equivalent to {{"FOO"}} not 
> {{"foo"}} according to the standard. If you want to write portable 
> applications you are advised to always quote a particular name or never quote 
> it.)
> {quote}
> see 
> [https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS]
> h2. Reproduce behaviour
> Create a table in your *PostgreSQL* database
> {code:java}
> CREATE TABLE test_table
> (
>     "TIME_PERIOD" date NOT NULL,
>     "R0102" numeric,
>     "R0203" numeric,
>     "R0304" numeric,
>     "R0405" numeric,
>     "R0506" numeric,
>     "R0607" numeric,
>     "R0708" numeric,
>     "R0809" numeric,
>     "R0910" numeric,
>     modified_at_utc date DEFAULT timezone('UTC'::text, CURRENT_TIMESTAMP(0)),
>     CONSTRAINT test_table PRIMARY KEY ("TIME_PERIOD")
> ) {code}
> Use attached avro to write to the table using a PutDatabaseRecord Processor 
> configured with UPSERT.
> Nifi will throw following Exception in the Bulletin Board
> {code:java}
> PutDatabaseRecord[id=...] Failed to put Records to database for 
> FlowFile[filename=redacted.avro]. Routing to failure.: 
> java.sql.BatchUpdateException: Batch entry 0 INSERT INTO 
> "public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304", "R0405", 
> "R0506", "R0607", "R0708", "R0809", "R0910") VALUES ('2022-10-21 +00', 
> '2.97'::numeric, '3.2'::numeric, '3.19'::numeric, '3.19'::numeric, 
> '3.22'::numeric, '3.33'::numeric, '3.33'::numeric, '3.4'::numeric, 
> '3.38'::numeric) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD", 
> "R0102", "R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809", 
> "R0910") = (EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203", 
> EXCLUDED."R0304", EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607", 
> EXCLUDED."R0708", EXCLUDED."R0809", EXCLUDED."R0910") was aborted: ERROR: 
> column "time_period" does not exist
>   Position: 347  Call getNextException to see other errors in the batch.
> - Caused by: org.postgresql.util.PSQLException: ERROR: column "time_period" 
> does not exist
>   Position: 347 {code}
> h3. Generated SQL Statement
> Nifi generates and issues following SQL Statement to the database:
> {code:java}
>  INSERT INTO "public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304", 
> "R0405", "R0506", "R0607", "R0708", "R0809", "R0910") VALUES (?, ?, ?, ?, ?, 
> ?, ?, ?, ?, ?) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD", 
> "R0102", "R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809", 
> "R0910") = (EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203", 
> EXCLUDED."R0304", EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607", 
> EXCLUDED."R0708", EXCLUDED."R0809", EXCLUDED."R0910") {code}
> This statement only works with quoted _TIME_PERIOD_ in the _ON CONFLICT_ 
> clause due to the behaviour of postgres described in Section _Root Cause_
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to