[ 
https://issues.apache.org/jira/browse/NIFI-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mgerbig updated NIFI-10685:
---------------------------
    Description: 
Nifi's PutDatabaseRecord does not quote columns in the conflict clause when 
using PostgreSQLDatabaseAdapter. This prevents the usage of upsert (and 
possibly insert ignore) statements as soon as at least one column of the 
primary key contains at least one uppercase character.
h2. Root Cause

Postgres lowercases all unquoted keys, as described in the manual, which causes 
this bug:
{quote}Quoting an identifier also makes it case-sensitive, whereas unquoted 
names are always folded to lower case. For example, the identifiers 
{{{}FOO{}}}, {{{}foo{}}}, and {{"foo"}} are considered the same by PostgreSQL, 
but {{"Foo"}} and {{"FOO"}} are different from these three and each other. (The 
folding of unquoted names to lower case in PostgreSQL is incompatible with the 
SQL standard, which says that unquoted names should be folded to upper case. 
Thus, {{foo}} should be equivalent to {{"FOO"}} not {{"foo"}} according to the 
standard. If you want to write portable applications you are advised to always 
quote a particular name or never quote it.)
{quote}
see 
[https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS]
h2. Reproduce behaviour

Create a table in your *PostgreSQL* database
{code:java}
CREATE TABLE test_table
(
    "TIME_PERIOD" date NOT NULL,
    "R0102" numeric,
    "R0203" numeric,
    "R0304" numeric,
    "R0405" numeric,
    "R0506" numeric,
    "R0607" numeric,
    "R0708" numeric,
    "R0809" numeric,
    "R0910" numeric,
    modified_at_utc date DEFAULT timezone('UTC'::text, CURRENT_TIMESTAMP(0)),
    CONSTRAINT test_table PRIMARY KEY ("TIME_PERIOD")
) {code}
Use attached avro to write to the table using a PutDatabaseRecord Processor 
configured with UPSERT.

Nifi will throw following Exception in the Bulletin Board
{code:java}
PutDatabaseRecord[id=...] Failed to put Records to database for 
FlowFile[filename=redacted.avro]. Routing to failure.: 
java.sql.BatchUpdateException: Batch entry 0 INSERT INTO 
"public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304", "R0405", 
"R0506", "R0607", "R0708", "R0809", "R0910") VALUES ('2022-10-21 +00', 
'2.97'::numeric, '3.2'::numeric, '3.19'::numeric, '3.19'::numeric, 
'3.22'::numeric, '3.33'::numeric, '3.33'::numeric, '3.4'::numeric, 
'3.38'::numeric) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD", 
"R0102", "R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809", 
"R0910") = (EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203", 
EXCLUDED."R0304", EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607", 
EXCLUDED."R0708", EXCLUDED."R0809", EXCLUDED."R0910") was aborted: ERROR: 
column "time_period" does not exist
  Position: 347  Call getNextException to see other errors in the batch.
- Caused by: org.postgresql.util.PSQLException: ERROR: column "time_period" 
does not exist
  Position: 347 {code}
h3. Generated SQL Statement

Nifi generates and issues following SQL Statement to the database:
{code:java}
 INSERT INTO "public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304", 
"R0405", "R0506", "R0607", "R0708", "R0809", "R0910") VALUES (?, ?, ?, ?, ?, ?, 
?, ?, ?, ?) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD", "R0102", 
"R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809", "R0910") = 
(EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203", EXCLUDED."R0304", 
EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607", EXCLUDED."R0708", 
EXCLUDED."R0809", EXCLUDED."R0910") {code}
This statement only works with quoted _TIME_PERIOD_ in the _ON CONFLICT_ clause 
due to the behaviour of postgres described in Section _Root Cause_

 

  was:
Nifi's PutDatabaseRecord does not quote columns in the conflict clause when 
using PostgreSQLDatabaseAdapter. This prevents the usage of upsert (and 
possibly insert ignore) statements as soon as at least one column of the 
primary key contains at least one uppercase character.{*}{*}
h2. Root Cause

Postgres lowercases all unquoted keys, as described in the manual, which causes 
this bug:
{quote}Quoting an identifier also makes it case-sensitive, whereas unquoted 
names are always folded to lower case. For example, the identifiers 
{{{}FOO{}}}, {{{}foo{}}}, and {{"foo"}} are considered the same by PostgreSQL, 
but {{"Foo"}} and {{"FOO"}} are different from these three and each other. (The 
folding of unquoted names to lower case in PostgreSQL is incompatible with the 
SQL standard, which says that unquoted names should be folded to upper case. 
Thus, {{foo}} should be equivalent to {{"FOO"}} not {{"foo"}} according to the 
standard. If you want to write portable applications you are advised to always 
quote a particular name or never quote it.)
{quote}
see 
[https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS]
h2. Reproduce behaviour

Create a table in your *PostgreSQL* database
{code:java}
CREATE TABLE test_table
(
    "TIME_PERIOD" date NOT NULL,
    "R0102" numeric,
    "R0203" numeric,
    "R0304" numeric,
    "R0405" numeric,
    "R0506" numeric,
    "R0607" numeric,
    "R0708" numeric,
    "R0809" numeric,
    "R0910" numeric,
    modified_at_utc date DEFAULT timezone('UTC'::text, CURRENT_TIMESTAMP(0)),
    CONSTRAINT test_table PRIMARY KEY ("TIME_PERIOD")
) {code}
Use attached avro to write to the table using a PutDatabaseRecord Processor 
configured with UPSERT.

Nifi will throw following Exception in the Bulletin Board
{code:java}
PutDatabaseRecord[id=...] Failed to put Records to database for 
FlowFile[filename=redacted.avro]. Routing to failure.: 
java.sql.BatchUpdateException: Batch entry 0 INSERT INTO 
"public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304", "R0405", 
"R0506", "R0607", "R0708", "R0809", "R0910") VALUES ('2022-10-21 +00', 
'2.97'::numeric, '3.2'::numeric, '3.19'::numeric, '3.19'::numeric, 
'3.22'::numeric, '3.33'::numeric, '3.33'::numeric, '3.4'::numeric, 
'3.38'::numeric) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD", 
"R0102", "R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809", 
"R0910") = (EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203", 
EXCLUDED."R0304", EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607", 
EXCLUDED."R0708", EXCLUDED."R0809", EXCLUDED."R0910") was aborted: ERROR: 
column "time_period" does not exist
  Position: 347  Call getNextException to see other errors in the batch.
- Caused by: org.postgresql.util.PSQLException: ERROR: column "time_period" 
does not exist
  Position: 347 {code}
h3. Generated SQL Statement

Nifi generates and issues following SQL Statement to the database:
{code:java}
 INSERT INTO "public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304", 
"R0405", "R0506", "R0607", "R0708", "R0809", "R0910") VALUES (?, ?, ?, ?, ?, ?, 
?, ?, ?, ?) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD", "R0102", 
"R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809", "R0910") = 
(EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203", EXCLUDED."R0304", 
EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607", EXCLUDED."R0708", 
EXCLUDED."R0809", EXCLUDED."R0910") {code}
This statement only works with quoted _TIME_PERIOD_ in the _ON CONFLICT_ clause 
due to the behaviour of postgres described in Section _Root Cause_

 


> PutDatabaseRecord Upsert does not quote columns in conflict clause (postgres)
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-10685
>                 URL: https://issues.apache.org/jira/browse/NIFI-10685
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.18.0
>            Reporter: mgerbig
>            Priority: Major
>         Attachments: putdatabase-test.avro
>
>
> Nifi's PutDatabaseRecord does not quote columns in the conflict clause when 
> using PostgreSQLDatabaseAdapter. This prevents the usage of upsert (and 
> possibly insert ignore) statements as soon as at least one column of the 
> primary key contains at least one uppercase character.
> h2. Root Cause
> Postgres lowercases all unquoted keys, as described in the manual, which 
> causes this bug:
> {quote}Quoting an identifier also makes it case-sensitive, whereas unquoted 
> names are always folded to lower case. For example, the identifiers 
> {{{}FOO{}}}, {{{}foo{}}}, and {{"foo"}} are considered the same by 
> PostgreSQL, but {{"Foo"}} and {{"FOO"}} are different from these three and 
> each other. (The folding of unquoted names to lower case in PostgreSQL is 
> incompatible with the SQL standard, which says that unquoted names should be 
> folded to upper case. Thus, {{foo}} should be equivalent to {{"FOO"}} not 
> {{"foo"}} according to the standard. If you want to write portable 
> applications you are advised to always quote a particular name or never quote 
> it.)
> {quote}
> see 
> [https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS]
> h2. Reproduce behaviour
> Create a table in your *PostgreSQL* database
> {code:java}
> CREATE TABLE test_table
> (
>     "TIME_PERIOD" date NOT NULL,
>     "R0102" numeric,
>     "R0203" numeric,
>     "R0304" numeric,
>     "R0405" numeric,
>     "R0506" numeric,
>     "R0607" numeric,
>     "R0708" numeric,
>     "R0809" numeric,
>     "R0910" numeric,
>     modified_at_utc date DEFAULT timezone('UTC'::text, CURRENT_TIMESTAMP(0)),
>     CONSTRAINT test_table PRIMARY KEY ("TIME_PERIOD")
> ) {code}
> Use attached avro to write to the table using a PutDatabaseRecord Processor 
> configured with UPSERT.
> Nifi will throw following Exception in the Bulletin Board
> {code:java}
> PutDatabaseRecord[id=...] Failed to put Records to database for 
> FlowFile[filename=redacted.avro]. Routing to failure.: 
> java.sql.BatchUpdateException: Batch entry 0 INSERT INTO 
> "public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304", "R0405", 
> "R0506", "R0607", "R0708", "R0809", "R0910") VALUES ('2022-10-21 +00', 
> '2.97'::numeric, '3.2'::numeric, '3.19'::numeric, '3.19'::numeric, 
> '3.22'::numeric, '3.33'::numeric, '3.33'::numeric, '3.4'::numeric, 
> '3.38'::numeric) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD", 
> "R0102", "R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809", 
> "R0910") = (EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203", 
> EXCLUDED."R0304", EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607", 
> EXCLUDED."R0708", EXCLUDED."R0809", EXCLUDED."R0910") was aborted: ERROR: 
> column "time_period" does not exist
>   Position: 347  Call getNextException to see other errors in the batch.
> - Caused by: org.postgresql.util.PSQLException: ERROR: column "time_period" 
> does not exist
>   Position: 347 {code}
> h3. Generated SQL Statement
> Nifi generates and issues following SQL Statement to the database:
> {code:java}
>  INSERT INTO "public"."test_table"("TIME_PERIOD", "R0102", "R0203", "R0304", 
> "R0405", "R0506", "R0607", "R0708", "R0809", "R0910") VALUES (?, ?, ?, ?, ?, 
> ?, ?, ?, ?, ?) ON CONFLICT (TIME_PERIOD) DO UPDATE SET ("TIME_PERIOD", 
> "R0102", "R0203", "R0304", "R0405", "R0506", "R0607", "R0708", "R0809", 
> "R0910") = (EXCLUDED."TIME_PERIOD", EXCLUDED."R0102", EXCLUDED."R0203", 
> EXCLUDED."R0304", EXCLUDED."R0405", EXCLUDED."R0506", EXCLUDED."R0607", 
> EXCLUDED."R0708", EXCLUDED."R0809", EXCLUDED."R0910") {code}
> This statement only works with quoted _TIME_PERIOD_ in the _ON CONFLICT_ 
> clause due to the behaviour of postgres described in Section _Root Cause_
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to