[HACKERS] Re: [BUGS] 10.0: Logical replication doesn't execute BEFORE UPDATE OF trigger

2017-10-10 Thread Petr Jelinek
On 10/10/17 17:22, Aleksander Alekseev wrote:
> Hi Petr,
> 
>> let me start by saying that my view is that this is simply a
>> documentation bug. Meaning that I didn't document that it does not work,
>> but I also never intended it to work. Main reason is that we can't
>> support the semantics of "UPDATE OF" correctly (see bellow). And I think
>> it's better to not support something at all rather than making it behave
>> differently in different cases.
>>
>> Now about the proposed patch, I don't think this is correct way to
>> support this as it will only work when either PRIMARY KEY column was
>> changed or when REPLICA IDENTITY is set to FULL for the table. And even
>> then it will have very different semantics from how it works when the
>> row is updated by SQL statement (all non-toasted columns will be
>> reported as changed regardless of actually being changed or not).
>>
>> The more proper way to do this would be to run data comparison of the
>> new tuple and the existing tuple values which a) will have different
>> behavior than normal "UPDATE OF" triggers have because we still can't
>> tell what columns were mentioned in the original query and b) will not
>> exactly help performance (and perhaps c) one can write the trigger to
>> behave this way already without impacting all other use-cases).
> 
> I personally think that solution proposed by Masahiko is much better
> than current behavior.


Well the proposed patch adds inconsistent behavior - if in your test
you'd change the trigger to be UPDATE OF v instead of k and updated v,
it would still not be triggered even with this patch. That's IMHO worse
than current behavior which at least consistently doesn't work.

> We could document current limitations you've
> mentioned and probably add a corresponding warning during execution of
> ALTER TABLE ... ENABLE REPLICA TRIGGER. I don't insist on this
> particular approach though.
> 
> What I really don't like is that PostgreSQL allows to create something
> that supposedly should work but in fact doesn't. Such behavior is
> obviously a bug. So as an alternative we could just return an error in
> response to ALTER TABLE ... ENABLE REPLICA TRIGGER query for triggers
> that we know will never be executed.
> 

Well ENABLE REPLICA is not specific to builtin logical replication
though. It only depends on the value of session_replication_role GUC.
Any session can set that and if that session executes SQL the UPDATE OF
triggers will work as expected. It's similar situation to statement
triggers IMHO, we allow ENABLE REPLICA statement triggers but they are
not triggered by logical replication because there is no statement. And
given that UPDATE OF also depends on statement to work as expected (it's
specified to be triggered for columns listed in the statement, not for
what has been actually changed by the execution) I don't really see the
difference between this and statement triggers except that statement
trigger behavior is documented.

I understand that it's somewhat confusing and I am not saying it's
ideal, but I don't see any other behavior that would work for what your
test tries to do and still be consistent with rest of the system.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [BUGS] 10.0: Logical replication doesn't execute BEFORE UPDATE OF trigger

2017-10-10 Thread Aleksander Alekseev
Hi Petr,

> let me start by saying that my view is that this is simply a
> documentation bug. Meaning that I didn't document that it does not work,
> but I also never intended it to work. Main reason is that we can't
> support the semantics of "UPDATE OF" correctly (see bellow). And I think
> it's better to not support something at all rather than making it behave
> differently in different cases.
> 
> Now about the proposed patch, I don't think this is correct way to
> support this as it will only work when either PRIMARY KEY column was
> changed or when REPLICA IDENTITY is set to FULL for the table. And even
> then it will have very different semantics from how it works when the
> row is updated by SQL statement (all non-toasted columns will be
> reported as changed regardless of actually being changed or not).
> 
> The more proper way to do this would be to run data comparison of the
> new tuple and the existing tuple values which a) will have different
> behavior than normal "UPDATE OF" triggers have because we still can't
> tell what columns were mentioned in the original query and b) will not
> exactly help performance (and perhaps c) one can write the trigger to
> behave this way already without impacting all other use-cases).

I personally think that solution proposed by Masahiko is much better
than current behavior. We could document current limitations you've
mentioned and probably add a corresponding warning during execution of
ALTER TABLE ... ENABLE REPLICA TRIGGER. I don't insist on this
particular approach though.

What I really don't like is that PostgreSQL allows to create something
that supposedly should work but in fact doesn't. Such behavior is
obviously a bug. So as an alternative we could just return an error in
response to ALTER TABLE ... ENABLE REPLICA TRIGGER query for triggers
that we know will never be executed.

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


[HACKERS] Re: [BUGS] 10.0: Logical replication doesn't execute BEFORE UPDATE OF trigger

2017-10-10 Thread Petr Jelinek
On 10/10/17 09:53, Masahiko Sawada wrote:
> On Tue, Oct 10, 2017 at 11:29 AM, Masahiko Sawada  
> wrote:
>> On Mon, Oct 9, 2017 at 11:13 PM, Aleksander Alekseev
>>  wrote:
>>> Hi hackers,
>>>
>>> I've found something that looks like a bug.
>>>
>>> Steps to reproduce
>>> --
>>>
>>> There are 3 instances of PostgreSQL 10.0 - inst1, inst2 and inst3. There
>>> is a table `test` on every instance:
>>>
>>> ```
>>> CREATE TABLE test(k TEXT PRIMARY KEY, v TEXT);
>>> ```
>>>
>>> Both inst1 and inst2 have `allpub` publication:
>>>
>>> ```
>>> CREATE PUBLICATION allpub FOR ALL TABLES;
>>> ```
>>>
>>> ... and inst3 is subscribed for both publications:
>>>
>>> ```
>>> CREATE SUBSCRIPTION allsub1
>>>   CONNECTION 'host=10.128.0.16 user=eax dbname=eax'
>>>   PUBLICATION allpub;
>>>
>>> CREATE SUBSCRIPTION allsub2
>>>   CONNECTION 'host=10.128.0.26 user=eax dbname=eax'
>>>   PUBLICATION allpub;
>>> ```
>>>
>>> So basically it's two masters, one replica configuration. To resolve
>>> insert/update conflicts I've created the following triggers on inst3:
>>>
>>> ```
>>> CREATE OR REPLACE FUNCTION test_before_insert()
>>> RETURNS trigger AS $$
>>> BEGIN
>>>
>>> RAISE NOTICE 'test_before_insert trigger executed';
>>>
>>> IF EXISTS (SELECT 1 FROM test where k = new.k) THEN
>>>RAISE NOTICE 'test_before_insert trigger - merging data';
>>>UPDATE test SET v = v || ';' || new.v WHERE k = new.k;
>>>RETURN NULL;
>>> END IF;
>>>
>>> RETURN NEW;
>>>
>>> END
>>> $$ LANGUAGE plpgsql;
>>>
>>>
>>> CREATE OR REPLACE FUNCTION test_before_update()
>>> RETURNS trigger AS $$
>>> BEGIN
>>>
>>> RAISE NOTICE 'test_before_update trigger executed';
>>>
>>> IF EXISTS (SELECT 1 FROM test where k = new.k) THEN
>>>RAISE NOTICE 'test_before_update trigger - merging data';
>>>UPDATE test SET v = v || ';' || new.v WHERE k = new.k;
>>>DELETE FROM test where k = old.k;
>>>RETURN NULL;
>>> END IF;
>>>
>>> RETURN NEW;
>>>
>>> END
>>> $$ LANGUAGE plpgsql;
>>>
>>> create trigger test_before_insert_trigger
>>> before insert on test
>>> for each row execute procedure test_before_insert();
>>>
>>> create trigger test_before_update_trigger
>>> before update of k on test
>>> for each row execute procedure test_before_update();
>>>
>>> ALTER TABLE test ENABLE REPLICA TRIGGER test_before_insert_trigger;
>>> ALTER TABLE test ENABLE REPLICA TRIGGER test_before_update_trigger;
>>> ```
>>>
>>> The INSERT trigger works just as expected, however the UPDATE trigger
>>> doesn't. On inst1:
>>>
>>> ```
>>> insert into test values ('k1', 'v1');
>>> ```
>>>
>>> In inst2:
>>>
>>> ```
>>> insert into test values ('k4', 'v4');
>>> update test set k = 'k1' where k = 'k4';
>>> ```
>>>
>>> Now on inst3:
>>>
>>> ```
>>> select * from test;
>>> ```
>>>
>>> Expected result
>>> ---
>>>
>>> Rows are merged to:
>>>
>>> ```
>>>  k  |   v
>>> +---
>>>  k1 | v1;v4
>>> ```
>>>
>>> This is what would happen if all insert/update queries would have been
>>> executed on one instance.
>>>
>>> Actual result
>>> -
>>>
>>> Replication fails, log contains:
>>>
>>> ```
>>> [3227] ERROR:  duplicate key value violates unique constraint "test_pkey"
>>> [3227] DETAIL:  Key (k)=(k1) already exists.
>>> [3176] LOG:  worker process: logical replication worker for subscription 
>>> 16402 (PID 3227) exited with exit code 1
>>> ```
>>>
>>> What do you think?
>>>
>>
>> I think the cause of this issue is that the apply worker doesn't set
>> updatedCols of RangeTblEntry when applying updates. So TriggerEnabled
>> always ends up with false. I'll make a patch and submit.
>>
> 
> Attached patch store the updated columns bitmap set to RangeTblEntry.
> In my environment this bug seems to be fixed by the patch.
>
Hi,

let me start by saying that my view is that this is simply a
documentation bug. Meaning that I didn't document that it does not work,
but I also never intended it to work. Main reason is that we can't
support the semantics of "UPDATE OF" correctly (see bellow). And I think
it's better to not support something at all rather than making it behave
differently in different cases.

Now about the proposed patch, I don't think this is correct way to
support this as it will only work when either PRIMARY KEY column was
changed or when REPLICA IDENTITY is set to FULL for the table. And even
then it will have very different semantics from how it works when the
row is updated by SQL statement (all non-toasted columns will be
reported as changed regardless of actually being changed or not).

The more proper way to do this would be to run data comparison of the
new tuple and the existing tuple values which a) will have different
behavior than normal "UPDATE OF" triggers have because we still can't
tell what columns were mentioned in the original query and b) will not
exactly help performance (and perhaps c) one can write the trigger to
behave this way already without impacting all other use-cases).

-- 
  Petr Jelinek  http://www.2ndQua

[HACKERS] Re: [BUGS] 10.0: Logical replication doesn't execute BEFORE UPDATE OF trigger

2017-10-10 Thread Aleksander Alekseev
Hi Masahiko,

> > I think the cause of this issue is that the apply worker doesn't set
> > updatedCols of RangeTblEntry when applying updates. So TriggerEnabled
> > always ends up with false. I'll make a patch and submit.
> >
> 
> Attached patch store the updated columns bitmap set to RangeTblEntry.
> In my environment this bug seems to be fixed by the patch.

Thanks a lot for a quick response. I can confirm that your patch fixes
the issue and passes all tests. Hopefully someone will merge it shortly.

Here is another patch from me. It adds a corresponding TAP test. Before
applying your patch:

```
t/001_rep_changes.pl .. ok
t/002_types.pl  ok   
t/003_constraints.pl .. 1/5
#   Failed test 'check replica trigger with specified list of affected columns 
applied on subscriber'
#   at t/003_constraints.pl line 151.
#  got: 'k2|v1'
# expected: 'k2|v1
# triggered|true'
# Looks like you failed 1 test of 5.
t/003_constraints.pl .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/5 subtests
t/004_sync.pl . ok
t/005_encoding.pl . ok
t/006_rewrite.pl .. ok
t/007_ddl.pl .. ok
```

After:

```
t/001_rep_changes.pl .. ok
t/002_types.pl  ok
t/003_constraints.pl .. ok
t/004_sync.pl . ok
t/005_encoding.pl . ok
t/006_rewrite.pl .. ok
t/007_ddl.pl .. ok
All tests successful.
```

-- 
Best regards,
Aleksander Alekseev
diff --git a/src/test/subscription/t/003_constraints.pl b/src/test/subscription/t/003_constraints.pl
index 06863aef84..f1fc5ae863 100644
--- a/src/test/subscription/t/003_constraints.pl
+++ b/src/test/subscription/t/003_constraints.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 4;
+use Test::More tests => 5;
 
 # Initialize publisher node
 my $node_publisher = get_new_node('publisher');
@@ -21,6 +21,9 @@ $node_publisher->safe_psql('postgres',
 $node_publisher->safe_psql('postgres',
 "CREATE TABLE tab_fk_ref (id int PRIMARY KEY, bid int REFERENCES tab_fk (bid));"
 );
+$node_publisher->safe_psql('postgres',
+"CREATE TABLE tab_upd_tst (k TEXT PRIMARY KEY, v TEXT);"
+);
 
 # Setup structure on subscriber
 $node_subscriber->safe_psql('postgres',
@@ -28,6 +31,9 @@ $node_subscriber->safe_psql('postgres',
 $node_subscriber->safe_psql('postgres',
 "CREATE TABLE tab_fk_ref (id int PRIMARY KEY, bid int REFERENCES tab_fk (bid));"
 );
+$node_subscriber->safe_psql('postgres',
+"CREATE TABLE tab_upd_tst (k TEXT PRIMARY KEY, v TEXT);"
+);
 
 # Setup logical replication
 my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
@@ -112,5 +118,38 @@ $result = $node_subscriber->safe_psql('postgres',
 	"SELECT count(*), min(bid), max(bid) FROM tab_fk_ref;");
 is($result, qq(2|1|2), 'check replica trigger applied on subscriber');
 
+# Add replica trigger with specified list of affected columns
+$node_subscriber->safe_psql(
+	'postgres', qq{
+CREATE OR REPLACE FUNCTION upd_fn()
+RETURNS trigger AS \$\$
+BEGIN
+INSERT INTO tab_upd_tst VALUES ('triggered', 'true');
+RETURN NEW;
+END
+\$\$ LANGUAGE plpgsql;
+
+CREATE TRIGGER upd_trg
+BEFORE UPDATE OF k ON tab_upd_tst
+FOR EACH ROW EXECUTE PROCEDURE upd_fn();
+
+ALTER TABLE tab_upd_tst ENABLE REPLICA TRIGGER upd_trg;
+});
+
+# Insert data
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO tab_upd_tst (k, v) VALUES ('k1', 'v1');");
+$node_publisher->safe_psql('postgres',
+	"UPDATE tab_upd_tst SET k = 'k2' WHERE k = 'k1';");
+
+$node_publisher->poll_query_until('postgres', $caughtup_query)
+  or die "Timed out while waiting for subscriber to catch up";
+
+# Trigger should be executed
+$result =
+  $node_subscriber->safe_psql('postgres', "SELECT k, v FROM tab_upd_tst ORDER BY k");
+is( $result, qq(k2|v1
+triggered|true), 'check replica trigger with specified list of affected columns applied on subscriber');
+
 $node_subscriber->stop('fast');
 $node_publisher->stop('fast');


signature.asc
Description: PGP signature


[HACKERS] Re: [BUGS] 10.0: Logical replication doesn't execute BEFORE UPDATE OF trigger

2017-10-10 Thread Masahiko Sawada
On Tue, Oct 10, 2017 at 11:29 AM, Masahiko Sawada  wrote:
> On Mon, Oct 9, 2017 at 11:13 PM, Aleksander Alekseev
>  wrote:
>> Hi hackers,
>>
>> I've found something that looks like a bug.
>>
>> Steps to reproduce
>> --
>>
>> There are 3 instances of PostgreSQL 10.0 - inst1, inst2 and inst3. There
>> is a table `test` on every instance:
>>
>> ```
>> CREATE TABLE test(k TEXT PRIMARY KEY, v TEXT);
>> ```
>>
>> Both inst1 and inst2 have `allpub` publication:
>>
>> ```
>> CREATE PUBLICATION allpub FOR ALL TABLES;
>> ```
>>
>> ... and inst3 is subscribed for both publications:
>>
>> ```
>> CREATE SUBSCRIPTION allsub1
>>   CONNECTION 'host=10.128.0.16 user=eax dbname=eax'
>>   PUBLICATION allpub;
>>
>> CREATE SUBSCRIPTION allsub2
>>   CONNECTION 'host=10.128.0.26 user=eax dbname=eax'
>>   PUBLICATION allpub;
>> ```
>>
>> So basically it's two masters, one replica configuration. To resolve
>> insert/update conflicts I've created the following triggers on inst3:
>>
>> ```
>> CREATE OR REPLACE FUNCTION test_before_insert()
>> RETURNS trigger AS $$
>> BEGIN
>>
>> RAISE NOTICE 'test_before_insert trigger executed';
>>
>> IF EXISTS (SELECT 1 FROM test where k = new.k) THEN
>>RAISE NOTICE 'test_before_insert trigger - merging data';
>>UPDATE test SET v = v || ';' || new.v WHERE k = new.k;
>>RETURN NULL;
>> END IF;
>>
>> RETURN NEW;
>>
>> END
>> $$ LANGUAGE plpgsql;
>>
>>
>> CREATE OR REPLACE FUNCTION test_before_update()
>> RETURNS trigger AS $$
>> BEGIN
>>
>> RAISE NOTICE 'test_before_update trigger executed';
>>
>> IF EXISTS (SELECT 1 FROM test where k = new.k) THEN
>>RAISE NOTICE 'test_before_update trigger - merging data';
>>UPDATE test SET v = v || ';' || new.v WHERE k = new.k;
>>DELETE FROM test where k = old.k;
>>RETURN NULL;
>> END IF;
>>
>> RETURN NEW;
>>
>> END
>> $$ LANGUAGE plpgsql;
>>
>> create trigger test_before_insert_trigger
>> before insert on test
>> for each row execute procedure test_before_insert();
>>
>> create trigger test_before_update_trigger
>> before update of k on test
>> for each row execute procedure test_before_update();
>>
>> ALTER TABLE test ENABLE REPLICA TRIGGER test_before_insert_trigger;
>> ALTER TABLE test ENABLE REPLICA TRIGGER test_before_update_trigger;
>> ```
>>
>> The INSERT trigger works just as expected, however the UPDATE trigger
>> doesn't. On inst1:
>>
>> ```
>> insert into test values ('k1', 'v1');
>> ```
>>
>> In inst2:
>>
>> ```
>> insert into test values ('k4', 'v4');
>> update test set k = 'k1' where k = 'k4';
>> ```
>>
>> Now on inst3:
>>
>> ```
>> select * from test;
>> ```
>>
>> Expected result
>> ---
>>
>> Rows are merged to:
>>
>> ```
>>  k  |   v
>> +---
>>  k1 | v1;v4
>> ```
>>
>> This is what would happen if all insert/update queries would have been
>> executed on one instance.
>>
>> Actual result
>> -
>>
>> Replication fails, log contains:
>>
>> ```
>> [3227] ERROR:  duplicate key value violates unique constraint "test_pkey"
>> [3227] DETAIL:  Key (k)=(k1) already exists.
>> [3176] LOG:  worker process: logical replication worker for subscription 
>> 16402 (PID 3227) exited with exit code 1
>> ```
>>
>> What do you think?
>>
>
> I think the cause of this issue is that the apply worker doesn't set
> updatedCols of RangeTblEntry when applying updates. So TriggerEnabled
> always ends up with false. I'll make a patch and submit.
>

Attached patch store the updated columns bitmap set to RangeTblEntry.
In my environment this bug seems to be fixed by the patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


set_updated_columns.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [BUGS] 10.0: Logical replication doesn't execute BEFORE UPDATE OF trigger

2017-10-09 Thread Masahiko Sawada
On Mon, Oct 9, 2017 at 11:13 PM, Aleksander Alekseev
 wrote:
> Hi hackers,
>
> I've found something that looks like a bug.
>
> Steps to reproduce
> --
>
> There are 3 instances of PostgreSQL 10.0 - inst1, inst2 and inst3. There
> is a table `test` on every instance:
>
> ```
> CREATE TABLE test(k TEXT PRIMARY KEY, v TEXT);
> ```
>
> Both inst1 and inst2 have `allpub` publication:
>
> ```
> CREATE PUBLICATION allpub FOR ALL TABLES;
> ```
>
> ... and inst3 is subscribed for both publications:
>
> ```
> CREATE SUBSCRIPTION allsub1
>   CONNECTION 'host=10.128.0.16 user=eax dbname=eax'
>   PUBLICATION allpub;
>
> CREATE SUBSCRIPTION allsub2
>   CONNECTION 'host=10.128.0.26 user=eax dbname=eax'
>   PUBLICATION allpub;
> ```
>
> So basically it's two masters, one replica configuration. To resolve
> insert/update conflicts I've created the following triggers on inst3:
>
> ```
> CREATE OR REPLACE FUNCTION test_before_insert()
> RETURNS trigger AS $$
> BEGIN
>
> RAISE NOTICE 'test_before_insert trigger executed';
>
> IF EXISTS (SELECT 1 FROM test where k = new.k) THEN
>RAISE NOTICE 'test_before_insert trigger - merging data';
>UPDATE test SET v = v || ';' || new.v WHERE k = new.k;
>RETURN NULL;
> END IF;
>
> RETURN NEW;
>
> END
> $$ LANGUAGE plpgsql;
>
>
> CREATE OR REPLACE FUNCTION test_before_update()
> RETURNS trigger AS $$
> BEGIN
>
> RAISE NOTICE 'test_before_update trigger executed';
>
> IF EXISTS (SELECT 1 FROM test where k = new.k) THEN
>RAISE NOTICE 'test_before_update trigger - merging data';
>UPDATE test SET v = v || ';' || new.v WHERE k = new.k;
>DELETE FROM test where k = old.k;
>RETURN NULL;
> END IF;
>
> RETURN NEW;
>
> END
> $$ LANGUAGE plpgsql;
>
> create trigger test_before_insert_trigger
> before insert on test
> for each row execute procedure test_before_insert();
>
> create trigger test_before_update_trigger
> before update of k on test
> for each row execute procedure test_before_update();
>
> ALTER TABLE test ENABLE REPLICA TRIGGER test_before_insert_trigger;
> ALTER TABLE test ENABLE REPLICA TRIGGER test_before_update_trigger;
> ```
>
> The INSERT trigger works just as expected, however the UPDATE trigger
> doesn't. On inst1:
>
> ```
> insert into test values ('k1', 'v1');
> ```
>
> In inst2:
>
> ```
> insert into test values ('k4', 'v4');
> update test set k = 'k1' where k = 'k4';
> ```
>
> Now on inst3:
>
> ```
> select * from test;
> ```
>
> Expected result
> ---
>
> Rows are merged to:
>
> ```
>  k  |   v
> +---
>  k1 | v1;v4
> ```
>
> This is what would happen if all insert/update queries would have been
> executed on one instance.
>
> Actual result
> -
>
> Replication fails, log contains:
>
> ```
> [3227] ERROR:  duplicate key value violates unique constraint "test_pkey"
> [3227] DETAIL:  Key (k)=(k1) already exists.
> [3176] LOG:  worker process: logical replication worker for subscription 
> 16402 (PID 3227) exited with exit code 1
> ```
>
> What do you think?
>

I think the cause of this issue is that the apply worker doesn't set
updatedCols of RangeTblEntry when applying updates. So TriggerEnabled
always ends up with false. I'll make a patch and submit.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers