On Tue, Oct 1, 2024 at 7:07 PM Takeshi Ideriha <iderihatake...@gmail.com> wrote: > > >We forgot to set/unset the flag in functions > >systable_beginscan_ordered and systable_endscan_ordered. BTW, > > Thank you for the clarification. > > >shouldn't this occur even without prepare transaction? If so, we need > >to backpatch this till 14. > > Yes, it occurred also at PG14. > I did some tests using 015_stream.pl with some modification like > below, which tests the subscription about stream is on but two-phase > is off. > The same issue occurred at both current head source code and PG14. > > ``` > --- a/src/test/subscription/t/015_stream.pl > +++ b/src/test/subscription/t/015_stream.pl > @@ -134,9 +134,11 @@ my $node_subscriber = > PostgreSQL::Test::Cluster->new('subscriber'); > $node_subscriber->init; > $node_subscriber->start; > > +my $default = join('', map {chr(65 + rand 26)} (1 .. 10000)); > + > # Create some preexisting content on publisher > $node_publisher->safe_psql('postgres', > - "CREATE TABLE test_tab (a int primary key, b bytea)"); > + "CREATE TABLE test_tab (a int primary key, b bytea, t text > DEFAULT '$default')"); > $node_publisher->safe_psql('postgres', > "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')"); > > @@ -144,7 +146,7 @@ $node_publisher->safe_psql('postgres', "CREATE > TABLE test_tab_2 (a int)"); > > # Setup structure on subscriber > $node_subscriber->safe_psql('postgres', > - "CREATE TABLE test_tab (a int primary key, b bytea, c > timestamptz DEFAULT now(), d bigint DEFAULT 999)" > + "CREATE TABLE test_tab (a int primary key, b bytea, t text > DEFAULT '$default', c timestamptz DEFAULT now(), d bigint DEFAULT > 999)" > ); > ``` > > Logs from head source: > ``` > 2024-10-01 12:34:56.694 UTC [575202] LOG: starting PostgreSQL 18devel > on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat > 11.4.1-2), 64-bit > ... > 2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement: BEGIN; > 2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement: > INSERT INTO test_tab SELECT i, sha256(i::text::bytea) FROM > generate_series(3, 5000) s(i); > 2024-10-01 12:34:57.524 UTC [575242] tap_sub ERROR: unexpected > table_index_fetch_tuple call during logical decoding > 2024-10-01 12:34:57.524 UTC [575242] tap_sub STATEMENT: > START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4', > streaming 'on', origin 'any', publication_names '"tap_pub"') > 2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: released logical > replication slot "tap_sub" > 2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: could not send data > to client: Broken pipe > 2024-10-01 12:34:57.525 UTC [575242] tap_sub FATAL: connection to client lost > ... > 2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT: > START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4', > streaming 'on', origin 'any', publication_names '"tap_pub"') > 2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: starting logical > decoding for slot "tap_sub" > 2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: Streaming > transactions committing after 0/1583A68, reading WAL from 0/1583A30. > 2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT: > START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4', > streaming 'on', origin 'any', publication_names '"tap_pub"') > 2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: logical decoding > found consistent point at 0/1583A30 > 2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: There are no > running transactions. > 2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT: > START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4', > streaming 'on', origin 'any', publication_names '"tap_pub"') > 2024-10-01 12:34:57.831 UTC [575260] tap_sub ERROR: unexpected > table_index_fetch_tuple call during logical decoding > 2024-10-01 12:34:57.831 UTC [575260] tap_sub STATEMENT: > START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4', > streaming 'on', origin 'any', publication_names '"tap_pub"') > 2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: released logical > replication slot "tap_sub" > 2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: could not send data > to client: Broken pipe > 2024-10-01 12:34:57.832 UTC [575260] tap_sub FATAL: connection to client lost > ``` > > Logs from PG14 source: > > ``` > 2024-10-01 13:20:29.409 UTC [593696] LOG: starting PostgreSQL 14.9 on > x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat > 11.4.1-2), 64-bit > ... > 2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement: BEGIN; > 2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement: > INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3, > 5000) s(i); > 2024-10-01 13:20:31.301 UTC [593740] tap_sub ERROR: unexpected > table_index_fetch_tuple call during logical decoding > 2024-10-01 13:20:31.301 UTC [593740] tap_sub STATEMENT: > START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2', > streaming 'on', publication_names '"tap_pub"') > 2024-10-01 13:20:31.302 UTC [593740] tap_sub LOG: could not send data > to client: Broken pipe > 2024-10-01 13:20:31.302 UTC [593740] tap_sub FATAL: connection to client lost > ... > 2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT: > START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2', > streaming 'on', publication_names '"tap_pub"') > 2024-10-01 13:20:31.830 UTC [593756] tap_sub LOG: logical decoding > found consistent point at 0/172C758 > 2024-10-01 13:20:31.830 UTC [593756] tap_sub DETAIL: There are no > running transactions. > 2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT: > START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2', > streaming 'on', publication_names '"tap_pub"') > 2024-10-01 13:20:31.834 UTC [593756] tap_sub ERROR: unexpected > table_index_fetch_tuple call during logical decoding > 2024-10-01 13:20:31.834 UTC [593756] tap_sub STATEMENT: > START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2', > streaming 'on', publication_names '"tap_pub"') > 2024-10-01 13:20:31.835 UTC [593756] tap_sub LOG: could not send data > to client: Broken pipe > 2024-10-01 13:20:31.835 UTC [593756] tap_sub FATAL: connection to client lost > ``` > > >Also, it is better to have a test for this, and let's ensure that the > >new test doesn't increase the regression time too much if possible. > > Sure. I'm going to add some test codes in a few days. >
The above test shown in the email will work to test this issue. However, we should use 'debug_logical_replication_streaming' as that will help to reproduce the issue with just one write operation. Is it possible to add the test in 018_stream_subxact_abort where we already use it? -- With Regards, Amit Kapila.