Moved to -hackers.

On Sat, Jul 29, 2017 at 4:35 AM, Scott Milliken <sc...@deltaex.com> wrote:
> Thank you Masahiko! I've tested and confirmed that this patch fixes the
> problem.
>

Thank you for testing. This issue should be added to the open items
list, since it causes a server crash. I'll add it.
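
In case it helps anyone else retest with the patch applied, here is a
rough sketch of how the subscriber could be checked after rerunning the
reproduction steps quoted below. The function names sync_state_label and
check_testtbl_sync are made up for illustration; the port, user, database
and table names come from the reproduction steps, and the state codes are
the ones documented for pg_subscription_rel.srsubstate.

```shell
# Hypothetical helper (not part of PostgreSQL): map a
# pg_subscription_rel.srsubstate code to a readable label.
sync_state_label() {
  case "$1" in
    i) echo "initialize" ;;
    d) echo "data is being copied" ;;
    s) echo "synchronized" ;;
    r) echo "ready" ;;
    *) echo "unknown" ;;
  esac
}

# Assumes the subscriber from the reproduction steps is running on port 5302.
# Prints the current sync state of testtbl on the subscriber.
check_testtbl_sync() {
  state=$(psql -p 5302 -U test test -Atc \
    "SELECT srsubstate FROM pg_subscription_rel WHERE srrelid = 'testtbl'::regclass;")
  sync_state_label "$state"
}
```

With the fix in place, check_testtbl_sync should eventually print "ready"
and the subscriber log should show no segfault from the table
synchronization worker.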

> On Fri, Jul 28, 2017 at 3:07 AM, Masahiko Sawada <sawada.m...@gmail.com>
> wrote:
>>
>> On Mon, Jul 24, 2017 at 4:22 PM,  <sc...@deltaex.com> wrote:
>> > The following bug has been logged on the website:
>> >
>> > Bug reference:      14758
>> > Logged by:          Scott Milliken
>> > Email address:      sc...@deltaex.com
>> > PostgreSQL version: 10beta2
>> > Operating system:   Linux 4.10.0-27-generic #30~16.04.2-Ubuntu S
>> > Description:
>> >
>> > I'm testing logical replication on 10beta2, and found a segfault that I can
>> > reliably reproduce with an index on a not-actually immutable function.
>> >
>> > Here's the function in question:
>> >
>> > ```
>> > CREATE OR REPLACE FUNCTION public.immutable_random(integer)
>> >  RETURNS double precision
>> >  LANGUAGE sql
>> >  IMMUTABLE
>> > AS $function$SELECT random();
>> > $function$;
>> > ```
>> >
>> > It's not actually immutable since it's calling random (a hack to get an
>> > efficient random sort on a table).
>> >
>> > (Aside: I'd understand if it errored on creation of the index, but would
>> > really prefer to keep using this instead of tablesample because it's fast,
>> > deterministic, and doesn't have sampling biases like the SYSTEM sampling.)
>> >
>> >
>> > Here's full reproduction instructions:
>> >
>> >
>> > Primary:
>> > ```
>> > mkdir -p /tmp/test-seg0
>> > PGPORT=5301 initdb -D /tmp/test-seg0
>> > echo "wal_level = logical" >> /tmp/test-seg0/postgresql.conf
>> > PGPORT=5301 pg_ctl -D /tmp/test-seg0 start
>> > for (( ; ; )); do if pg_isready -d postgres -p 5301; then break; fi; sleep 1; done
>> > psql -p 5301 postgres -c "CREATE USER test WITH PASSWORD 'test'
>> > SUPERUSER
>> > CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;"
>> > createdb -p 5301 -E utf8 test
>> >
>> > psql -p 5301 -U test test -c "CREATE TABLE testtbl (id int, name text);"
>> > psql -p 5301 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT
>> > testtbl_pkey PRIMARY KEY (id);"
>> > psql -p 5301 -U test test -c "CREATE PUBLICATION testpub FOR TABLE
>> > testtbl;"
>> > psql -p 5301 -U test test -c "INSERT INTO testtbl (id, name) VALUES (1,
>> > 'a');"
>> > ```
>> >
>> > Secondary:
>> > ```
>> > mkdir -p /tmp/test-seg1
>> > PGPORT=5302 initdb -D /tmp/test-seg1
>> > PGPORT=5302 pg_ctl -D /tmp/test-seg1 start
>> > for (( ; ; )); do if pg_isready -d postgres -p 5302; then break; fi; sleep 1; done
>> > psql -p 5302 postgres -c "CREATE USER test WITH PASSWORD 'test'
>> > SUPERUSER
>> > CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;"
>> > createdb -p 5302 -E utf8 test
>> >
>> > psql -p 5302 -U test test -c "CREATE TABLE testtbl (id int, name text);"
>> > psql -p 5302 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT
>> > testtbl_pkey PRIMARY KEY (id);"
>> > psql -p 5302 -U test test -c 'CREATE FUNCTION
>> > public.immutable_random(integer) RETURNS double precision LANGUAGE sql
>> > IMMUTABLE AS $function$ SELECT random(); $function$'
>> > psql -p 5302 -U test test -c "CREATE INDEX ix_testtbl_random ON testtbl
>> > USING btree (immutable_random(id));"
>> > psql -p 5302 -U test test -c "CREATE SUBSCRIPTION test0_testpub
>> > CONNECTION
>> > 'port=5301 user=test dbname=test' PUBLICATION testpub;"
>> > ```
>> >
>> > The secondary crashes with a segfault:
>> >
>> > ```
>> > 2017-07-23 23:55:37.961 PDT [4823] LOG:  logical replication table synchronization worker for subscription "test0_testpub", table "testtbl" has started
>> > 2017-07-23 23:55:38.244 PDT [4758] LOG:  worker process: logical replication worker for subscription 16396 sync 16386 (PID 4823) was terminated by signal 11: Segmentation fault
>> > 2017-07-23 23:55:38.244 PDT [4758] LOG:  terminating any other active server processes
>> > 2017-07-23 23:55:38.245 PDT [4763] WARNING:  terminating connection because of crash of another server process
>> > 2017-07-23 23:55:38.245 PDT [4763] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>> > 2017-07-23 23:55:38.245 PDT [4763] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
>> > 2017-07-23 23:55:38.247 PDT [4758] LOG:  all server processes terminated; reinitializing
>> > 2017-07-23 23:55:38.256 PDT [4826] LOG:  database system was interrupted; last known up at 2017-07-23 23:55:36 PDT
>> > 2017-07-23 23:55:38.809 PDT [4826] LOG:  database system was not properly shut down; automatic recovery in progress
>> > 2017-07-23 23:55:38.812 PDT [4826] LOG:  redo starts at 0/173AEA0
>> > 2017-07-23 23:55:38.815 PDT [4826] LOG:  invalid record length at 0/17B50B0: wanted 24, got 0
>> > 2017-07-23 23:55:38.815 PDT [4826] LOG:  redo done at 0/17B5070
>> > 2017-07-23 23:55:38.815 PDT [4826] LOG:  last completed transaction was at log time 2017-07-23 23:55:37.962957-07
>> > ```
>> >
>>
>> Thank you for the report and the precise reproduction steps!
>> I could reproduce this issue, and it seems to me that the cause is
>> that the table sync worker didn't get a snapshot before
>> starting the table copy. The attached patch fixes this problem.
>>
>> Regards,
>>
>> --
>> Masahiko Sawada
>> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> NTT Open Source Software Center
>
>

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


