On Friday, May 21, 2021 3:55 PM I wrote:
> On Thursday, May 20, 2021 9:59 PM Amit Langote
> <amitlangot...@gmail.com> wrote:
> > Here are updated/divided patches.
> Thanks for your updates.
>
> But, I've detected segmentation faults caused by the patch, which can
> happen during 100_bugs.pl in src/test/subscription.
> This happens more than one in ten times.
>
> This problem would be a timing issue and has been introduced by v3 already.
> I used v5 for HEAD also and reproduced this failure, while OSS HEAD doesn't
> reproduce this, even when I executed 100_bugs.pl 200 times in a tight loop.
> I aligned the commit id 4f586fe2 for all check. Below logs are ones I got
> from v3.
>
> * The message of the failure during TAP test.
>
> # Postmaster PID for node "twoways" is 5015 Waiting for replication conn
> testsub's replay_lsn to pass pg_current_wal_lsn() on twoways #
> poll_query_until timed out executing this query:
> # SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming'
> FROM pg_catalog.pg_stat_replication WHERE application_name = 'testsub';
> # expecting this output:
> # t
> # last actual query output:
> #
> # with stderr:
> # psql: error: connection to server on socket
> "/tmp/cs8dhFOtZZ/.s.PGSQL.59345" failed: No such file or directory
> # Is the server running locally and accepting connections on that
> socket?
> timed out waiting for catchup at t/100_bugs.pl line 148.
>
>
> The failure produces core file and its back trace is below.
> My first guess of the cause is that between the timing to get an entry from
> hash_search() in get_rel_sync_entry() and to set the map by
> convert_tuples_by_name() in maybe_send_schema(), we had invalidation
> message, which tries to free unset descs in the entry ?

Sorry, this guess was not accurate at all.
Please ignore it: the descs are freed only when entry->map is set, so an
invalidation message arriving before the map is set cannot free unset descs.
Sorry for the noise.

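To illustrate why the guess does not hold, here is a minimal standalone sketch of the guard. It is not the pgoutput code: ToyDesc, ToyMap, ToyEntry, toy_set_map() and toy_invalidate() are hypothetical names made up for this mail, and the sketch only models the assumption stated above, namely that the descs are freed only when entry->map is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the real PostgreSQL/pgoutput definitions. */
typedef struct ToyDesc
{
	int			natts;
} ToyDesc;

typedef struct ToyMap
{
	ToyDesc    *indesc;
	ToyDesc    *outdesc;
} ToyMap;

typedef struct ToyEntry
{
	bool		schema_sent;
	ToyMap	   *map;			/* NULL until the conversion map is built */
} ToyEntry;

/* Allocate one descriptor, aborting on out-of-memory. */
ToyDesc *
toy_make_desc(int natts)
{
	ToyDesc    *desc = malloc(sizeof(ToyDesc));

	if (desc == NULL)
		abort();
	desc->natts = natts;
	return desc;
}

/* Models the later step in which the map (and its descs) gets set. */
void
toy_set_map(ToyEntry *entry, int natts)
{
	ToyMap	   *map = malloc(sizeof(ToyMap));

	if (map == NULL)
		abort();
	map->indesc = toy_make_desc(natts);
	map->outdesc = toy_make_desc(natts);
	entry->map = map;
}

/*
 * Models an invalidation arriving at an arbitrary time.
 * Returns the number of descs freed; the descs are touched only
 * when the map is set, so an entry without a map is left alone.
 */
int
toy_invalidate(ToyEntry *entry)
{
	int			freed = 0;

	entry->schema_sent = false;
	if (entry->map != NULL)
	{
		free(entry->map->indesc);
		free(entry->map->outdesc);
		free(entry->map);
		entry->map = NULL;
		freed = 2;
	}
	return freed;
}
```

With this shape, calling toy_invalidate() on an entry whose map has not been set yet frees nothing, which is why the window between the lookup and setting the map does not explain the crash.
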
Best Regards,
	Takamichi Osumi