In heapam_relation_copy_for_cluster(), begin_heap_rewrite() sets rwstate->rs_new_rel->rd_smgr correctly, but the next line calls tuplesort_begin_cluster(), which triggers a system cache invalidation. With the CLOBBER_CACHE_ALWAYS (CCA) setting, that invalidation wipes out rwstate->rs_new_rel->rd_smgr. The pointer is not restored for the subsequent operations, and that causes the segmentation fault.
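To make the failure mode above concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code: ToyRelation, ToySmgr, and every toy_* function are hypothetical stand-ins for the relcache entry, its rd_smgr pointer, a cache invalidation, RelationOpenSmgr(), and smgrimmedsync(). As a further assumption for illustration, the toy sync returns -1 on a NULL handle, where the real code dereferences it and crashes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real PostgreSQL structs. */
typedef struct ToySmgr
{
	int			nsyncs;			/* how many times the handle was synced */
} ToySmgr;

typedef struct ToyRelation
{
	ToySmgr    *rd_smgr;		/* cached handle; may be reset to NULL */
} ToyRelation;

/* Backing object that survives an invalidation of the cached pointer. */
static ToySmgr toy_storage;

/* Models RelationOpenSmgr(): re-establish the cached pointer if missing. */
static void
toy_open_smgr(ToyRelation *rel)
{
	if (rel->rd_smgr == NULL)
		rel->rd_smgr = &toy_storage;
}

/* Models a cache invalidation that clears the cached pointer. */
static void
toy_invalidate(ToyRelation *rel)
{
	rel->rd_smgr = NULL;
}

/*
 * Models smgrimmedsync(). Assumption for this sketch only: report a NULL
 * handle with -1 instead of dereferencing it.
 */
static int
toy_immedsync(ToySmgr *smgr)
{
	if (smgr == NULL)
		return -1;
	smgr->nsyncs++;
	return 0;
}

/* Unpatched shape: trusts that the cached pointer is still valid. */
static int
toy_end_rewrite_unpatched(ToyRelation *rel)
{
	return toy_immedsync(rel->rd_smgr);
}

/* Patched shape: reopen the handle immediately before using it. */
static int
toy_end_rewrite_patched(ToyRelation *rel)
{
	toy_open_smgr(rel);
	return toy_immedsync(rel->rd_smgr);
}

/* Returns 0 when the unpatched path fails and the patched path succeeds. */
static int
toy_demo(void)
{
	ToyRelation rel = {NULL};
	int			unpatched;
	int			patched;

	toy_open_smgr(&rel);		/* pointer set up at the start of the rewrite */
	toy_invalidate(&rel);		/* invalidation arrives in between */

	unpatched = toy_end_rewrite_unpatched(&rel);
	patched = toy_end_rewrite_patched(&rel);

	assert(rel.rd_smgr != NULL);
	return (unpatched == -1 && patched == 0) ? 0 : 1;
}
```

Calling toy_demo() from a main() returns 0: the unguarded path sees a NULL handle, while the path that reopens first succeeds. The point is only that a cached pointer which an invalidation may clear has to be re-established right before each use, not assumed to persist.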
Calling RelationOpenSmgr() before calling smgrimmedsync() in end_heap_rewrite() fixes the failure. The attached patch does that.

Regards,
Amul

On Mon, Mar 22, 2021 at 11:53 AM Neha Sharma <neha.sha...@enterprisedb.com> wrote:
>
> Hello,
>
> While executing the below test case server crashed with Segfault 11 on master
> branch.
> I have enabled the CLOBBER_CACHE_ALWAYS in src/include/pg_config_manual.h
>
> Issue is only reproducing on master branch.
>
> Test Case:
> CREATE TABLE sm_5_323_table (col1 numeric);
> CREATE INDEX sm_5_323_idx ON sm_5_323_table(col1);
>
> CLUSTER sm_5_323_table USING sm_5_323_idx;
>
> \! /PGClobber_build/postgresql/inst/bin/clusterdb -t sm_5_323_table -U edb -h
> localhost -p 5432 -d postgres
>
> Test case output:
> edb@edb:~/PGClobber_build/postgresql/inst/bin$ ./psql postgres
> psql (14devel)
> Type "help" for help.
>
> postgres=# CREATE TABLE sm_5_323_table (col1 numeric);
> CREATE TABLE
> postgres=# CREATE INDEX sm_5_323_idx ON sm_5_323_table(col1);
> CREATE INDEX
> postgres=# CLUSTER sm_5_323_table USING sm_5_323_idx;
> CLUSTER
> postgres=# \! /PGClobber_build/postgresql/inst/bin/clusterdb -t
> sm_5_323_table -U edb -h localhost -p 5432 -d postgres
> clusterdb: error: clustering of table "sm_5_323_table" in database "postgres"
> failed: server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
>
> Stack Trace:
> Core was generated by `postgres: edb postgres 127.0.0.1(50978) CLUSTER
> '.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x000055e5c85ea0b4 in mdopenfork (reln=0x0, forknum=MAIN_FORKNUM,
> behavior=1) at md.c:485
> 485 if (reln->md_num_open_segs[forknum] > 0)
> (gdb) bt
> #0 0x000055e5c85ea0b4 in mdopenfork (reln=0x0, forknum=MAIN_FORKNUM,
> behavior=1) at md.c:485
> #1 0x000055e5c85eb2f0 in mdnblocks (reln=0x0, forknum=MAIN_FORKNUM) at
> md.c:768
> #2 0x000055e5c85eb61b in mdimmedsync (reln=0x0,
> forknum=forknum@entry=MAIN_FORKNUM) at md.c:930
> #3 0x000055e5c85ec6e5 in smgrimmedsync (reln=<optimized out>,
> forknum=forknum@entry=MAIN_FORKNUM) at smgr.c:662
> #4 0x000055e5c81ae28b in end_heap_rewrite (state=state@entry=0x55e5ca5d1d70)
> at rewriteheap.c:342
> #5 0x000055e5c81a32ea in heapam_relation_copy_for_cluster
> (OldHeap=0x7f212ce41ba0, NewHeap=0x7f212ce41058, OldIndex=<optimized out>,
> use_sort=<optimized out>, OldestXmin=<optimized out>,
> xid_cutoff=<optimized out>, multi_cutoff=0x7ffcba6ebe64,
> num_tuples=0x7ffcba6ebe68, tups_vacuumed=0x7ffcba6ebe70,
> tups_recently_dead=0x7ffcba6ebe78) at heapam_handler.c:984
> #6 0x000055e5c82f218a in table_relation_copy_for_cluster
> (tups_recently_dead=0x7ffcba6ebe78, tups_vacuumed=0x7ffcba6ebe70,
> num_tuples=0x7ffcba6ebe68, multi_cutoff=0x7ffcba6ebe64,
> xid_cutoff=0x7ffcba6ebe60, OldestXmin=<optimized out>,
> use_sort=<optimized out>, OldIndex=0x7f212ce40670, NewTable=0x7f212ce41058,
> OldTable=0x7f212ce41ba0)
> at ../../../src/include/access/tableam.h:1656
> #7 copy_table_data (pCutoffMulti=<synthetic pointer>, pFreezeXid=<synthetic
> pointer>, pSwapToastByContent=<synthetic pointer>, verbose=<optimized out>,
> OIDOldIndex=<optimized out>,
> OIDOldHeap=16384, OIDNewHeap=<optimized out>) at cluster.c:908
> #8 rebuild_relation (verbose=<optimized out>, indexOid=<optimized out>,
> OldHeap=<optimized out>) at cluster.c:604
> #9 cluster_rel (tableOid=<optimized out>, indexOid=<optimized out>,
> params=<optimized out>) at cluster.c:427
> #10 0x000055e5c82f2b7f in cluster (pstate=pstate@entry=0x55e5ca5315c0,
> stmt=stmt@entry=0x55e5ca510368, isTopLevel=isTopLevel@entry=true) at
> cluster.c:195
> #11 0x000055e5c85fcbc6 in standard_ProcessUtility (pstmt=0x55e5ca510430,
> queryString=0x55e5ca50f850 "CLUSTER public.sm_5_323_table;",
> context=PROCESS_UTILITY_TOPLEVEL, params=0x0,
> queryEnv=0x0, dest=0x55e5ca510710, qc=0x7ffcba6ec340) at utility.c:822
> #12 0x000055e5c85fd436 in ProcessUtility (pstmt=pstmt@entry=0x55e5ca510430,
> queryString=<optimized out>, context=context@entry=PROCESS_UTILITY_TOPLEVEL,
> params=<optimized out>,
> queryEnv=<optimized out>, dest=dest@entry=0x55e5ca510710,
> qc=0x7ffcba6ec340) at utility.c:525
> #13 0x000055e5c85f6148 in PortalRunUtility
> (portal=portal@entry=0x55e5ca570d70, pstmt=pstmt@entry=0x55e5ca510430,
> isTopLevel=isTopLevel@entry=true,
> setHoldSnapshot=setHoldSnapshot@entry=false,
> dest=dest@entry=0x55e5ca510710, qc=qc@entry=0x7ffcba6ec340) at pquery.c:1159
> #14 0x000055e5c85f71a4 in PortalRunMulti (portal=portal@entry=0x55e5ca570d70,
> isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false,
> dest=dest@entry=0x55e5ca510710, altdest=altdest@entry=0x55e5ca510710,
> qc=qc@entry=0x7ffcba6ec340) at pquery.c:1305
> #15 0x000055e5c85f8823 in PortalRun (portal=portal@entry=0x55e5ca570d70,
> count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
> run_once=run_once@entry=true,
> dest=dest@entry=0x55e5ca510710, altdest=altdest@entry=0x55e5ca510710,
> qc=0x7ffcba6ec340) at pquery.c:779
> #16 0x000055e5c85f389e in exec_simple_query (query_string=0x55e5ca50f850
> "CLUSTER public.sm_5_323_table;") at postgres.c:1185
> #17 0x000055e5c85f51cf in PostgresMain (argc=argc@entry=1,
> argv=argv@entry=0x7ffcba6ec670, dbname=<optimized out>, username=<optimized
> out>) at postgres.c:4415
> #18 0x000055e5c8522240 in BackendRun (port=<optimized out>, port=<optimized
> out>) at postmaster.c:4470
> #19 BackendStartup (port=<optimized out>) at postmaster.c:4192
> #20 ServerLoop () at postmaster.c:1737
> #21 0x000055e5c85237ec in PostmasterMain (argc=<optimized out>,
> argv=0x55e5ca508fe0) at postmaster.c:1409
> #22 0x000055e5c811a2cf in main (argc=3, argv=0x55e5ca508fe0) at main.c:209
>
> Thanks.
> --
> Regards,
> Neha Sharma
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 8241ba8f312..b44343a0166 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -339,7 +339,10 @@ end_heap_rewrite(RewriteState state)
 	 * wrote before the checkpoint.
 	 */
 	if (RelationNeedsWAL(state->rs_new_rel))
+	{
+		RelationOpenSmgr(state->rs_new_rel);
 		smgrimmedsync(state->rs_new_rel->rd_smgr, MAIN_FORKNUM);
+	}
 
 	logical_end_heap_rewrite(state);