Srinath Reddy Sadipiralla <[email protected]> wrote:

> TRAP: failed Assert("RelationGetRelid(relation) == ((RepackDecodingState *)
> ctx->output_writer_private)->relid"), File: "pgoutput_repack.c",
> Line: 97, PID: 397007
>
> This crash happens if we run REPACK (concurrently) on a table while a heavy
> pgbench workload is concurrently executing multi-table (setup.sql)
> transactions (dual_chaos.sql).
> It triggers after a few back-to-back REPACK (concurrently) runs.
>
> I think I found the cause of this crash: some changes slipped past the
> change_useless_for_repack filter, so changes unrelated to the relation
> we are currently running REPACK (concurrently) on got decoded and added
> to the reorderbuffer queue. The reason is that
> repacked_rel_locator.relNumber is set to InvalidOid by default. It is
> set to the target relation during setup_logical_decoding, but this is
> done after DecodingContextFindStartpoint. In
> DecodingContextFindStartpoint, changes are not filtered even if they are
> not related to the target relation, because in
> rm_decode->change_useless_for_repack->am_decoding_for_repack
> repacked_rel_locator.relNumber is still InvalidOid, which makes it skip
> the filtering even when the change is not for the target relation.
> Such a change is therefore added to the reorder buffer queue, and during
> processing of the reorder buffer plugin_change is called, where the
> assert fails. I have attached a diff patch to solve this.

Thanks a lot! Yes, your explanation makes sense. I'll include the fix in the
next version. I think it might also explain the other crash [1] you reported
earlier. I'll try to reproduce that.

[1] https://www.postgresql.org/message-id/CAFC%2Bb6o2yzA80YmfEhmMO9puN8qvGRvr-15BBLn3UmJxPfpr2w%40mail.gmail.com

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com
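
For readers following the thread, the ordering problem described in the quoted analysis can be illustrated with a small standalone model. This is only a sketch and not the actual patch or PostgreSQL code: the function names are borrowed from the quoted message, but their bodies, the plain-integer Oid type, the repacked_rel variable and the count_queued helper are simplified assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; in PostgreSQL these come from the server headers. */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical simplification of repacked_rel_locator.relNumber. */
static Oid repacked_rel = InvalidOid;

/* Assumed behaviour: with no target set, we do not look like a REPACK decoder. */
static bool
am_decoding_for_repack(void)
{
    return repacked_rel != InvalidOid;
}

/* A change is "useless" only if filtering is active and the relation differs. */
static bool
change_useless_for_repack(Oid change_rel)
{
    if (!am_decoding_for_repack())
        return false;           /* filter skipped: every change passes */
    return change_rel != repacked_rel;
}

/*
 * Hypothetical helper: count how many changes would be queued when the
 * target relation has the given value at decode time.
 */
static int
count_queued(Oid target_at_decode_time, const Oid *rels, int n)
{
    int queued = 0;

    assert(n >= 0);
    repacked_rel = target_at_decode_time;
    for (int i = 0; i < n; i++)
    {
        if (!change_useless_for_repack(rels[i]))
            queued++;
    }
    return queued;
}
```

In this model, calling count_queued with InvalidOid as the target (the state while the start point is still being searched for) lets changes for every relation through, while calling it with the real target keeps only the matching ones. That corresponds to the unrelated changes that later reach plugin_change and trip the assertion.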
