Srinath Reddy Sadipiralla <[email protected]> wrote:

> TRAP: failed Assert("RelationGetRelid(relation) == ((RepackDecodingState *) 
> ctx->output_writer_private)->relid"), File: "pgoutput_repack.c",
> Line: 97, PID: 397007

> This crash happens when we run REPACK (concurrently) on a table while a heavy
> pgbench workload is concurrently executing multi-table (setup.sql)
> transactions (dual_chaos.sql).
> It triggers after a few back-to-back REPACK (concurrently) runs.
> 
> I think I found the cause of this crash. Some changes slipped past the
> change_useless_for_repack filter, so changes unrelated to the relation
> currently being processed by REPACK (concurrently) were decoded and added
> to the reorder buffer queue.
>
> The reason is that repacked_rel_locator.relNumber is InvalidOid by default.
> It is set to the target relation during setup_logical_decoding, but only
> after DecodingContextFindStartpoint. While DecodingContextFindStartpoint
> runs, the call chain
> rm_decode->change_useless_for_repack->am_decoding_for_repack still sees
> repacked_rel_locator.relNumber as InvalidOid, so filtering is skipped even
> for changes that do not belong to the target relation.
>
> Those changes end up in the reorder buffer queue, and when the reorder
> buffer is processed, plugin_change is called for them and the assertion
> fails. I have attached a diff patch to fix this.

Thanks a lot! Yes, your explanation makes sense. I'll include the fix in the
next version. I think it might also explain the other crash [1] you reported
earlier. I'll try to reproduce that.
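
For readers following along, the ordering problem described above can be
sketched as a small standalone model. This is not the code from the patch:
the function bodies, the simulate() helper and the OID values are
assumptions made for illustration, and only the names
change_useless_for_repack, am_decoding_for_repack and
repacked_rel_locator.relNumber come from the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Stand-in for repacked_rel_locator.relNumber; unset by default. */
static Oid	repacked_relnumber = InvalidOid;

/* Simplified model: we only know we decode for REPACK once the target is set. */
static bool
am_decoding_for_repack(void)
{
	return repacked_relnumber != InvalidOid;
}

/* Simplified model: returns true if the change can be skipped. */
static bool
change_useless_for_repack(Oid change_relnumber)
{
	if (!am_decoding_for_repack())
		return false;			/* filter inactive: every change is queued */
	return change_relnumber != repacked_relnumber;
}

/* Count the changes that would reach the (modelled) reorder buffer queue. */
static int
count_queued(const Oid *changes, int nchanges)
{
	int			queued = 0;

	assert(changes != NULL);
	for (int i = 0; i < nchanges; i++)
	{
		if (!change_useless_for_repack(changes[i]))
			queued++;
	}
	return queued;
}

/*
 * Hypothetical driver: decode four changes, two of which belong to the
 * target relation. If set_target_first is false, the target is still unset
 * while the changes are decoded, as during DecodingContextFindStartpoint
 * in the report.
 */
static int
simulate(bool set_target_first)
{
	const Oid	target = 16384;	/* made-up value */
	const Oid	changes[] = {16384, 16390, 16384, 16401};
	int			queued;

	repacked_relnumber = set_target_first ? target : InvalidOid;
	queued = count_queued(changes, 4);
	repacked_relnumber = target;	/* set by now in either ordering */
	return queued;
}
```

In this model simulate(false) queues all four changes, including the two
for unrelated relations that would later trip the assertion in
plugin_change, while simulate(true) queues only the two changes for the
target relation.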

[1] 
https://www.postgresql.org/message-id/CAFC%2Bb6o2yzA80YmfEhmMO9puN8qvGRvr-15BBLn3UmJxPfpr2w%40mail.gmail.com

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

