Fix relid-set clobber during join removal.

Commit cfcd57111 et al fell over under Valgrind testing. (It seems to be enough to #define USE_VALGRIND, you don't actually need to run it under Valgrind to see failures.) The cause is that remove_rel_from_eclass updates each EquivalenceMember's em_relids, and those can be aliases of the left_relids or right_relids of some RestrictInfo in ec_sources. If the update made em_relids empty then bms_del_member will have pfree'd the relid set, so that the subsequent attempt to clean up ec_sources accesses already-freed memory.
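
To make the aliasing hazard concrete, here is a minimal standalone sketch of the copy-before-modify pattern. It does not use PostgreSQL's real Bitmapset code: RelidSet, ToySource, ToyMember, toy_remove_rel and the relids_* helpers are hypothetical stand-ins invented for illustration, with relids_del_member imitating the free-when-empty behavior described above. Whether the committed two-line change takes exactly this form is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a Bitmapset (hypothetical): relids 0..31 in one word. */
typedef struct RelidSet { unsigned int bits; } RelidSet;

static RelidSet *relids_make(unsigned int bits)
{
    RelidSet *s = malloc(sizeof(RelidSet));
    assert(s != NULL);
    s->bits = bits;
    return s;
}

static RelidSet *relids_copy(const RelidSet *s)
{
    return s ? relids_make(s->bits) : NULL;
}

/*
 * Imitates the behavior described above: modifies its input in place,
 * and frees it (returning NULL) if the set becomes empty.
 */
static RelidSet *relids_del_member(RelidSet *s, int relid)
{
    if (s == NULL)
        return NULL;
    s->bits &= ~(1u << relid);
    if (s->bits == 0)
    {
        free(s);
        return NULL;
    }
    return s;
}

static bool relids_is_member(const RelidSet *s, int relid)
{
    return s != NULL && (s->bits & (1u << relid)) != 0;
}

/* Hypothetical miniatures of RestrictInfo and EquivalenceMember. */
typedef struct ToySource { RelidSet *left_relids; } ToySource;
typedef struct ToyMember { RelidSet *em_relids; } ToyMember;

/* Copy first, so the in-place delete cannot free a set someone else holds. */
static void toy_remove_rel(ToyMember *em, int relid)
{
    em->em_relids = relids_copy(em->em_relids);
    em->em_relids = relids_del_member(em->em_relids, relid);
}

/* Returns 1 if the source's relid set is still valid after the removal. */
static int demo_copy_before_modify(void)
{
    ToySource src = { relids_make(1u << 3) };
    ToyMember em = { src.left_relids };    /* em_relids aliases left_relids */
    int ok;

    toy_remove_rel(&em, 3);                /* member's set becomes empty */
    ok = (em.em_relids == NULL) && relids_is_member(src.left_relids, 3);
    free(src.left_relids);
    return ok;
}
```

Without the relids_copy call in toy_remove_rel, the delete would free the very set that src.left_relids still points to, and any later look at src.left_relids would read freed memory, which is the failure mode described above.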

We missed seeing ill effects before cfcd57111 because (a) if the pfree happens then we will remove the EquivalenceMember altogether, making the source RestrictInfo no longer of use, and (b) the cleanup of ec_sources didn't touch left/right_relids before that.

I'm unclear though on how cfcd57111 managed to pass non-USE_VALGRIND testing. Apparently we managed to store another Bitmapset into the freed space before trying to access it, but you'd not think that would happen 100% of the time. I think what USE_VALGRIND changes is that it makes list.c much more memory-hungry, so that the freed space gets claimed by some List node before a Bitmapset can be put there.

This failure can be seen in v16, v17, and master, but oddly enough not v18. That's because the SJE patch replaced the simple bms_del_members calls used here with adjust_relid_set, which is careful not to scribble on its input. But commit 20efbdffe just recently put back the old coding and thus resurrected the problem.

Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 16, 17, master

Branch
------
REL_16_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/798dabe8388764a8a9979f5c91237f807cd09188

Modified Files
--------------
src/backend/optimizer/plan/analyzejoins.c | 2 ++
1 file changed, 2 insertions(+)
