Re: possible deadlock: different lock ordering for heap pages
On 1/31/19, 9:21 PM, "Amit Kapila" wrote: > BTW, do you have a reproducible test case for this fix? How did you catch this, browsing code? Yes, I observed it while reading code. I do not have a repro. > How do we pronounce your name, is Nishant Fnu okay? I would like to mention you as the author of the patch, so need to write your name. Yes, Nishant Fnu is okay. Thanks Nishant
Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
Hi, > To achieve this we can allocate Form_pg_class structures (for shared > relations… a small number) on shared memory. But why would this be necessary / a good idea? Even if we decided it were, it seems like it'd end up being quite invasive. But I doubt it's a good plan, because relcache entries want / need to be updated differently in the transaction that does the changes (as it needs to see the effect of catalog changes before commit) than other sessions (which only should see them after commit). It will be a good idea as, we can avoid maintaining file (creation/deletion/updation) for period of engine running and we do not need to invalidate other backend cache. For transaction(which change catalog), we can allocate a new private Form_pg_class structure and do operations on it and update shared structure post commit. Routine roughly will be- 1) A backend having a relcache entry for a shared rel where rd_rel part points to shared memory structure. Rel->rd_rel is storing a shared memory pointer. 2) Wants to update the entry...allocate a new private structure and memcpy the shared content to this new memory and point rd_rel to private memory Rel->rd_rel is storing a pointer to process specific structure. 3) Transaction committed and we copy the private memory content to shared memory area and point rd_rel again to shared memory. 4) free private memory. 5) Other backends do not do any invalidation but still get the latest updated values. Why is this good idea? Lets take an example (assuming we have 1000 postgres backends running). With shared memory scheme- Operation wise, for a transaction, we allocate/free once (private memory allocation) and memcpy data to and fro (from shared to private and back to shared)... Overall memory footprint 1 shared copy and 1 private only when updating. No file creation/deletion/updation. With current file scheme- Operation wise, for a transaction, we use private cache but we need to invalidate 1000 other caches( which will be atleast 1000 memcpy and allocate/free) and may involve reading back in page of pg_class. Overall memory footprint 1000 private copies. We have to create/delete/update init file and synchronize around it. Having said that we may not worry about transaction for updating all values...(I think relfrozenxid can be updated by a CAS operation ...still thinking on it). -Nishant
Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
Hi, We were working on this issue and thinking if we could actually make pg_class(rd_rel) part of recache entry upgradable. To achieve this we can allocate Form_pg_class structures (for shared relations… a small number) on shared memory. We do not need global pg_internal_init file as new backend during boot up will be set to point at already stored Form_pg_class structure. Thanks, Nishant On 5/27/18, 1:01 PM, "Andres Freund"wrote: Hi, On 2018-05-27 13:22:21 -0400, Tom Lane wrote: > But I don't think there's any need for special magic here: we just > have to accept the fact that there's a need to flush that cache > sometimes. In normal use it shouldn't happen often enough to be a > performance problem. Yea, it's not that problematic. We already remove the local init file. I started out trying to write a version of invalidation that also WAL logs shared inval, but that turns out to be hard to do without breaking compatibilty. So I think we should just always unlink the shared one - that seems to work well. > FWIW, I'm not on board with "memcpy the whole row". I think the right > thing is more like what we do in RelationReloadIndexInfo, ie copy over > the specific fields that we expect to be mutable. But that's what RelationReloadIndexInfo() etc do? relp = (Form_pg_class) GETSTRUCT(pg_class_tuple); memcpy(relation->rd_rel, relp, CLASS_TUPLE_SIZE); I don't think we need to modify anything outside of rd_rel at this point? I've a patch that seems to work, that mostly needs some comment polishing. Greetings, Andres Freund