On Wed, May 16, 2018 at 4:42 PM, Arthur Zakirov <a.zaki...@postgrespro.ru> wrote: > I haven't deep knowledge about guts of invalidation callbacks. It seems > that there is problem with it. Tom pointed above: > >> > I'm not sure that I understood the second case correclty. Can cache >> > invalidation help in this case? I don't have confident knowledge of cache >> > invalidation. It seems to me that InvalidateTSCacheCallBack() should >> > release segment after commit. >> >> "Release after commit" sounds like a pretty dangerous design to me, >> because a release necessarily implies some kernel calls, which could >> fail. We can't afford to inject steps that might fail into post-commit >> cleanup (because it's too late to recover by failing the transaction). >> It'd be better to do cleanup while searching for a dictionary to use. > > But it is possible that I misunderstood his note.
I think you and Tom have misunderstood each other somehow. If you look at CommitTransaction(), you will see a comment that says: * This is all post-commit cleanup. Note that if an error is raised here, * it's too late to abort the transaction. This should be just * noncritical resource releasing. Between that point and the end of that function, we shouldn't do anything that throws an error, because the transaction is already committed and it's too late to change our mind. But if session A drops an object, session B is not going to get a callback to InvalidateTSCacheCallBack at that point. It's going to happen sometime in the middle of the transaction, like when it next tries to lock a relation or something. So Tom's complaint is irrelevant in that scenario. Also, there is no absolute prohibition on kernel calls in post-commit cleanup, or in no-fail code in general. For example, the RESOURCE_RELEASE_AFTER_LOCKS phase of resowner cleanup calls FileClose(). That's actually completely alarming when you really think about it, because one of the documented return values for close() is EIO, which certainly represents a very dangerous kind of failure -- see nearby threads about fsync-safety. Transaction abort acquires and releases numerous LWLocks, which can result in kernel calls that could fail. We're OK with that because, in practice, it never happens. Unmapping a DSM segment is probably about as safe as acquiring and releasing an LWLock, maybe safer. On my MacBook, the only documented return value for munmap is EINVAL, and any such error would indicate a PostgreSQL bug (or a kernel bug, or a cosmic ray hit). I checked a Linux system; things there are less clear, because mmap and mumap share a single man page, and mmap can fail for all kinds of reasons. But very few of the listed error codes look like things that could legitimately happen during munmap. Also, if munmap did fail (or shmdt/shmctl if using System V shared memory), it would be reported as a WARNING, not an ERROR, so we'd still be sorta OK. I think the only real question here is whether it's safe, at a high level, to drop the object at time T0 and have various backends drop the mapping at unpredictable later times T1, T2, ... all greater than T0. Generally, one wants to remove all references to an object before the object itself, which in this case we can't. Assuming that we can convince ourselves that that much is OK, I don't see why using a syscache callback to help ensure that the mappings are blown away in an at-least-somewhat-timely fashion is worse than any other approach. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company