On Thu, Dec 28, 2023 at 4:02 AM Justin Pryzby <pry...@telsasoft.com> wrote:
> My main question is why an IO error would cause the DB to abort, rather
> than raising an ERROR.

In CommitTransaction() there is a stretch of code beginning s->state =
TRANS_COMMIT and ending s->state = TRANS_DEFAULT, from which we call out
to various subsystems' AtEOXact_XXX() functions. There is no way to roll
back in that state, so anything that throws ERROR from those routines is
going to get something much like $SUBJECT. Hmm, we'd know which exact
code path got that EIO from your smoldering core if we'd put an explicit
critical section there (if we're going to PANIC anyway, it might as well
not be from a different stack after longjmp()...).

I guess the large object usage isn't directly relevant (that module's
EOXact stuff seems to be finished before TRANS_COMMIT, but I don't know
that code well). Everything later is supposed to be about
closing/releasing/cleaning up, and for example smgrDoPendingDeletes()
reaches code with this relevant comment:

     * Note: smgr_unlink must treat deletion failure as a WARNING, not an
     * ERROR, because we've already decided to commit or abort the current
     * xact.

We don't really have a general ban on ereporting on system call failure,
though. We've just singled unlink() out. Only a few lines above that we
call DropRelationsAllBuffers(rels, nrels), and that calls smgrnblocks(),
and that might need to re-open() the relation file to do
lseek(SEEK_END), because PostgreSQL itself has no tracking of relation
size. Hard to say, but my best guess is that's where you might have got
your EIO, assuming you dropped the relation in this transaction?

> This is pg16 compiled at efa8f6064, runing under centos7. ZFS is 2.2.2,
> but the pool hasn't been upgraded to use the features new since 2.1.

I've been following recent ZFS stuff from a safe distance as a user.
AFAIK the extremely hard to hit bug fixed in that very recent release
didn't technically require the interesting new feature (namely block
cloning, though I think that helped people find the root cause after a
phase of false blame?).
Anyway, its symptom was some bogus zero bytes on read, not a spurious EIO.
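
To make the suspected failure mode a bit more concrete, here is a toy sketch in C. It is not PostgreSQL code: every toy_ name is made up for illustration, and the "escalation" function is only a crude model of the observation that an ERROR raised between TRANS_COMMIT and TRANS_DEFAULT cannot be handled by rolling back. The size lookup mimics the general open() plus lseek(SEEK_END) pattern, where either system call can fail (for example with EIO) and the caller has to decide how loudly to report it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical block size; 8192 matches PostgreSQL's default BLCKSZ. */
#define TOY_BLCKSZ 8192

/* Toy stand-ins for transaction states and report levels (assumptions). */
typedef enum { TOY_TRANS_INPROGRESS, TOY_TRANS_COMMIT } ToyTransState;
typedef enum { TOY_ERROR, TOY_PANIC } ToyLevel;

/*
 * Crude model: before the commit window a failure is an ordinary ERROR
 * that aborts the transaction; inside the window there is nothing left
 * to roll back to, so the toy treats the same failure as fatal.
 */
ToyLevel
toy_effective_level(ToyTransState state)
{
    return (state == TOY_TRANS_COMMIT) ? TOY_PANIC : TOY_ERROR;
}

/*
 * Return the number of whole blocks in the file at 'path', found by
 * opening it and seeking to the end.  Returns -1 with errno set if
 * open() or lseek() fails.
 */
long
toy_nblocks(const char *path)
{
    int     fd;
    int     saved_errno;
    off_t   end;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;              /* errno set by open() */

    end = lseek(fd, 0, SEEK_END);
    saved_errno = errno;        /* close() may overwrite errno */
    close(fd);

    if (end < 0)
    {
        errno = saved_errno;
        return -1;
    }
    return (long) (end / TOY_BLCKSZ);
}
```

In this toy, a caller that gets -1 from toy_nblocks() while in TOY_TRANS_COMMIT would report at toy_effective_level(TOY_TRANS_COMMIT), i.e. TOY_PANIC, which is the shape of the outcome described above.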