I wrote: > This patch looks fairly sane to me; I have a few small gripes about > coding style but that can be fixed while applying. Heikki, you were > concerned about the cycle-ID idea; do you have any objection to this > patch?
Actually, on second look I think the key idea here is Takahiro-san's introduction of a cancellation flag in the hashtable entries, to replace the cases where AbsorbFsyncRequests can try to delete entries. What that means is mdsync() doesn't need an outer retry loop at all: the periodic AbsorbFsyncRequests calls are not a hazard, and retry of FileSync failures can be handled as an inner loop on the single failing table entry. (We can make the failure counter a local variable, too, instead of needing space in every hashtable entry.) And with that change, it's no longer possible for an incoming stream of fsync requests to keep mdsync from terminating. It might fsync more than it really needs to, but it won't repeat itself, and it must reach the end of the hashtable eventually. So we don't actually need the cycle counter at all. It might be worth having the cycle counter anyway just to avoid doing "useless" fsync work. I'm not sure about this. If we have a cycle counter of say 32 bits, then it's theoretically possible for an fsync to fail 2^32 consecutive times and then be skipped on the next try, allowing a checkpoint to succeed that should not have. We can fix that with a few more lines of logic to detect a wrapped-around value, but is it worth the trouble? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster