Hi all,

I stumbled across https://ldpreload.com/blog/signalfd-is-useless and wondered how this squares with our use of fsync().

A quick glance at https://github.com/erlang/otp/blob/master/erts/emulator/drivers/unix/unix_efile.c shows that EINTR is handled in several places, but only in the read/write/sendfile functions, not around fsync. I also tried to trace the callers of efile_fsync() (and efile_fdatasync()), but I got lost pretty quickly in some dtrace macro indirections, so I don't know whether there is any retry logic higher up.

I'm not experienced enough here to make a call, but does this mean there is a possible scenario where EINTR interrupts an fsync call, after which a crash (machine or CouchDB) leaves part of a database not fsynced? Or would the failing fsync bubble up to the corresponding, say, PUT request handler? And with delayed_commits=true, is the possible data-loss window then 2 seconds rather than the documented 1s?

Can anyone shed any light on this?

Best
Jan
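
P.S. To make concrete what I mean by "retry logic": below is a minimal sketch of the kind of loop I was looking for around fsync. The name fsync_retry is hypothetical and this is not code from unix_efile.c. It only assumes that fsync() can fail with errno set to EINTR (POSIX lists that error), and I'm not sure a plain retry is sufficient on every platform.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper, NOT part of unix_efile.c.
 * Calls fsync() again for as long as it fails with EINTR.
 * Any other failure is returned to the caller with errno left intact. */
int fsync_retry(int fd)
{
    int res;

    do {
        res = fsync(fd);
    } while (res < 0 && errno == EINTR);

    return res; /* 0 on success, -1 on a non-EINTR error */
}
```

With something like this, a signal arriving mid-fsync would only delay the sync, and only real errors would reach the caller.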
