This is an automated email from the ASF dual-hosted git repository.
vatamane pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/couchdb.git
The following commit(s) were added to refs/heads/main by this push:
new 5ed0654ab Use fdatasync for commits
5ed0654ab is described below
commit 5ed0654abd69748d769043ed677ca46c655db84c
Author: Nick Vatamaniuc <[email protected]>
AuthorDate: Mon Jan 13 15:59:25 2025 -0500
Use fdatasync for commits
We can use fdatasync to save one extra write per call, for a total of two
writes saved per commit, since we do two syncs: one for the data blocks up to
the header, then another after the header.
As of OTP 25 (our oldest supported version), file:datasync/1 uses:
* On Linux/BSDs: fdatasync()
* On Windows: FlushFileBuffers(), i.e. the same as for file:sync/1
* On macOS: fcntl(fd,F_FULLFSYNC/F_BARRIERFSYNC)
According to https://linux.die.net/man/2/fdatasync
> fdatasync() is similar to fsync(), but does not flush modified metadata
unless that metadata is needed in order to allow a subsequent data retrieval
to be correctly handled. For example, changes to st_atime or st_mtime
(respectively, time of last access and time of last modification; see
stat(2)) do not require flushing because they are not necessary for a
subsequent data read to be handled correctly. On the other hand, a change to
the file size (st_size, as made by say ftruncate(2)), would require a
metadata flush.
The key things for us are:
* It updates the size (positions) correctly
* We do not rely on or care about atime/mtime for safety or correctness
* The Erlang VM does the right thing on all the supported OSes
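
As an illustration only, the two-sync commit pattern described above might be
sketched as follows. This is not the couch_file implementation: the module
name datasync_sketch and the function commit/3 are hypothetical, and the
sketch assumes a simple append-only file. Only file:open/2, file:write/2,
file:datasync/1 and file:close/1 are real OTP calls.

```erlang
-module(datasync_sketch).
-export([commit/3]).

%% Hypothetical helper (not part of CouchDB): append Data, sync, then
%% append Header and sync again, using file:datasync/1 for both syncs.
commit(Path, Data, Header) ->
    {ok, Fd} = file:open(Path, [append, raw, binary]),
    try
        ok = file:write(Fd, Data),
        %% First sync: flush the data blocks written before the header.
        ok = file:datasync(Fd),
        ok = file:write(Fd, Header),
        %% Second sync: flush the header itself.
        ok = file:datasync(Fd)
    after
        file:close(Fd)
    end.
```

Each of the two calls is where file:sync/1 would previously have been used,
which is where the saving of one write per call would come from.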
---
src/couch/src/couch_file.erl | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/couch/src/couch_file.erl b/src/couch/src/couch_file.erl
index 8616e039b..8c7370688 100644
--- a/src/couch/src/couch_file.erl
+++ b/src/couch/src/couch_file.erl
@@ -599,7 +599,12 @@ format_status(_Opt, [PDict, #file{} = File]) ->
fsync(Fd) ->
T0 = erlang:monotonic_time(),
- Res = file:sync(Fd),
+ % We do not rely on mtime/atime for our safety/consistency so we can use
+ % fdatasync. As of version 25 OTP will use:
+ % - On Linux/BSDs: fdatasync()
+ % - On Windows: FlushFileBuffers() i.e. the same as for file:sync/1
+ % - On macOS: fcntl(fd,F_FULLFSYNC/F_BARRIERFSYNC)
+ Res = file:datasync(Fd),
T1 = erlang:monotonic_time(),
% Since histograms can consume floating point values we can measure in
% nanoseconds, then turn it into floating point milliseconds