Closed #187.
Functionality included in slightly different form as of commit
4087530f0fcbb14167be8296957e44e6ffc97579; thanks for your work on this, not to
mention patience.
Updated this PR to use a static variable.
I'm still fine with either solution, but @pmatilai and @megaumi @ffesti - can
someone weigh in, please?
Yes, fdatasync+fadvise_syncfs removes pages from system cache:
`
--> Fincore(0x61408e40) fdno 16 path
/opt/local/tmp/lib/modules/4.11.0-0.rc6.git2.1.fc27.x86_64/System.map;58f7761a
<-- Fincore(0x61408e40) fdno 16 rc 0
<-- fdClose(0x61408e40) rc 0| UFD -1 fp 0x61204e40
FDI
`
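(For illustration - a minimal sketch of the kind of page-cache residency check
the Fincore trace above is reporting, using mincore(2). The helper name
resident_pages is mine, not rpm's Fincore; error handling is abbreviated.)
`
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

static long resident_pages(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return 0;
    }

    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    long pagesize = sysconf(_SC_PAGESIZE);
    size_t npages = ((size_t)st.st_size + pagesize - 1) / pagesize;
    unsigned char *vec = calloc(npages, 1);

    long resident = 0;
    if (map != MAP_FAILED && vec != NULL && mincore(map, st.st_size, vec) == 0) {
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* bit 0: page is resident in core */
    }

    free(vec);
    if (map != MAP_FAILED)
        munmap(map, st.st_size);
    close(fd);
    return resident;
}
`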
@cgwalters: easy enough to wrap syncfs(2) into a measurement harness:
`
==> Fclose(0x61408e40) | LIBIO 0x61204fc0(-1) fdno -1 | UFD 16 fp
0x61204fc0
--> Fincore(0x61408e40) fdno 16 path
/opt/local/tmp/lib/modules/4.11.0-0.rc6.git2.1.fc27.x86_64/System.map;58f6dc66
<-- Fincor
`
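(A sketch of such a harness - timing a single syncfs(2) call with
CLOCK_MONOTONIC. Linux-specific, and not taken from any of the patches in this
thread; the fd only identifies which filesystem to flush.)
`
#define _GNU_SOURCE         /* for syncfs(2) */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    /* Any fd on the filesystem of interest will do. */
    int fd = open(argc > 1 ? argv[1] : ".", O_RDONLY);
    if (fd < 0)
        return 1;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int rc = syncfs(fd);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("syncfs rc=%d in %.6f secs\n", rc, secs);

    close(fd);
    return rc ? 1 : 0;
}
`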
@n3npq Regarding full-system durability - I'm the maintainer of
https://github.com/projectatomic/rpm-ostree/ which uses
https://github.com/ostreedev/ostree/ to implement full-system transactional
updates. Every change (package install, update) is atomic. libostree
currently [stages objects pe
@pmatilai - I'd like to get something merged sooner rather than later. There
are two options I see - (1) my patch (plus moving the parsing of the file into
some static variable) or (2) @n3npq's patch, but turning it off by default.
Those
are the two safe, clean, easy methods going forward.
Can yo
@pmatilai: if you *do* attempt to configure +/- fsync(2) somehow, then
durability is likely the better approach. By "durability", I mean "this file is
essential to booting" (and hence is worth paying the cost of fsync(2) to ensure
that the file is correctly written to disk).
Some simple patterns
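(For illustration - a minimal sketch of one common durable-write pattern: fsync
the file, then fsync its parent directory so the new directory entry is on disk
too. The helper write_durably and its error handling are mine, not from the
thread.)
`
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write buf to dir/name and make both the data and
 * the new directory entry durable before reporting success. */
static int write_durably(const char *dir, const char *name,
                         const void *buf, size_t len)
{
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;

    int rc = -1;
    int fd = openat(dfd, name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd >= 0) {
        if (write(fd, buf, len) == (ssize_t)len && fsync(fd) == 0)
            rc = 0;
        if (close(fd) != 0)        /* close(2) can report deferred errors */
            rc = -1;
    }
    if (rc == 0 && fsync(dfd) != 0)    /* make the directory entry durable */
        rc = -1;
    close(dfd);
    return rc;
}
`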
Wall clock time measurements for an RPM command (this is RPM5, but the I/O
trace is identical to rpm.org):
`sudo /usr/bin/time /opt/local/bin/rpm -Uvh --nodeps --noscripts --relocate
/=/opt/local/tmp /var/tmp/kernel-core-4.11.0-0.rc6.git2.1.fc27.x86_64.rpm`
BASE
2.26user 0.43system 0:03.90
@jaymzh But are your tests with a kernel that has the writeback changes, or
not? It looks like this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e34cbd307477ae07c5d8a8d0bd15e65a9ddaba5c
Which offhand appears to be in v4.10. Ah yes, [kernelnewbies has a
link
@cgwalters - Jens works here... yes, the problem description is related, but
while a system-wide solution can make this better, being able to
administratively choose things that can never ever fill the buffer is still
important.
Yes, with about 20 other similar articles and implementations, some even
accurate.
Isn't this https://lwn.net/Articles/682582/ ?
Grrr, 32 * 8KB = 256KB ...
I can post an updated patch if that helps merging. The amount of code involved
is quite small (and is perfectly obvious: move fdatasync out of the loop and
perform it before the advise()).
I doubt that better than fallocate -> Fwrite-in-loop -> fdatasync+fadvise+fsync
is possible.
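(As a sketch of that sequence - fallocate up front, write in a loop, one
fdatasync, then fadvise and fsync. The helper install_file and its parameters
are illustrative, not the PR's actual code.)
`
#include <fcntl.h>
#include <unistd.h>

/* fd is an already-open destination file; data/total describe the payload. */
static int install_file(int fd, const char *data, size_t total, size_t bufsz)
{
    if (posix_fallocate(fd, 0, (off_t)total) != 0)   /* avoid late ENOSPC */
        return -1;

    for (size_t off = 0; off < total; ) {            /* Fwrite-in-loop */
        size_t n = total - off < bufsz ? total - off : bufsz;
        ssize_t w = write(fd, data + off, n);
        if (w < 0)
            return -1;
        off += (size_t)w;
    }

    if (fdatasync(fd) != 0)                          /* once, after the loop */
        return -1;
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);  /* drop cached pages */
    return fsync(fd);                                /* metadata as well */
}
`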
The buffer is 32 * BUFSIZ (= 128Kb), not 32Kb. Sorry for the confusion.
In regards to your questions: an optimal implementation can come whenever; I'd
like to get something merged now-ish that I can opt in to on hosts that need
it. So changing your patch to default to off, and then leaving it at the sync
rate it's at, is fine. 1MB is probably nicer to the disk but 32k
One last note: doing fdatasync once, before fadvise+fsync, rather than after
every write, is ~10x faster:
`
FDIO: 30 writes, 3763601 total bytes in 0.003558 secs
FDIO: 1 allocs, 3763601 total bytes in 0.24 secs
FDIO: 1 dsyncs,0 total bytes in 0.074081 secs
FDIO: 1
`
Here are some measurements (on RPM5, but it's essentially the same
implementation).
The system is Fedora 25; the disk is 7200rpm rotating media, dual 4-core Xeons,
etc.
The System.map file in the current RawHide kernel-core file was used to measure
the time taken by the various system calls inv
If you are arguing for "opt-in" rather than "opt-out" by default when a macro
is not explicitly set, then change the "1" in this line:
`
_periodic_fsync = rpmExpandNumeric("%{?_periodic_fsync}%{!?_periodic_fsync:1}");
`
Otherwise set %_periodic_fsync as you wish.
There wasn't an
In our case ~ 1MB is what we want. This will never be a good idea for rotating
disks... and also most people will never need this.
A measurement of the impact is needed first; enabler/disabler configuration can
always be added.
In general, RPM does best with "opt-out", not "opt-in", policies. There may be
other changes, such as doing the fdatasync() less often, that would be useful
too.
OK, this actually seems to work quite well for the SSD case - you're
fdatasync()ing every 32kB, which is nice.
We'd not want it on by default - any place with a non-SSD is going to go poorly.
Thanks for the measurements.
So fsync on rotating DASD is ~9x slower for your fb-gcc-runtime test (IIUC).
I'm now investigating the effect of fallocate(2) to reserve disk blocks (and
prevent ENOSPC returns) on the code path we are iterating upon.
News at 11pm ;-)
OK, so some install time differences using this PR:
SPINNING DISK
fb-dba-core: 0m22.954s vs 0m27.326s
fb-gcc-runtime: 0m23.687s vs 3m42.792s
SSD
fb-dba-core: 0m22.642s vs 0m23.692s
fb-gcc-runtime: 0m20.857s vs 0m34.003s
As you can see the difference on SSD is trivial and well worth it. On spinni
Here is an updated patch, including the bare minimum changes, with a
%_periodic_fsync "opt-out" macro, for the "production"
rpmfiArchiveReadToFilePsm() to use fdatasync+posix_fadvise+fsync.
I've left all the debugging gook in the patch to document the --rpmiodebug
trace and to illustrate the minco
OK, here is one way to install a file without leaving the file in the cache.
The simplest explanation is to show the --rpmiodebug log (interspersed with
some debugging):
*** Fopen ufdio path /usr/share/doc/hello-2.0/README;58ea9807 fmode w+.ufdio
*** ufdOpen(/usr/share/doc/hello-2.0/README;58ea9
I knew if I wrote a patch, better answers would instantly appear.
There is fadvise(FADV_DONTNEED) w.r.t. fsync(2), analogous to
madvise(MADV_DONTNEED) w.r.t. msync(2). I will add it to my patch in the next
few days for measurement; todo++.
The easy answer for your "Chef installs packages and my databas
As promised, here is a proof-of-concept patch to permit using either
fwrite+fsync or mmap+madvise+msync while writing files onto the file system
from a *.rpm package.
The patch passes "make check" (with the current known failures) on tip.
There are various switches to enable/disable the added f
Thanks for the measurements! Could you also attempt to measure the effect on
RPM itself? Wall clock time is sufficient, as the time necessary to perform an
upgrade is what most RPM users are sensitive to.
Your graphs (presumably on SSD?) show smaller/periodic writes with fsync
enabled,
and larger/delayed
OK, so here are a few different takes on the numbers.
First, graphs. Graph 1 is with the macro on:
https://g.defau.lt/#gluSzlb39ScnZU4FS2AtrYUF31IYct
Graph 2 is with the macro off:
https://g.defau.lt/#6We3VgDfcXi7zkhMMzdesQLKD5iMUd
The colors are reads, read bytes, writes, write bytes. You can see the
@jaymzh: major invasive change to ... RPM? or something else?
I'll happily write the code for rpm.org RPM if you wish. I fail to see how
reverting to what RPM used to do (i.e. use mmap with MADV_DONTNEED and most
importantly for you using msync(MS_ASYNC) which schedules flush to disk like
fsync
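(A rough sketch of the mmap-based write path being described - map the
destination, copy the payload in, msync(MS_ASYNC) to schedule writeback, and
madvise(MADV_DONTNEED) to drop the pages. The name mmap_install and the error
handling are mine, not the historical rpm code.)
`
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static int mmap_install(const char *path, const void *data, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (len == 0 || ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return len == 0 ? 0 : -1;
    }

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, data, len);
    msync(map, len, MS_ASYNC);          /* schedule writeback, do not block */
    madvise(map, len, MADV_DONTNEED);   /* drop the pages from this mapping */
    munmap(map, len);
    return close(fd);
}
`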
To your other question: we're still trying to come up with numbers; the system
has many things going on, and showing something 100% attributable to this is
hard - I'm trying to see if we can get a more exact A/B test.
madvise() doesn't help ensure there's less dirty data in the system, which is
d
@n3npq as I called out a few times... the examples I found were all in the
build/ directory... maybe I'm just not seeing another good example.
Moving to mmap() and madvise() is a major invasive change... if someone wants
to take that on, awesome, but this is a simple solution that buys us a lot
@jaymzh - add a static variable to expand the macro as a "one shot" ... there
are many places in rpm that do that; it hurts little (until rpm can capture a
global configuration object).
Meanwhile I recommend the *_DONTNEED mmap implementation instead of fsync. RPM
used that for years, and *_DONTNEED
@pmatilai - I'm still looking for a good suggestion on where to read that macro
and store it in a static variable.
@pmatilai - I'm ok with fsync_periodically or some such, it just means that if
we ever want to change it, we have no way to do it nicely... we can't make a
new `%_fsync_on_mb 50`
jaymzh commented on this pull request.
> +int rc = 0, ec = 0;
+
+if (fd == NULL)
+ return -1;
+
+fd = fdLink(fd);
+for (FDSTACK_t fps = fd->fps; fps != NULL; fps = fps->prev) {
+ if (fps->fdno >= 0) {
+rc = fsync(fps->fdno);
+ if (ec == 0 && rc)
There is a modestly obscure means to attach *_DONTNEED (to achieve kernel
delayed writes and removing pages when I/O is complete) using the existing
rpm.org rpmfiArchiveReadToFilePsm() Fwrite->write(3) loop calling setvbuf(3)
immediately after Fopen on the fopencookie(3) FILE *.
(aside)
Persona
Hmmm ... glibc (which uses mmap(2) for stdio if requested) would benefit from
being able to mark the stdio buffer-in-use with *_DONTNEED using
madvise/posix_fadvise.
At a minimum, cvtfmode() in rpmio/rpmio.c would need to be extended to pass the
glibc-peculiar 'm' flag through to fopen(3)
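(For reference, at the fopen(3) level the glibc 'm' flag looks like the snippet
below; glibc documents it as honored only for streams opened for reading.
count_bytes is just a hypothetical caller, not rpm code.)
`
#include <stdio.h>

/* Hypothetical caller; the point is only the "m" in the mode string. */
static long count_bytes(const char *path)
{
    FILE *fp = fopen(path, "rm");   /* 'm': ask glibc to mmap for reading */
    if (fp == NULL)
        return -1;

    long n = 0;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
`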
For completeness (and humor ;-), there is also O_DIRECT. See
https://lkml.org/lkml/2007/1/10/233
Another issue is that certain file systems (NFS? DFS? I forget ...) also delay
I/O operations until close(2).
What I recall is that an application MUST check the return code of close(2) in
order to ensure that the I/O was actually performed: it isn't sufficient to
check the return code of write
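(A minimal illustration of that point - an error from deferred I/O may only
surface at close(2), so its return code has to be checked as well. The helper
careful_write is made up for this sketch.)
`
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int careful_write(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int rc = (write(fd, buf, len) == (ssize_t)len) ? 0 : -1;
    if (close(fd) != 0) {       /* deferred write errors can surface here */
        perror("close");
        rc = -1;
    }
    return rc;
}
`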
There is an alternative (but more complicated) lazy implementation that might
be attempted by doing an fsync in Fwrite (or fdWrite) when the accumulated I/O
passes a certain mark. H ...
A general solution to add a resource cap on all rpmio I/O could also be
attempted (but isn't easy). Berkel
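(A sketch of that lazy watermark idea - count bytes written per fd and
fdatasync once the running total crosses a threshold. The wrapper name and the
1MB figure are illustrative; this is not rpmio's fdWrite.)
`
#include <unistd.h>

#define FSYNC_WATERMARK (1024 * 1024)   /* illustrative 1MB between syncs */

/* Caller keeps *accumulated per fd (e.g. alongside the FD_t). */
static ssize_t write_with_periodic_sync(int fd, const void *buf, size_t len,
                                        size_t *accumulated)
{
    ssize_t w = write(fd, buf, len);
    if (w > 0) {
        *accumulated += (size_t)w;
        if (*accumulated >= FSYNC_WATERMARK) {
            (void) fdatasync(fd);
            *accumulated = 0;
        }
    }
    return w;
}
`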
@jaymzh - yes the implementation has significant implications, which is pretty
much my objection: you're not calling fsync on close to make data safer
somehow, you call it there just because it's a simple implementation of getting
rpm to call fsync more often in order to balance the IO load. So
One other nitpick: macro expansion is cheap but not free.
So rpmExpandNumeric("%{_force_fsync_on_close}") should be called once and
saved, not repeatedly.
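(A sketch of the "expand once and cache" pattern being asked for; the function
name and placement are illustrative, and the header name is assumed, not the
merged code.)
`
#include <rpm/rpmmacro.h>   /* rpmExpandNumeric(); header name assumed */

static int get_force_fsync_on_close(void)
{
    static int cached = -1;             /* -1: macro not expanded yet */
    if (cached < 0)
        cached = rpmExpandNumeric("%{?_force_fsync_on_close}");
    return cached;
}
`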
FYI: several notes (after looking at the patch):
1) The measurements I cited above are using mmap(2) to decompress directly into
pages (avoiding a buffer copy) marked with MADV_DONTNEED (which permits
delayed/larger writes). In addition, there is a check on
file size of <=
pmatilai commented on this pull request.
> +int rc = 0, ec = 0;
+
+if (fd == NULL)
+ return -1;
+
+fd = fdLink(fd);
+for (FDSTACK_t fps = fd->fps; fps != NULL; fps = fps->prev) {
+ if (fps->fdno >= 0) {
+rc = fsync(fps->fdno);
+ if (ec == 0 && rc
Good if it's done in fsm.h (I haven't looked).
Meanwhile, a 50x slower installation is intolerable for a package manager even
if opt-in.
Nor can an application like RPM be expected to detect whether the underlying
filesystem is on SSD or USB or rotational disk and adapt accordingly: there
simply a
@n3npq Uh, I am doing it in lib/fsm.h, not in Fclose(); I dunno what you are
talking about.
Yes, the fsync() is super expensive on rotational media; as mentioned, don't
use it in those cases.
As I mentioned, I don't have good numbers on the positive effects, only that
our databases stall far less
Do you have measurements of the effect of fsync()?
The last time sync/fdatasync were suggested for RPM that I am aware of was
~2011 in Mandriva. The issue comes up every few years this century ...
Here was the effect of adding fsync/fdatasync to RPM, measured on an otherwise
identical system with a
* added docs
* stopped destroying the FDs on sync
* changed the name of the macro to be prefixed with an `_`
jaymzh commented on this pull request.
> @@ -1116,6 +1116,28 @@ int Fseek(FD_t fd, off_t offset, int whence)
return rc;
}
+int Fsync(FD_t fd)
+{
+int rc = 0, ec = 0;
+
+if (fd == NULL)
+ return -1;
+
+fd = fdLink(fd);
+for (FDSTACK_t fps = fd->fps; fps != NULL; fps
As to the name, the implementation has significant implications, and so I think
the name should be kept as-is...
jaymzh commented on this pull request.
> @@ -231,6 +232,9 @@ static int expandRegular(rpmfi fi, const char *dest,
> rpmpsm psm, int nodigest, i
exit:
if (wfd) {
int myerrno = errno;
+if (rpmExpandNumeric("%{force_fsync_on_close}")) {
`oneshot` is processed in `processSour
I think the option is nice, and it's a long option, so it's not taking up any
limited space such as single-letter flags. It can help users test this and see
the
effect of installing a single RPM, or it could enable them to set it only for
longer-run transactions (yum upgrade) or specific larger RPMs or
pmatilai commented on this pull request.
> @@ -1116,6 +1116,28 @@ int Fseek(FD_t fd, off_t offset, int whence)
return rc;
}
+int Fsync(FD_t fd)
+{
+int rc = 0, ec = 0;
+
+if (fd == NULL)
+ return -1;
+
+fd = fdLink(fd);
+for (FDSTACK_t fps = fd->fps; fps != NULL; fp
pmatilai commented on this pull request.
> @@ -231,6 +232,9 @@ static int expandRegular(rpmfi fi, const char *dest,
> rpmpsm psm, int nodigest, i
exit:
if (wfd) {
int myerrno = errno;
+if (rpmExpandNumeric("%{force_fsync_on_close}")) {
You don't really want to invoke the
Actually I'm more on the side of questioning the need for a command line switch
in the first place. This is a highly special tweak that you'd permanently
enable and forget on those systems benefitting from it AIUI, and not something
you enable for every third transaction on the system.
Ah yes, good call @Conan-Kudo - will add that tomorrow.
@jaymzh Could you add information to `doc/rpm.8` about the command line option?