Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
* added docs
* stopped destroying the FDs on sync
* changed the name of the macro to be prefixed with an `_`

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-291592668
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
@n3npq Uh, I'm doing it in lib/fsm.h, not in Fclose(); I don't know what you're talking about. Yes, fsync() is super expensive on rotational media; as mentioned, don't use it in those cases. As I mentioned, I don't have good numbers on the positive effects, only that our databases stall far less often with this on.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-291636330
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
jaymzh commented on this pull request.

> @@ -231,6 +232,9 @@ static int expandRegular(rpmfi fi, const char *dest,
>     rpmpsm psm, int nodigest, i
> exit:
>     if (wfd) {
>         int myerrno = errno;
> +       if (rpmExpandNumeric("%{force_fsync_on_close}")) {

`oneshot` is processed in `processSourceFiles`, which won't apply here... as I imagine everything in build/ is... Where would be an appropriate place to check for this and set it once?

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#discussion_r109741819
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
jaymzh commented on this pull request.

> @@ -1116,6 +1116,28 @@ int Fseek(FD_t fd, off_t offset, int whence)
>     return rc;
> }
>
> +int Fsync(FD_t fd)
> +{
> +    int rc = 0, ec = 0;
> +
> +    if (fd == NULL)
> +        return -1;
> +
> +    fd = fdLink(fd);
> +    for (FDSTACK_t fps = fd->fps; fps != NULL; fps = fdPop(fd)) {

Oh, good call. Copy/pasta from Fclose() :)

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#discussion_r109741866
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
As to the name, the implementation has significant implications, and so I think the name should be kept as-is...

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-291590349
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
Ah yes, good call @Conan-Kudo - will add that tomorrow.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-291410528
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
@n3npq as I called out a few times... the examples I found were all in the build/ directory... maybe I'm just not seeing another good example. Moving to mmap() and madvise() is a major, invasive change... if someone wants to take that on, awesome, but this is a simple solution that buys us a lot in the interim.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-292251567
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
To your other question: we're still trying to come up with numbers. The system has many things going on, and showing something 100% attributable to this is hard - I'm trying to see if we can get a more exact A/B test. madvise() doesn't help ensure there's less dirty data in the system, which is different from what you are describing. If the write queue becomes too long, then anything else that needs to write now has to wait... which doesn't work well with a database load. Now MySQL syncs and has to wait for a half-gig of data to be flushed. That's a problem.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-292278640
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
@pmatilai - I'm still looking for a good suggestion on where to read that macro and store it in a static variable.

@pmatilai - I'm OK with fsync_periodically or some such; it just means that if we ever want to change it, we have no way to do so nicely... we can't make a new `%_fsync_on_mb 50` with some sort of nice transition where people can move from one to the other. That's why I liked fsync_on_close: it's clear what it does, and when we want to implement something else we can. But I don't feel strongly; if you want it to be _fsync_aggressively or _fsync_periodically or something, let me know. However, it *should* have something about fsync in it so it's clear people don't want to do this on rotational media.

@n3npq - Yes, the slowdown is intolerable on rotational media. This is why it's off by default and an admin needs to make a smart choice. As with mount options, there are destructive options that are generally terrible, but we provide them for admins who need them. Like turning off ordered mode in ext4. :)

@n3npq - As you correctly characterized, we're just slowing down RPM... giving more even access to IO... but to be clear, we're not *just* slowing it down. There are other implications as well. For example, we have some machines with 18 or 24 disks attached, and installing a big RPM blows away the disk cache and causes *massive* IO load as the kernel sucks it all back in while the app continues doing its work. By limiting the amount in the buffer cache, we don't evict the disk cache. We can always afford the buffer cache of a *single* file, but when Chef comes along and installs 25 RPMs with varying sizes and numbers of files, we shoot ourselves in the foot. This is a different case from filling the IO buffers on our database systems, where MySQL comes along and there's a huge queue of stuff to be flushed, and we cause stalls (the original incentive for this work).

We actually played a lot with sync-on-write, sync-on-transaction, and sync-on-close, and it turns out sync-on-close works incredibly well for a variety of different cases... provided you're not on rotational media. I'm aware we could do more complex stuff... but it's a pretty hairy code base, and a simple solution that has been shown to provide significant benefit seems like an easy win. Which is why I sat down with @pmatilai and others at DevConf.cz to discuss this work and confirm that the approach would be acceptable before writing the patch.

@n3npq - Yes, @pmatilai also pointed out I shouldn't read the macros repeatedly and said I should put the value in a static variable; I just haven't found a good place to do that, per above - still looking for a suggestion.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-291959681
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
jaymzh commented on this pull request.

> +    int rc = 0, ec = 0;
> +
> +    if (fd == NULL)
> +        return -1;
> +
> +    fd = fdLink(fd);
> +    for (FDSTACK_t fps = fd->fps; fps != NULL; fps = fps->prev) {
> +        if (fps->fdno >= 0) {
> +            rc = fsync(fps->fdno);
> +            if (ec == 0 && rc)
> +                ec = rc;
> +        }
> +
> +        /* Leave freeing the last one after stats */
> +        if (fps->prev == NULL)
> +            break;

Ah. Whoops, cleaned both of those up. Thanks.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#discussion_r109998465
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
@cgwalters - Jens works here... yes, the problem description is related, but while a system-wide solution can make this better, being able to administratively choose things that can never fill the buffer is still important.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-294250985
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
In our case ~1MB is what we want. This will never be a good idea for rotating disks... and also most people will never need this.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-294064508
[Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
This patch adds a new macro that has rpm call fsync() before each close(). The feature is off by default since most people will not want this, and is turned on either with a macro or a command-line flag. There are a few reasons for wanting this:

1. To ensure that an RPM installation never fills IO queues so much that the production workload has to wait for them to be flushed when calling sync() (think of databases).
2. For systems with very large numbers of disks, to limit the amount of disk cache an RPM installation can evict, limiting IO thrash.

At Facebook, for the described workloads, we've found significant improvement in our IO load when turning on this feature. Exact numbers are a bit hard to quantify, but we are working on them. We see fewer threads of production databases stacking up when yum/rpm/dnf are running, for example. One word of caution: this will give good results on SSDs, but using this on rotational disks won't make you happy.

You can view, comment on, or merge this pull request online at:
https://github.com/rpm-software-management/rpm/pull/187

-- Commit Summary --

* Add macro to force fsync() on close()

-- File Changes --

M lib/fsm.c (4)
M rpmio/rpmio.c (22)
M rpmio/rpmio.h (7)
M rpmpopt.in (2)

-- Patch Links --

https://github.com/rpm-software-management/rpm/pull/187.patch
https://github.com/rpm-software-management/rpm/pull/187.diff

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
OK, so here are a few different takes on numbers. First, graphs. Graph one is with the macro on: https://g.defau.lt/#gluSzlb39ScnZU4FS2AtrYUF31IYct Graph two is with the macro off: https://g.defau.lt/#6We3VgDfcXi7zkhMMzdesQLKD5iMUd The colors are reads, read bytes, writes, write bytes. You can see the load spreads out as expected.

We also managed to do a synthetic test, which was interesting and showed that for our dominant use case of RPMs with lots of small files this is a big win. This is the amount of time the storage system was not doing work:

# fb-dba-core:

## With macro - many smaller files, one huge file

2017-04-06 16:26:57.722013873 -0700 PDT 5.73155ms 104.968µs 6
2017-04-06 16:26:58.722147493 -0700 PDT 5.252973ms 117.358µs 5
2017-04-06 16:26:59.722343955 -0700 PDT 7.422587ms 139.914µs 8
2017-04-06 16:27:00.72254866 -0700 PDT 6.109201ms 122.32µs 6
2017-04-06 16:27:01.722734509 -0700 PDT 7.40091ms 155.112µs 8
2017-04-06 16:27:02.722875662 -0700 PDT 7.794604ms 148.676µs 8
2017-04-06 16:27:12.723782443 -0700 PDT 208.33143ms 25.153745ms 202
2017-04-06 16:27:13.723915361 -0700 PDT 5.998775ms 133.564µs 7
2017-04-06 16:27:14.724062693 -0700 PDT 5.89304ms 132.267µs 7
2017-04-06 16:27:15.724212253 -0700 PDT 6.98323ms 117.4µs 8
2017-04-06 16:27:16.724324089 -0700 PDT 5.402875ms 132.129µs 5
2017-04-06 16:27:17.724437579 -0700 PDT 5.926757ms 127.274µs 7
2017-04-06 16:27:18.724621554 -0700 PDT 5.837904ms 121.583µs 6
2017-04-06 16:27:19.724740404 -0700 PDT 5.080387ms 115.304µs 5
2017-04-06 16:27:20.724855913 -0700 PDT 5.573811ms 131.026µs 6
2017-04-06 16:27:21.725034955 -0700 PDT 3.856319ms 102.18µs 4
2017-04-06 16:27:22.725188062 -0700 PDT 4.405144ms 111.026µs 5

## Without macro

2017-04-06 16:27:55.728507697 -0700 PDT 27.318392ms 885.188µs 26
2017-04-06 16:28:02.729271342 -0700 PDT 7.49606ms 116.632µs 8
2017-04-06 16:28:15.730604087 -0700 PDT 84.956988ms 10.598699ms 83
2017-04-06 16:28:20.73111897 -0700 PDT 48.069056ms 457.543µs 53
2017-04-06 16:28:21.731312738 -0700 PDT 65.498909ms 2.925142ms 63

# fb-gcc-5-glibc-2.23-runtime-6-1.x86_64.rpm, some 100MB files

## With macro

2017-04-06 16:33:41.765007508 -0700 PDT 1.320882ms 88.294µs 3
2017-04-06 16:33:42.765164056 -0700 PDT 13.984165ms 279.486µs 13
2017-04-06 16:33:43.765326558 -0700 PDT 1.890576ms 80.318µs 2
2017-04-06 16:33:46.765670291 -0700 PDT 39.313771ms 1.022946ms 38
2017-04-06 16:33:47.765840591 -0700 PDT 3.801104ms 94.404µs 5
2017-04-06 16:33:51.766274676 -0700 PDT 67.966698ms 2.87646ms 67
2017-04-06 16:33:52.766424373 -0700 PDT 1.996679ms 90.686µs 2
2017-04-06 16:33:56.766825593 -0700 PDT 101.579654ms 5.967763ms 98
2017-04-06 16:34:02.767494189 -0700 PDT 19.626819ms 332.579µs 20
2017-04-06 16:34:03.767628848 -0700 PDT 4.940615ms 116.663µs 5
2017-04-06 16:34:04.76287 -0700 PDT 2.480112ms 97.978µs 4
2017-04-06 16:34:05.7679422 -0700 PDT 1.551916ms 62.483µs 2
2017-04-06 16:34:06.768067462 -0700 PDT 1.756435ms 68.526µs 2
2017-04-06 16:34:07.768176248 -0700 PDT 1.319394ms 87.179µs 2
2017-04-06 16:33:02.761068223 -0700 PDT 6.764001ms 99.465µs 7

## Without macro

2017-04-06 16:34:59.773285651 -0700 PDT 1.695824ms 87.892µs 2
2017-04-06 16:35:02.773564104 -0700 PDT 5.68678ms 105.131µs 6
2017-04-06 16:35:12.774532746 -0700 PDT 65.836492ms 1.853472ms 70
2017-04-06 16:35:13.774667005 -0700 PDT 82.757401ms 8.41281ms 80
2017-04-06 16:35:18.77511384 -0700 PDT 73.289284ms 3.436747ms 71
2017-04-06 16:35:24.775796768 -0700 PDT 1.199151ms 85.286µs 2
2017-04-06 16:35:26.77602982 -0700 PDT 1.203794ms 105.355µs 2
2017-04-06 16:35:43.777684277 -0700 PDT 51.357162ms 2.053203ms 50

So large enough files can still cause problems; it's not perfect. madvise() isn't quite the same here... though certainly keeping track and syncing after a certain number of bytes would be even better. But I think this shows this diff is a good start.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-292681920
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
@pmatilai - I'd like to get something merged sooner rather than later. There are two options I see - (1) my patch (plus moving the parsing of the file into some static variable) or (2) @n3npq's patch, but turning it off by default. Those are the two safe, clean, easy methods going forward. Can you weigh in here, so we can move forward?

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-294570200
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
In regards to your questions: an optimal implementation can come whenever; I'd like to get something merged now-ish that I can opt into on hosts that need it. So changing your patch to default to off, and then leaving it at the sync rate it's at, is fine. 1MB is probably nicer to the disk, but 32k is totally fine.

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-294217509
Re: [Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)
Updated this PR to use a static variable. I'm still fine with either solution, but @pmatilai, @megaumi, @ffesti - can someone weigh in, please?

View on GitHub:
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-305674583
Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)
On a related note... our fix-the-rpm-db program has been open-sourced: https://github.com/facebookincubator/dcrpm We run it every 15 minutes as a pre-script to configuration management.

View on GitHub:
https://github.com/rpm-software-management/rpm/issues/232#issuecomment-365214211
Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)
Note that we don't run db_recover unless we detect an issue. dcrpm works by trying to detect common issues and apply the gentlest possible recovery - from db_recover, to finding locks held by bad actors, to other things.

View on GitHub:
https://github.com/rpm-software-management/rpm/issues/232#issuecomment-365508327
Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)
Hence this Issue was filed. :)

View on GitHub:
https://github.com/rpm-software-management/rpm/issues/232#issuecomment-365676303
Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)
I'd be very surprised if this wasn't a cause of many of our problems on Linux too. I suspect a common case is someone querying the DB (`rpm -qa`) while a transaction happens, and then the theoretically read-only operation ends up writing stale data back, but I haven't actually had the time to sit down and try to pinpoint it.

View on GitHub:
https://github.com/rpm-software-management/rpm/issues/232#issuecomment-365824199
Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)
@jianwei1216 Are the processes running as root? If so, they mmap() the DB read-write and write it back out when they are done... hence, if something changes at the same time, one of those processes can write back old pages.

View on GitHub:
https://github.com/rpm-software-management/rpm/issues/232#issuecomment-557176087
Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)
You could instead run it as non-root: `su -c "rpm -q " nobody` (or possibly `su -c "rpm -q " -s /bin/bash nobody` if nobody's shell is not a real shell). Or you can make sure [dcrpm](https://github.com/facebookincubator/dcrpm) runs on a regular basis to detect and correct your RPM db... but certainly preventing the issues is better.

View on GitHub:
https://github.com/rpm-software-management/rpm/issues/232#issuecomment-557379740