On 11/01/16 21:59, Francois wrote:
> Hi Coreutils
>
> today we can already give the oflag "nocache" to dd in the hope that
> POSIX_FADV_DONTNEED will have the kernel discard the cache. In today's
> implementation each time dd writes a buffer, it calls fadvise for *this
> buffer only*. Unless sync or dsync is used as well, this is not very
> efficient and we could improve it by advising the *whole file* instead.
>
> In practice, the memory written by dd is placed in writeback for a
> kernel thread to write it asynchronously. The kernel cannot
> really discard the memory. This thread
> https://github.com/jborg/attic/issues/252 suggests that
>
> | It is indeed the case that FADV_DONTNEED will purge the file from the
> | cache immediately if it is not dirty (and will do nothing if it is
> | dirty).
>
> Even more in practice, I am experiencing slowdowns on my system, the
> system becomes slow and unresponsive when copying big files. **Even with
> the nocache** option, the system is barely usable. Xorg goes into
> uninterruptible sleep doing page faults.
>
> I wanted to benchmark the effectiveness of the nocache parameter on my
> 700MB RAM + 700MB swap virtual machine, running dd to overwrite a 2GB file
> on my ext4 file system mounted with data=writeback, kernel
> 4.3.0.5-generic from Ubuntu:
>
> /usr/bin/time -v /bin/dd if=/dev/zero of=out conv=notrunc bs=4096 \
>     count=600000 iflag=nocache oflag=nocache
>
> I compared the performance of two versions of dd:
> - dd as delivered in coreutils 8.23-4ubuntu
> - a patched dd: coreutils 8.23 plus a patch in this fashion:
>
> =======
> diff --git a/src/dd.c b/src/dd.c
> index d5d01f3..7bed44b 100644
> --- a/src/dd.c
> +++ b/src/dd.c
> @@ -1062,7 +1062,7 @@ invalidate_cache (int fd, off_t len)
>        if (0 <= output_offset)
>          {
>  #if HAVE_POSIX_FADVISE
> -          adv_ret = posix_fadvise (fd, output_offset, clen,
> +          adv_ret = posix_fadvise (fd, 0, output_offset + clen,
>                                     POSIX_FADV_DONTNEED);
>  #else
>            errno = ENOTSUP;
>
> =======
>
> This ensures that the advice runs on the full file already written,
> including those dirty pages that were previously written but maybe
> not yet processed by the journaling/writeback thread at the time.
>
> I'm using perf to count page faults for the whole system and for Xorg,
> time -v dd to count context switches and report system time, and vmstat
> to count swapping. I'm testing in this fashion:
>
> - boot the vm
> - start firefox and run "10 hours of nyancat"
> - run a script that lasts one minute exactly. It starts dd and waits for
> the minute to complete, then does that again 14 times. The script
> completes in 15 minutes exactly, and 15 dd have run.
> - halt the vm
>
> I do that 5 times:
> - running dd without options (dd)
> - dd oflag=sync (dds)
> - dd oflag=sync,nocache (ddsn)
> - dd oflag=nocache (ddn)
> - dd oflag=nocache with the patch (pddn)
>
> I have added the numbers for the 15 runs below
>
>                           dd     dds    ddsn     ddn    pddn
> system major-faults     2118    3402     252    2766      85
> Xorg major-faults        413     426       1     826       0
> dd cs                 544957  289999  226657  451857  370731
> dd system time (s)      27.1    32.4    32.1    39.5   100.3
>
> vmstat also gives interesting results. I put the median value in this
> table:
>
>        swpd  free    buf   cach  -swap-  bi    bo   in   cs us sys id wa st
> dd    56548 12804   9848 340400   7   8  58 40084  691 1976  6   9 65 20  0
> ddn   57140 12000   9728 308360  12   9  46 40056  742 1856  6   9 65 21  0
> dds   45736  8984  16188 348996  15  12  88 40680  985 2770  6   5 61 27  0
> ddsn  13892 12884 116124 146096   0   4   6 40652 1045 2778  7   6 62 25  0
> pddn   3996 83716   9692 221776   0   0   1 40035  744 1763  6  12 70 12  0
>
> What I conclude:
> - that dd oflag=sync,nocache might be what you want if you want to
> minimize swapping today. This should give you the best user experience
> if you're running graphical applications at the same time.
> I think we could add this information somewhere in the man page?

We doc the fdatasync combo at http://www.gnu.org/software/coreutils/dd
but that is for the whole file.
Using oflag=sync to sync as the file is processed is a good suggestion.
I'll add that. See attached.

> - that dd oflag=nocache does not "discard the cache"
> - that there might be an issue with how Xorg is mlocking or not its
> memory pages, and that Linux distros swap defaults might be dangerous
> - and maybe that we can make better use of posix_fadvise. What would you
> think of an oflag=nofilecache flag option?

What do you mean for that to do exactly?
If only to imply oflag=sync, then I'd prefer to handle
this with just documentation as per the attached.
Syncing should be a very explicit thing and carefully considered.

cheers,
Pádraig.

From d10b6e6964e63ef94252550512ce4b72a53db1f9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]>
Date: Mon, 11 Jan 2016 22:31:45 +0000
Subject: [PATCH] doc: suggest "sync" flag to maximize "nocache" effectiveness

* doc/coreutils.texi (dd invocation): Add oflag=sync to the
streaming example.  Also reference the "direct" flag.
Mention this is only a request to the system.
* src/dd.c (usage): Mention the "sync" flag along with "nocache".
Also mention that it's only a request to drop the cache.
* THANKS.in: Add reporter Francois Rigault.
---
 THANKS.in          |  1 +
 doc/coreutils.texi | 17 ++++++++++++-----
 src/dd.c           |  3 ++-
 3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/THANKS.in b/THANKS.in
index 5c49006..4593f6c 100644
--- a/THANKS.in
+++ b/THANKS.in
@@ -207,6 +207,7 @@ Florian Schlichting [email protected]
 Florin Iucha [email protected]
 Francesco Montorsi [email protected]
 François Pinard [email protected]
+François Rigault [email protected]
 Frank Adler [email protected]
 Frank T Lofaro [email protected]
 Fred Fish [email protected]
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 2538062..b0d66e3 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -8660,12 +8660,18 @@ Use synchronized I/O for both data and metadata.
 @item nocache
 @opindex nocache
 @cindex discarding file cache
-Discard the data cache for a file.
-When count=0 all cache is discarded,
+Request to discard the system data cache for a file.
+When count=0 all cached data for the file is specified,
 otherwise the cache is dropped for the processed
-portion of the file.  Also when count=0
+portion of the file.  Also when count=0,
 failure to discard the cache is diagnosed
 and reflected in the exit status.
+
+Note data that is not already persisted to storage will not
+be discarded from cache, so note the use of the ``sync'' options
+in the examples below, which are used to maximize the
+effectiveness of the @samp{nocache} flag.
+
 Here are some usage examples:
 
 @example
@@ -8678,8 +8684,9 @@ dd of=ofile oflag=nocache conv=notrunc,fdatasync count=0
 # Drop cache for part of file
 dd if=ifile iflag=nocache skip=10 count=10 of=/dev/null
 
-# Stream data using just the read-ahead cache
-dd if=ifile of=ofile iflag=nocache oflag=nocache
+# Stream data using just the read-ahead cache.
+# See also the @samp{direct} flag.
+dd if=ifile of=ofile iflag=nocache oflag=nocache,sync
 @end example
 
 @item nonblock
diff --git a/src/dd.c b/src/dd.c
index d5d01f3..440950a 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -632,7 +632,8 @@ Each FLAG symbol may be:\n\
     fputs (_("  noatime   do not update access time\n"), stdout);
 #if HAVE_POSIX_FADVISE
   if (O_NOCACHE)
-    fputs (_("  nocache   discard cached data\n"), stdout);
+    fputs (_("  nocache   Request to drop cache.  See also oflag=sync\n"),
+           stdout);
 #endif
   if (O_NOCTTY)
     fputs (_("  noctty    do not assign controlling terminal from file\n"),
-- 
2.5.0
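
For readers who want to see the point of the documentation change in code:
the texinfo note above says data that is not yet persisted will not be
discarded, so a caller has to sync before (or while) advising. The following
is a minimal C sketch of that ordering, not code taken from dd.c. The helper
names sync_then_drop_cache and write_and_drop are hypothetical, and the
second helper assumes the descriptor was opened with O_SYNC, which is roughly
what oflag=sync requests.

```c
/* Illustrative sketch only; not code from coreutils.  */
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write back FD's dirty data, then request that
   the kernel drop all cached pages for the file.
   Return 0 on success, otherwise an error number.  */
int
sync_then_drop_cache (int fd)
{
  /* POSIX_FADV_DONTNEED leaves dirty pages alone, so flush first.  */
  if (fdatasync (fd) != 0)
    return errno;

  /* Offset 0 with length 0 covers the whole file.  posix_fadvise
     returns the error number directly rather than setting errno.  */
  return posix_fadvise (fd, 0, 0, POSIX_FADV_DONTNEED);
}

/* Hypothetical helper: write LEN bytes from BUF at OFFSET, then request
   that the range just written be dropped.  This assumes FD was opened
   with O_SYNC, so the data is already on storage when pwrite returns.
   A short write is not retried here.
   Return 0 on success, otherwise an error number.  */
int
write_and_drop (int fd, void const *buf, size_t len, off_t offset)
{
  ssize_t n = pwrite (fd, buf, len, offset);
  if (n < 0)
    return errno;
  if (n == 0)
    return 0;   /* Nothing written; a length of 0 would mean "to EOF".  */

  return posix_fadvise (fd, offset, n, POSIX_FADV_DONTNEED);
}
```

The first helper corresponds loosely to the "conv=fdatasync count=0" example,
the second to the streaming "oflag=nocache,sync" example. In both cases the
advice remains only a request: a return value of 0 does not prove that any
page was actually dropped.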
