Jim Meyering wrote:
> Eric Sandeen mentioned that dd's O_DIRECT-exposing code
> didn't always work.  The problem is that the kernel imposes
> draconian restrictions on the size of buffer that one may
> write to an FD opened with O_DIRECT.  Currently, at least on
> ext4 and with a recent linux kernel, that buffer size must
> be a multiple of 512, which happens to be dd's default buffer size.

(there are alignment restrictions too, just FWIW)

...

> Note also that the code makes no attempt to determine the
> appropriate block size.  So if you chose obs=42, you lose.
> The adventurous user who experiments with oflags=direct
> must also select a block size that works with O_DIRECT
> and the destination file system.

I think that's fine, docs can say "subject to size and alignment
requirements of the kernel and filesystem" or something I think.

> Here's the patch I expect to push soon:
>
> From 2f4004be6131e049a1452ee68377ae1756673bd4 Mon Sep 17 00:00:00 2001
> From: Jim Meyering <[email protected]>
> Date: Tue, 4 Aug 2009 19:54:58 +0200
> Subject: [PATCH] dd: work around buffer length restrictions with oflag=direct
>  (O_DIRECT)
>
> * src/dd.c (iwrite): Turn off O_DIRECT for any
> smaller-than-obs-sized write.  Don't bother to restore it.
> * tests/dd/direct: New test for the above.
> * tests/Makefile.am (TESTS): Add od/direct.
> * doc/coreutils.texi (dd invocation): Mention oflags=direct
> buffer size restriction.
> Reported by Eric Sandeen.
> ---
>  doc/coreutils.texi |    4 ++++
>  src/dd.c           |   10 +++++++++-
>  tests/Makefile.am  |    1 +
>  tests/dd/direct    |   40 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 54 insertions(+), 1 deletions(-)
>  create mode 100755 tests/dd/direct
>
> diff --git a/doc/coreutils.texi b/doc/coreutils.texi
> index acec76e..6fa2602 100644
> --- a/doc/coreutils.texi
> +++ b/doc/coreutils.texi
> @@ -7861,6 +7861,10 @@ dd invocation
>  @opindex direct
>  @cindex direct I/O
>  Use direct I/O for data, avoiding the buffer cache.
> +Note that the kernel may impose restrictions on read or write buffer sizes.
> +For example, with an ext4 destination file system and a linux-based kernel,
> +using @samp{oflags=direct} will cause writes to fail with @code{EINVAL} if the
> +output buffer size is not a multiple of 512.

I dunno how much O_DIRECT tutorial you want here but alignment matters too :)

>  @item directory
>  @opindex directory
> diff --git a/src/dd.c b/src/dd.c
> index 9a9d22a..43ad718 100644
> --- a/src/dd.c
> +++ b/src/dd.c
> @@ -837,6 +837,14 @@ iwrite (int fd, char const *buf, size_t size)
>  {
>    size_t total_written = 0;
>
> +  if ((output_flags & O_DIRECT) && size < output_blocksize)
> +    {
> +      int old_flags = fcntl (STDOUT_FILENO, F_GETFL);
> +      if (fcntl (STDOUT_FILENO, F_SETFL, old_flags & ~O_DIRECT) != 0)
> +        error (0, errno, _("failed to turn off O_DIRECT: %s"),
> +               quote (output_file));
> +    }
> +
>    while (total_written < size)
>      {
>        ssize_t nwritten;

I suppose it would be nice to at least make that last buffered IO
synchronous if possible, or sync the file afterwards.
POSIX_FADV_DONTNEED for extra credit?  :)

Thanks,
-Eric
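
For illustration only, here is a rough sketch of what "sync the file
afterwards" plus POSIX_FADV_DONTNEED could look like for that final
short write.  This is not code from dd.c or from the patch:
write_tail_buffered is a hypothetical helper name, and the sketch
assumes a Linux/glibc system and an fd that refers to a regular file
(fdatasync fails with EINVAL on pipes and similar).

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper, not part of dd.c: write a final short block
   through the page cache, then flush it and ask the kernel to drop
   the cached pages.  Returns 0 on success, -1 with errno set.  */
static int
write_tail_buffered (int fd, char const *buf, size_t size)
{
  assert (buf != NULL);

  /* Clear O_DIRECT (if set) so that a size which is not a multiple
     of the required block size no longer fails with EINVAL.  */
  int flags = fcntl (fd, F_GETFL);
  if (flags == -1)
    return -1;
  if ((flags & O_DIRECT) && fcntl (fd, F_SETFL, flags & ~O_DIRECT) == -1)
    return -1;

  /* Ordinary buffered write, retrying on EINTR and short writes.  */
  size_t done = 0;
  while (done < size)
    {
      ssize_t n = write (fd, buf + done, size - done);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      done += (size_t) n;
    }

  /* Force the buffered data out to stable storage.  */
  if (fdatasync (fd) != 0)
    return -1;

  /* Advisory only: offset 0 with length 0 covers the whole file.
     posix_fadvise returns an error number instead of setting errno;
     a failure is deliberately ignored in this sketch.  */
  (void) posix_fadvise (fd, 0, 0, POSIX_FADV_DONTNEED);
  return 0;
}
```

The fdatasync call comes before the fadvise on purpose: dirty pages
generally cannot be dropped from the cache, so flushing first gives
the DONTNEED hint a chance to have some effect.  Whether dd should do
any of this is of course a separate question.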
