Paul reminded me that for small files, their storage might be completely within an inode. Now that would not be cleared if one was writing a full block, which shred does by default. Now that full block write to clear the slack space at the end of a file is useful, especially is used in a consistent manner to remove all files from a file system, so to cater for both use cases you could write the small amount first, and then pad out to the full block.
Now we shouldn't jump through any hoops to improve shredding of regular files, since there are so many edge cases we can't handle with journalling file systems etc., but in this case the improvement is easy and so probably worth adding. Proposed untested implementation attached. Also noted therein is the possiblity for also handling: http://en.wikipedia.org/wiki/Tail_packing thanks, Pádraig.
>From 89c974daa6a6cac0238783c8116afa8a95fe2584 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]> Date: Fri, 4 Apr 2014 13:35:56 +0100 Subject: [PATCH] shred: overwrite inode storage used by some file systems * doc/coreutils.texi (shred invocation): Mention some reasons why clearing slack space might be useful. * src/shred.c (do_wipefd): Add initial writes for each pass for small regular files in case the storage for those is in the inode, and thus a larger write up to a block size would bypass that. Move the direct I/O control to... (dopass): ... here so we can avoid enabling it for these small initial writes. It's better to retry direct I/O for each pass anyway to handle the case where direct I/O is disabled for only the last portion of a file when the size is not a multiple of the block size. Note we don't avoid the sync for the initial write as it will be small but more importantly could be on a different part of the disk and so worth doing independently. * NEWS: Mention the improvements. The inode storage issue was mentioned by Paul Eggert. --- NEWS | 4 +- doc/coreutils.texi | 5 +- src/shred.c | 112 ++++++++++++++++++++++++++++++---------------------- 3 files changed, 71 insertions(+), 50 deletions(-) diff --git a/NEWS b/NEWS index c6451b2..9f06d1e 100644 --- a/NEWS +++ b/NEWS @@ -58,7 +58,9 @@ GNU coreutils NEWS -*- outline -*- in case the look-up within the chroot fails due to library conflicts etc. shred now supports multiple passes on GNU/Linux tape devices by rewinding - the tape before each pass. Also redundant writes to empty files are avoided. + the tape before each pass, avoids redundant writes to empty files, + uses direct I/O for all passes where possible, and clears inode storage + used for small files on some file systems. split avoids unnecessary input buffering, immediately writing input to output which is significant with --filter or when writing to fifos or stdout etc. diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 75875d8..6c49385 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -9634,8 +9634,9 @@ Display to standard error all status updates as sterilization proceeds. @opindex -x @opindex --exact By default, @command{shred} rounds the size of a regular file up to the next -multiple of the file system block size to fully erase the last block -of the file. +multiple of the file system block size to fully erase the slack space in +the last block of the file. This space may contain portions of the current +system memory on some systems for example. Use @option{--exact} to suppress that behavior. Thus, by default if you shred a 10-byte regular file on a system with 512-byte blocks, the resulting file will be 512 bytes long. With this option, diff --git a/src/shred.c b/src/shred.c index ed37051..187446f 100644 --- a/src/shred.c +++ b/src/shred.c @@ -406,12 +406,12 @@ dorewind (int fd, struct stat const *st) } /* - * Do pass number k of n, writing "size" bytes of the given pattern "type" - * to the file descriptor fd. Qname, k and n are passed in only for verbose - * progress message purposes. If n == 0, no progress messages are printed. + * Do pass number K of N, writing *SIZEP bytes of the given pattern TYPE + * to the file descriptor FD. K and N are passed in only for verbose + * progress message purposes. If N == 0, no progress messages are printed. * - * If *sizep == -1, the size is unknown, and it will be filled in as soon - * as writing fails. + * If *SIZEP == -1, the size is unknown, and it will be filled in as soon + * as writing fails with ENOSPC. * * Return 1 on write error, -1 on other error, 0 on success. */ @@ -428,10 +428,6 @@ dopass (int fd, struct stat const *st, char const *qname, off_t *sizep, size_t soff; /* Offset into buffer for next write */ ssize_t ssize; /* Return value from write */ - /* Do nothing for --size=0 or regular empty files. */ - if (size == 0) - return 0; - /* Fill pattern buffer. Aligning it to a page so we can do direct I/O. */ size_t page_size = getpagesize (); #define PERIODIC_OUTPUT_SIZE (60 * 1024) @@ -448,12 +444,18 @@ dopass (int fd, struct stat const *st, char const *qname, off_t *sizep, char pass_string[PASS_NAME_SIZE]; /* Name of current pass */ bool write_error = false; bool other_error = false; - bool tried_without_directio = false; /* Printable previous offset into the file */ char previous_offset_buf[LONGEST_HUMAN_READABLE + 1]; char const *previous_human_offset IF_LINT ( = 0); + /* As a performance tweak, avoid direct I/O for small sizes, + as it's just a performance rather then security consideration, + and direct I/O can often be unsupported for small non aligned sizes. */ + bool try_without_directio = 0 < size && size < output_size; + if (! try_without_directio) + direct_mode (fd, true); + if (! dorewind (fd, st)) { error (0, errno, _("%s: cannot rewind"), qname); @@ -517,11 +519,11 @@ dopass (int fd, struct stat const *st, char const *qname, off_t *sizep, at all on some (file) systems, or with the current size. I.E. a specified --size that is not aligned, or when dealing with slop at the end of a file with --exact. */ - if (k == 1 && !tried_without_directio && errno == EINVAL) + if (k == 1 && !try_without_directio && errno == EINVAL) { direct_mode (fd, false); ssize = 0; - tried_without_directio = true; + try_without_directio = true; continue; } error (0, errnum, _("%s: error writing at offset %s"), @@ -841,13 +843,14 @@ do_wipefd (int fd, char const *qname, struct randint_source *s, { size_t i; struct stat st; - off_t size; /* Size to write, size to read */ - unsigned long int n; /* Number of passes for printing purposes */ + off_t size; /* Size to write, size to read */ + off_t i_size = 0; /* For small files, initial size to overwrite inode */ + unsigned long int n; /* Number of passes for printing purposes */ int *passarray; bool ok = true; struct randread_source *rs; - n = 0; /* dopass takes n -- 0 to mean "don't print progress" */ + n = 0; /* dopass takes n == 0 to mean "don't print progress" */ if (flags->verbose) n = flags->n_iterations + flags->zero_fill; @@ -868,8 +871,6 @@ do_wipefd (int fd, char const *qname, struct randint_source *s, return false; } - direct_mode (fd, true); - /* Allocate pass array */ passarray = xnmalloc (flags->n_iterations, sizeof *passarray); @@ -881,21 +882,21 @@ do_wipefd (int fd, char const *qname, struct randint_source *s, size = st.st_size; if (size < 0) { + ok = false; error (0, 0, _("%s: file has negative size"), qname); - return false; + goto wipefd_out; } if (! flags->exact) { /* Round up to the nearest blocksize to clear slack space. */ off_t remainder = size % ST_BLKSIZE (st); + if (size && size < ST_BLKSIZE (st)) + i_size = size; if (remainder != 0) { off_t size_incr = ST_BLKSIZE (st) - remainder; - if (! INT_ADD_OVERFLOW (size, size_incr)) - size += size_incr; - else - size = OFF_T_MAX; + size += MIN (size_incr, OFF_T_MAX - size); } } } @@ -913,53 +914,70 @@ do_wipefd (int fd, char const *qname, struct randint_source *s, } } } + else if (S_ISREG (st.st_mode)) + { + off_t fsize = st.st_size; + if (fsize > 0 && fsize < ST_BLKSIZE (st) && size > fsize) + i_size = fsize; + } /* Schedule the passes in random order. */ genpattern (passarray, flags->n_iterations, s); rs = randint_get_source (s); - /* Do the work */ - for (i = 0; i < flags->n_iterations; i++) + while (true) { - int err = dopass (fd, &st, qname, &size, passarray[i], rs, i + 1, n); - if (err) + off_t pass_size; + unsigned long int pn = n; + + if (i_size) { - if (err < 0) - { - memset (passarray, 0, flags->n_iterations * sizeof (int)); - free (passarray); - return false; - } - ok = false; + pass_size = i_size; + i_size = 0; + pn = 0; } - } - - memset (passarray, 0, flags->n_iterations * sizeof (int)); - free (passarray); + else if (size) + { + pass_size = size; + size = 0; + } + /* TODO: consider handling tail packing by + writing the tail padding as a separate pass, + (that would not rewind). */ + else + break; - if (flags->zero_fill) - { - int err = dopass (fd, &st, qname, &size, 0, rs, - flags->n_iterations + 1, n); - if (err) + for (i = 0; i < flags->n_iterations + flags->zero_fill; i++) { - if (err < 0) - return false; - ok = false; + int err = 0; + int type = i < flags->n_iterations ? passarray[i] : 0; + + err = dopass (fd, &st, qname, &pass_size, type, rs, i + 1, pn); + + if (err) + { + ok = false; + if (err < 0) + goto wipefd_out; + } } } - /* Okay, now deallocate the data. The effect of ftruncate on + /* Now deallocate the data. The effect of ftruncate on non-regular files is unspecified, so don't worry about any errors reported for them. */ if (flags->remove_file && ftruncate (fd, 0) != 0 && S_ISREG (st.st_mode)) { error (0, errno, _("%s: error truncating"), qname); - return false; + ok = false; + goto wipefd_out; } +wipefd_out: + memset (passarray, 0, flags->n_iterations * sizeof (int)); + free (passarray); return ok; } -- 1.7.7.6
