On 07/27/2012 07:36 AM, Martin Carroll wrote:
> a "used" value of 0 for small ascii files is technically within spec

That's not clear. The NFSv3 spec surely does not grant permission to
the server to (say) report a used count of zero at all times, claiming
that this is technically within spec. But you're right that 'grep'
should interoperate with these servers, so I pushed the following
patch into the grep master. It'd be nice to generalize this to other
apps but that's a bigger project. Thanks for the bug report.

From 2f0255e9f4cc5cc8bd619d1f217902eb29b30bc2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <[email protected]>
Date: Fri, 27 Jul 2012 12:14:14 -0700
Subject: [PATCH] grep: don't falsely report tiny text files as binary

* NEWS: Document this.
* src/main.c (file_is_binary): When we are already at apparent EOF,
skip the file-size check, as some servers use zero blocks to store
binary files.  Reported by Martin Carroll in
<http://lists.gnu.org/archive/html/bug-grep/2012-07/msg00016.html>.
---
 NEWS       |    5 +++++
 src/main.c |   17 ++++++++++++-----
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/NEWS b/NEWS
index c7922ff..753aedc 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ GNU grep NEWS                                    -*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Bug fixes
+
+  'grep' no longer falsely reports tiny text files as being binary
+  on file systems that store tiny files' contents in metadata.
+
 
 * Noteworthy changes in release 2.13 (2012-07-04) [stable]
 
diff --git a/src/main.c b/src/main.c
index dda7c9b..96e4f37 100644
--- a/src/main.c
+++ b/src/main.c
@@ -476,11 +476,18 @@ file_is_binary (char const *buf, size_t bufsize, int fd, struct stat const *st)
          represent its data, then it must have at least one hole.  */
       if (HAVE_STRUCT_STAT_ST_BLOCKS)
         {
-          off_t nonzeros_needed = st->st_size - cur + bufsize;
-          off_t full_blocks = nonzeros_needed / ST_NBLOCKSIZE;
-          int partial_block = 0 < nonzeros_needed % ST_NBLOCKSIZE;
-          if (ST_NBLOCKS (*st) < full_blocks + partial_block)
-            return 1;
+          /* Some servers store tiny files using zero blocks, so skip
+             this check at apparent EOF, to avoid falsely reporting
+             that a tiny zero-block file is binary.  */
+          off_t not_yet_read = st->st_size - cur;
+          if (0 < not_yet_read)
+            {
+              off_t nonzeros_needed = not_yet_read + bufsize;
+              off_t full_blocks = nonzeros_needed / ST_NBLOCKSIZE;
+              int partial_block = 0 < nonzeros_needed % ST_NBLOCKSIZE;
+              if (ST_NBLOCKS (*st) < full_blocks + partial_block)
+                return 1;
+            }
         }
 
       /* Look for a hole after the current location.  */
-- 
1.7.6.5
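
For anyone who wants to try the arithmetic outside of grep, here is a
minimal standalone sketch of the patched check. It is an illustration
only, not the code in src/main.c: the function name looks_sparse and
all of its parameters are hypothetical, and plain integers stand in
for the struct stat fields and the ST_NBLOCKS / ST_NBLOCKSIZE macros.
It assumes, as the patch does, that cur is the file offset just past
the buffer already read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the block-count heuristic after the patch.
   Returns true if the file has fewer allocated blocks than its
   remaining data plus the buffer would need, i.e. it must contain
   at least one hole.  None of these names come from grep itself.  */
bool
looks_sparse (int64_t file_size, int64_t cur, int64_t bufsize,
              int64_t blocks_allocated, int64_t block_size)
{
  assert (0 < block_size);

  /* At apparent EOF, skip the check: some servers report zero
     blocks for tiny files, which says nothing about holes.  */
  int64_t not_yet_read = file_size - cur;
  if (not_yet_read <= 0)
    return false;

  /* Blocks needed if the buffer and everything unread were nonzero.  */
  int64_t nonzeros_needed = not_yet_read + bufsize;
  int64_t full_blocks = nonzeros_needed / block_size;
  int partial_block = 0 < nonzeros_needed % block_size;

  return blocks_allocated < full_blocks + partial_block;
}
```

With these assumptions, a 10-byte file that has been read to EOF and
reports zero blocks (looks_sparse (10, 10, 10, 0, 512)) is no longer
flagged, whereas a 1 MiB file with only 8 blocks of 512 bytes
allocated still is.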
