Re: make check problems with coreutils 7.2 and earlier
Tim Mooney wrote: [I'm not subscribed to bug-coretuils, please Cc: me on any relevant replies] I'm building coreutils on x86_64-sun-solaris2.10, with the Sun Studio 12 and/or Express compilers. I'm building in 64 bit mode. Everything builds just fine, but before I finish packaging the software I always run make check, to see what problems might turn up and how make check from one version of the software differs from make check in previous versions. I haven't been able to successfully use make check with coreutils in a while. The problem appears to be that several of the tests are performed without the correct PATH, because instead of getting the version of the program that was compiled in src/, it's getting the version of a particular command that's part of the OS. This is especially problematic when testing yes. The Solaris version of yes doesn't support any of the command line arguments that the tests try, so the tests essentially hang while Solaris' yes loops continuously outputting things like --help. Thanks for the report. The tests are designed always to set PATH to make your shell use the just-built tools. If that's not happening for you, either the tool (yes) wasn't built or your shell is not honoring the PATH settings in the TESTS_ENVIRONMENT variable, as set in tests/check.mk. I suspect the latter. It'd be useful to know which shell configure chose for you. configure does try to find a sufficiently functional shell, but perhaps your system has managed to trick it. Please send (to this same list) the output of running your configure command, and the output from running make check. I'm configuring the software like this ./configure --prefix=/local/gnu --exec-prefix=/local/gnu \ --build x86_64-sun-solaris2.10 \ --sysconfdir=/etc/local/gnu --libdir=/local/gnu/lib/64 \ --mandir=/local/gnu/share/man --infodir=/local/gnu/share/info \ --localstatedir=/var/local/gnu \ --disable-nls and then doing gmake gmake check VERBOSE=yes (gmake is GNU make 3.81). I've checked the mailing list archives and I don't see any recent reports of this issue, though, and I would think others would have run into the same thing. Is this a known issue, or should I investigate further? We've seen similar reports, but not recently. Previous reports have been on rather old systems, with an inadequate /bin/sh and where configure failed to find a better shell. Also, the README that comes with coreutils has a section on Reporting bugs and what you should do when reporting them, but it doesn't actually That paragraph requests that you send in more details ;-) IMPORTANT: if you take the time to report a test failure, please be sure to include the output of running `make check' in verbose mode for each failing test. For example, if the test that fails is tests/misc/df, then you would run this command: (cd tests make check TESTS=misc/df VERBOSE=yes) log 21 For some tests, you can get even more detail by adding DEBUG=yes. Then include the contents of the file `log' in your bug report. include the address to report them to. People familiar with other GNU packages can probably guess what the address is, but it would still be good to actually include the address in the README. It's listed in the paragraph just below that one. But I've added another mention with the patch below: From 82169bbf6efeb9d60fc457cfd995c3ff0497365b Mon Sep 17 00:00:00 2001 From: Jim Meyering meyer...@redhat.com Date: Sat, 18 Apr 2009 09:17:04 +0200 Subject: [PATCH] doc: update README * README: (Reporting bugs): List the bug-reporting address here, too, not just in the following more test-oriented paragraph. Reported by Tim Mooney. All changes are no longer listed in version-controlled ChangeLog files, so note that contributions are attributed in the commit logs. Mention bootstrap.conf, now that it's the authoritative source of minimal prerequisite program/version# pairs. --- README | 14 ++ 1 files changed, 6 insertions(+), 8 deletions(-) diff --git a/README b/README index e7499e7..08e0bab 100644 --- a/README +++ b/README @@ -41,7 +41,7 @@ Special thanks to Paul Eggert, Brian Matthews, Bruce Evans, Karl Berry, Kaveh Ghazi, and François Pinard for help with debugging and porting these programs. Many thanks to all of the people who have taken the time to submit problem reports and fixes. All contributed changes are -attributed in the ChangeLog files. +attributed in the commit logs. And thanks to the following people who have provided accounts for portability testing on many different types of systems: Bob Proulx, @@ -173,6 +173,10 @@ run this command: For some tests, you can get even more detail by adding DEBUG=yes. Then include the contents of the file `log' in your bug report. +Send bug reports, questions, comments, etc. to bug-coreut...@gnu.org. +If you would like to suggest a patch, see
[PATCH] Fix some comment typos.
* bootstrap: Fix comment typos. * src/pr.c: Likewise. --- Hello Jim, all, a trivial patch. :-) Cheers, Ralf bootstrap |6 +++--- src/pr.c |4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/bootstrap b/bootstrap index 27e4ec2..73a850f 100755 --- a/bootstrap +++ b/bootstrap @@ -247,11 +247,11 @@ sort_ver() { #sort -V is not generally available echo $2 $1 break elif [ ! $p1 = $p2 ]; then - if [ $p1 -gt $p2 ] 2/dev/null; then #numeric comparision + if [ $p1 -gt $p2 ] 2/dev/null; then #numeric comparison echo $2 $1 - elif [ $p2 -gt $p1 ] 2/dev/null; then #numeric comparision + elif [ $p2 -gt $p1 ] 2/dev/null; then #numeric comparison echo $1 $2 - else #numeric, then lexographic comparison + else #numeric, then lexicographic comparison lp=$(printf $p1\n$p2\n | LANG=C sort -n | tail -n1) if [ $lp = $p2 ]; then echo $1 $2 diff --git a/src/pr.c b/src/pr.c index 3b6e801..f0910eb 100644 --- a/src/pr.c +++ b/src/pr.c @@ -73,7 +73,7 @@ The interference of the POSIX-compliant small letter options -w and -s: (`interference' means `setting a _separator_ with -s switches off the - column sturctur and the default - not generally - page_width, + column structure and the default - not generally - page_width, acts on -w option') options: text form / separator: equivalent new options: -w l -s[x] @@ -92,7 +92,7 @@ Options: Including version 1.22i: - Some SMALL LETTER options has been redefined with the object of a + Some SMALL LETTER options have been redefined with the object of a better POSIX compliance. The output of some further cases has been adapted to other UNIXes. A violation of downward compatibility has to be accepted. -- 1.6.1.505.gba743 ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils
Re: cp(1) fails to copy file from /proc
On Friday 17 April 2009 18:28:07 James Youngman wrote: On Fri, Apr 17, 2009 at 5:46 PM, Jim Meyering j...@meyering.net wrote: diff --git a/src/copy.c b/src/copy.c index 9b0e139..3cbeba4 100644 --- a/src/copy.c +++ b/src/copy.c @@ -699,10 +699,6 @@ copy_reg (char const *src_name, char const *dst_name, goto close_src_and_dst_desc; } last_write_made_hole = false; - - /* A short read on a regular file means EOF. */ - if (n_read != buf_size S_ISREG (src_open_sb.st_mode)) - break; } } The patch itself looks good, but it might be worth leaving in a comment indicating why the optimisation should not be reintroduced... and/or a new test (i prefer the and): if [ -e /proc/cpuinfo ] ; then cp /proc/cpuinfo cpuinfo.cp cat /proc/cpuinfo cpuinfo.cat cmp cpuinfo.cp cpuinfo.cat fi -mike signature.asc Description: This is a digitally signed message part. ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils
Re: cp(1) fails to copy file from /proc
Mike Frysinger wrote: On Friday 17 April 2009 18:28:07 James Youngman wrote: ... The patch itself looks good, but it might be worth leaving in a comment indicating why the optimisation should not be reintroduced... and/or a new test (i prefer the and): if [ -e /proc/cpuinfo ] ; then cp /proc/cpuinfo cpuinfo.cp cat /proc/cpuinfo cpuinfo.cat cmp cpuinfo.cp cpuinfo.cat fi Of course ;-) As promised, I've added a test for this below. We can't use /proc/cpuinfo, at least not precisely like that, because its cpu speed line can change due to frequency scaling. Also, that file is usually too small to trigger the failure. Here's a more complete patch, with a title and NEWS reflecting that I now think it's a linux kernel bug. I'll wait a few days before pushing, in case I learn otherwise. From a248490b206eca42b9018e596f1c7a234566838a Mon Sep 17 00:00:00 2001 From: Jim Meyering meyer...@redhat.com Date: Fri, 17 Apr 2009 18:44:18 +0200 Subject: [PATCH] cp: work around linux kernel bug: short-read != EOF on /proc Remove the optimization that avoided up to 50% of cp's read syscalls. Do not assume that a short read on a regular file indicates EOF. When reading from a file in /proc on linux [at least 2.6.9 - 2.6.29] into a 4k-byte buffer or larger, a short read does not always indicate EOF. For example, cp /proc/slabinfo /tmp copies only 4068 of the total 7493 bytes. This optimization (25719a33154f0c62ea9881f0c79ae312dd4cec7a, Improve performance a bit by optimizing away; 2005-11-24) appears to have been worth less than a 2% speed-up (and usually much less), so the impact of removing it is negligible. * src/copy.c (copy_reg): Don't exit the loop early. * tests/cp/proc-short-read: New test, lightly based on a suggestion from Mike Frysinger, to exercise this fix. * tests/Makefile.am (TESTS): Add cp/proc-short-read. * NEWS (Improve robustness): Mention this change. --- NEWS | 12 src/copy.c |7 --- tests/Makefile.am|1 + tests/cp/proc-short-read | 45 + 4 files changed, 62 insertions(+), 3 deletions(-) create mode 100755 tests/cp/proc-short-read diff --git a/NEWS b/NEWS index 5951bb5..18db634 100644 --- a/NEWS +++ b/NEWS @@ -16,6 +16,18 @@ GNU coreutils NEWS-*- outline -*- default should proceed at the speed of the disk. Previously /dev/urandom was used if available, which is relatively slow on GNU/Linux systems. +** Improved robustness + + cp would exit successfully after copying less than the full contents + of a file larger than ~4000 bytes from a linux-/proc file system to a + destination file system with a fundamental block size of 4KiB or greater. + Reading into a 4KiB-or-larger buffer, cp's read syscall would return + a value smaller than 4096, and cp would interpret that as EOF (POSIX + allows this). This optimization, now removed, saved 50% of cp's read + syscalls when copying small files. Affected linux kernels: at least + 2.6.9 through 2.6.29. + [the optimization was introduced in coreutils-6.0] + ** Portability `id -G $USER` now works correctly even on Darwin and NetBSD. Previously it diff --git a/src/copy.c b/src/copy.c index 9b0e139..c45224c 100644 --- a/src/copy.c +++ b/src/copy.c @@ -700,9 +700,10 @@ copy_reg (char const *src_name, char const *dst_name, } last_write_made_hole = false; - /* A short read on a regular file means EOF. */ - if (n_read != buf_size S_ISREG (src_open_sb.st_mode)) - break; +/* It is tempting to return early here upon a short read from a + regular file. That would save the final read syscall for each + file. Unfortunately that doesn't work for certain files in + /proc with linux kernels from at least 2.6.9 .. 2.6.29. */ } } diff --git a/tests/Makefile.am b/tests/Makefile.am index 8ce6a21..c02f2de 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -281,6 +281,7 @@ TESTS = \ cp/parent-perm-race \ cp/perm \ cp/preserve-2\ + cp/proc-short-read \ cp/proc-zero-len \ cp/r-vs-symlink \ cp/same-file \ diff --git a/tests/cp/proc-short-read b/tests/cp/proc-short-read new file mode 100755 index 000..e06143c --- /dev/null +++ b/tests/cp/proc-short-read @@ -0,0 +1,45 @@ +#!/bin/sh +# exercise cp's short-read failure when operating on 4KB files in /proc + +# Copyright (C) 2009 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software
Re: cp(1) fails to copy file from /proc
On Saturday 18 April 2009 16:58:52 Jim Meyering wrote: Mike Frysinger wrote: On Friday 17 April 2009 18:28:07 James Youngman wrote: The patch itself looks good, but it might be worth leaving in a comment indicating why the optimisation should not be reintroduced... and/or a new test (i prefer the and): if [ -e /proc/cpuinfo ] ; then cp /proc/cpuinfo cpuinfo.cp cat /proc/cpuinfo cpuinfo.cat cmp cpuinfo.cp cpuinfo.cat fi Of course ;-) As promised, I've added a test for this below. We can't use /proc/cpuinfo, at least not precisely like that, because its cpu speed line can change due to frequency scaling. Also, that file is usually too small to trigger the failure. hmm, i seem to remember cpuinfo being a pita when doing remote `ssh cp ...`, but must have been some other file. you got the idea at any rate :). Here's a more complete patch, with a title and NEWS reflecting that I now think it's a linux kernel bug. i think that characterization is incorrect. POSIX certainly states that a read value smaller than what was requested is not an error in and of itself. i havent looked at the kernel internals as to why this occurs, but if i had to make a guess, it's related to the seq_file stuff as a common helper in place of many common buggy procfs implementations. and in that case, the current behavior is to be expected. -mike signature.asc Description: This is a digitally signed message part. ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils
Re: [PATCH] sort: Add --threads option, which parallelizes internal sort.
Hello Paul, Glen, Jim, all, * Paul Eggert wrote on Fri, Apr 03, 2009 at 09:57:54PM CEST: Of course we cannot reasonably expect this one performance improvement to make 'sort' run 16x faster on a 16-CPU machine. That is because the improvement parallelizes only one part of 'sort'. Even assuming the parallelized part gets done in essentially zero time, the rest of 'sort' is still sequential (notably, input and output), and so it will dominate the wall-clock time. Obviously we would like to parallelize all of 'sort', but the idea is to do one step at a time, focusing on the easiest parts first, and showing real improvements at each step. I would be quite happy if this particular improvement gave us a 2x wall-clock improvement. I've looked at this in a bit more detail; no big conclusion but maybe a few more hints that could help. I am now pretty confident that your patch implements the threading correctly. When inserting some expensive_computation (); in the sortlines function right after `swap' is computed, then all timings look good (in relation to each other). For the expensive computation, I used a function in a separate translation unit doing some floating point, and compiled that without optimization. 'valgrind --cachegrind' shows a very low cache miss rate of 0.2% max with both 1 and 2 threads, indicating there isn't anything funny going on here. So I think the only remaining issues that could be relevant for the original timings are, IMHO: - false data sharing. Don't see where that could happen, - memory bandwidth limitations of the specific system, or - suboptimal NUMA memory distribution, That said, I did manage to get 'valgrind --tool=helgrind sort --threads=2' to show a potential error, with a large data set. I'm quite sure it is a false positive, but I post the output below for reference. The 'valgrind --tool=drd' tool could not reproduce this, though. Anyway. I tried another experiment. I replicated some large data set 8 times. I then timed a manual merge: sort -m ( sort data ) ( sort data ) ... (8 times) /dev/null and compared that with: sort --threads=P 8-fold-data manual merge: 16.75 s --threads=8: 44.11 s --threads=1: 81.32 s This comparison isn't entirely fair, as the splicing was done as a precomputation. However, the difference is so pronounced that even taking the splicing into account, the non-thread version would be quite a bit faster. So to me, it would seem that some approach going in that direction could be beneficial. Other than that, have you thought of implementing something like described in http://www.it.uom.gr/teaching/dbpp/text/node127.html? Cheers, Ralf This is the (non-portable) diff I used when running the command below. With it, sort.c:2608 is the last sortlines invocation within sortlines. diff --git a/src/sort.c b/src/sort.c index 82aa7e8..c0a0462 100644 --- a/src/sort.c +++ b/src/sort.c @@ -1,5 +1,5 @@ /* sort - sort lines of text (with all kinds of options). - Copyright (C) 1988, 1991-2008 Free Software Foundation, Inc. + Copyright (C) 1988, 1991-2009 Free Software Foundation, Inc. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -2520,6 +2520,7 @@ sortlines_thread (void *data) struct thread_args const *args = data; sortlines (args-lines, args-nlines, args-temp, args-nthreads, args-to_temp); + fprintf(stderr, thread %lu done\n, pthread_self ()); return NULL; } @@ -2540,7 +2541,7 @@ sortlines_thread (void *data) usual 2*N lines. Knuth writes that this memory optimization was originally published by D. A. Bell, Comp J. 1 (1958), 75. - This function is inline so that its tests for multthreadedness and + This function is inline so that its tests for multithreadedness and inplacedness can be optimized away in common cases. */ static void @@ -2554,6 +2555,9 @@ sortlines (struct line *restrict lines, size_t nlines, http://lists.gnu.org/archive/html/bug-coreutils/2005-10/msg00086.html in the IBM xlc 6.0.0.0 compiler in 64-bit mode. */ int swap = (0 compare (lines[-1], lines[-2])); + /* let's make this more expensive, artificially */ + //int expensive_computation (int); + //swap = expensive_computation (swap); if (to_temp) { temp[-1] = lines[-1 - swap]; @@ -2561,9 +2565,9 @@ sortlines (struct line *restrict lines, size_t nlines, } else if (swap) { - temp[-1] = lines[-1]; + struct line tmp = lines[-1]; lines[-1] = lines[-2]; - lines[-2] = temp[-1]; + lines[-2] = tmp; } } else @@ -2579,16 +2583,26 @@ sortlines (struct line *restrict lines, size_t nlines, struct thread_args args = {hi, nhi, temp - nlo, child_subthreads, to_temp}; if (child_subthreads != 0 SUBTHREAD_LINES_HEURISTIC = nlines + fprintf(stderr,
Re: cp(1) fails to copy file from /proc
Mike Frysinger wrote: On Saturday 18 April 2009 16:58:52 Jim Meyering wrote: Mike Frysinger wrote: On Friday 17 April 2009 18:28:07 James Youngman wrote: The patch itself looks good, but it might be worth leaving in a comment indicating why the optimisation should not be reintroduced... and/or a new test (i prefer the and): if [ -e /proc/cpuinfo ] ; then cp /proc/cpuinfo cpuinfo.cp cat /proc/cpuinfo cpuinfo.cat cmp cpuinfo.cp cpuinfo.cat fi Of course ;-) As promised, I've added a test for this below. We can't use /proc/cpuinfo, at least not precisely like that, because its cpu speed line can change due to frequency scaling. Also, that file is usually too small to trigger the failure. hmm, i seem to remember cpuinfo being a pita when doing remote `ssh cp ...`, but must have been some other file. you got the idea at any rate :). Here's a more complete patch, with a title and NEWS reflecting that I now think it's a linux kernel bug. i think that characterization is incorrect. POSIX certainly states that a read value smaller than what was requested is not an error in and of itself. Sure, but that's not the question. The question is whether we can assume short-read-on-regular-file implies EOF. The spec does say when a short read may occur, and this case (S_ISREG and no interrupt), is not one of them. If the list of cases in that may clause is intended to be complete, then I see no other interpretation. i havent looked at the kernel internals as to why this occurs, but if i had to make a guess, it's related to the seq_file stuff as a common helper in place of many common buggy procfs implementations. and in that case, the current behavior is to be expected. Whether it's officially a kernel bug depends mainly on how we interpret the POSIX spec for read. ___ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils