better i18n for join, uniq, etc.

2023-10-30 Thread Paul Eggert
c_isxdigit (to_uchar (b[1]))) { int esc_value = hextobin (b[1]); /* Value of \xhh escape. */ /* A hexadecimal \xhh escape sequence must have 1 or 2 hex. digits. */ ++b; - if (isxdigit (to_uchar (b[1]))) + if (c_isxd
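The rule stated in the comment above (a \xhh escape takes 1 or 2 hex digits, classified independently of the locale) can be sketched in standalone C. This is only an illustration under stated assumptions: `hex_digit_value` and `parse_hex_digits` are hypothetical names standing in for gnulib's c_isxdigit and the hextobin step, and they are not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the value of hex digit C, or -1 if C is
   not a hex digit.  It never consults the locale, which is the point
   of preferring a c_isxdigit-style test over isxdigit.  */
int
hex_digit_value (char c)
{
  if ('0' <= c && c <= '9')
    return c - '0';
  switch (c)
    {
    case 'a': case 'A': return 10;
    case 'b': case 'B': return 11;
    case 'c': case 'C': return 12;
    case 'd': case 'D': return 13;
    case 'e': case 'E': return 14;
    case 'f': case 'F': return 15;
    default: return -1;
    }
}

/* Hypothetical helper: S points just past "\x" in a NUL-terminated
   string.  Consume 1 or 2 hex digits, store their value in *VALUE,
   and return how many digits were consumed (0 if there were none,
   in which case *VALUE is left untouched).  */
int
parse_hex_digits (char const *s, int *value)
{
  assert (s != NULL && value != NULL);
  int first = hex_digit_value (s[0]);
  if (first < 0)
    return 0;
  /* s[0] is a digit, hence not the terminating NUL, so s[1] is readable.  */
  int second = hex_digit_value (s[1]);
  if (second < 0)
    {
      *value = first;
      return 1;
    }
  *value = first * 16 + second;
  return 2;
}
```

For instance, parsing "41" consumes two digits and yields 0x41, while parsing "fG" consumes only the first digit and yields 15.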

Re: sort dynamic linking overhead

2024-02-26 Thread Paul Eggert
tions, I didn't see where libcrypto (at least on Ubuntu 23.10, which has OpenSSL 3.0.10) takes advantage of these special-purpose instructions. From 7f57ac2d20c144242953a8dc7d95b02df0244751 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sun, 25 Feb 2024 17:13:12 -0800 Subject: [PATCH] so

Re: sort dynamic linking overhead

2024-02-26 Thread Paul Eggert
On 2024-02-26 06:12, Pádraig Brady wrote: On 26/02/2024 06:44, Yann Collet wrote: * xxhash128 is not a cryptographic hash function, so it doesn't attempt to be random. Just a correction: xxh128 does try to be random. And quite hardly: a significant amount of development is spent on ensuring

Re: sort dynamic linking overhead

2024-02-27 Thread Paul Eggert
Thanks for the patch. I was hoping that we didn't need to worry about older platforms needing -ldl. Oh well. The patch causes 'configure' to search for dlopen even when there's no crypto library. 'configure' could instead use AC_SEARCH_LIBS only if the AC_LINK_IFELSE fails (or simply put AC_LI

Re: coreutils-9.4.170-7b206 ls/removed-directory test failure

2024-03-26 Thread Paul Eggert
On 3/26/24 10:50, Pádraig Brady wrote: It seems that readdir() on FreeBSD 14 is _not_ eating the ENOENT from getdirentries(). Attached is an extra check for that, that avoids the test in that case. An alternative would be for ls to ignore the ENOENT from readdir(). Paul what do you think? My

Re: coreutils-9.4.170-7b206 ls/removed-directory test failure

2024-03-26 Thread Paul Eggert
On 3/26/24 11:35, Pádraig Brady wrote: Actually the FreeBSD system ls(1) does _not_ show the error in this case (it doesn't use readdir I think). If I'm reading the source code aright, FreeBSD ls uses fts, which does use readdir. It's not clear to me whether it's intended that the readdir err

bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-04 Thread Paul Eggert
On 09/04/2012 08:02 AM, Jim Meyering wrote: > I'm not 100% sold on the idea that that final unlinkat > call should be creating a dangling symlink (i.e., by removing > the directory to which "s" points). But that's what rmdir() and the rmdir command are supposed to do. That much, at least, is pret

bug#12366: [gnu-prog-discuss] Writing unwritable files

2012-09-06 Thread Paul Eggert
On 09/06/2012 05:12 AM, Paolo Bonzini wrote: > I consider "shuf foo -o foo" (on a read-write file) to be insecure. > Besides, it works by chance It's not by chance. shuf is designed to let you shuffle a file in-place, and is documented to work, by analogy with "sort -o foo foo". If we ever chang

bug#12366: [gnu-prog-discuss] Writing unwritable files

2012-09-06 Thread Paul Eggert
>> If some other process is writing F >> while I run 'sed -i F', F is not replaced atomically. > How not so? For example: echo ac >f sed -i 's/a/b/' f & sed -i 's/c/d/' f wait cat f If 'sed' were truly atomic, then the output of this would always be 'bd'. But it's not.

[PATCH] df: port the new df test to POSIX sed, larger file systems

2012-11-09 Thread Paul Eggert
* tests/df/df-output.sh: For the test "df -B1K --output=size", do not assume that the file system size fits in 9 bytes; it might be larger than that, so omit leading space. Also, use portable 'sed' commands: POSIX says sed commands inside { } should all end in newline. --- tests/df/df-output.sh |

Re: FYI: updated to latest gnulib --- almost, but not quite

2012-11-09 Thread Paul Eggert
On 11/08/2012 11:36 PM, Jim Meyering wrote: > I saw failures on 2 of 3 "make -j5 check" runs on > an old 2-core F18/i686+SSD/ext4. But when I revert the last > two changes to gnulib's tests/nap.h, I saw 10 successes in a row, > so I'd revert those if I had more time now. Feel free to revert them;

Re: [PATCH] df: port the new df test to POSIX sed, larger file systems

2012-11-09 Thread Paul Eggert
On 11/09/2012 01:50 AM, Bernhard Voelker wrote: > What about simplifying the first s/... to eliminate all blanks? Yes, could do that, if someone has the energy

[PATCH] doc: explain why dd is called "dd"

2012-11-17 Thread Paul Eggert
I pushed this hoping that it may help forestall questions about why dd is so, ahem, *unusual* * doc/coreutils.texi (dd invocation): Mention JCL. --- doc/coreutils.texi | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 7f8c0d1.

[PATCH] doc: sync parse-datetime from gnulib

2013-01-06 Thread Paul Eggert
I pushed this: * doc/coreutils.texi (Top): Sync from gnulib parse-datetime.texi menu. --- doc/coreutils.texi | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 60096af..45a4b3d 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.

[PATCH] build: update gnulib submodule to latest

2013-01-23 Thread Paul Eggert
* bootstrap.conf (gnulib_modules): Add statat. The fstatat module was split in two, and we need both halves. --- bootstrap.conf | 1 + gnulib | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/bootstrap.conf b/bootstrap.conf index d575949..bb6c145 100644 --- a/bootstrap.

Re: [PATCH] quotearg: do not read beyond end of buffer

2013-05-13 Thread Paul Eggert
On 05/12/2013 10:14 PM, Jim Meyering wrote: > I ran gcc's -fsanitize=address against coreutils, and two > sort tests failed due to buffer overruns. Both arose via > a bug in quotearg.c. Patch below. Two things remain to do: > 1) find when the bug was introduced (before push) > 2) address the

Re: coretutils package produces the ownership issue during "mv" command execution.

2013-05-14 Thread Paul Eggert
On 05/14/13 02:15, Koteswararao Nelakurthi wrote: > lchown("/media/sda2/aaa", 0, 0) = 0 > write(2, "mv: ", 4mv: ) = 4 > write(2, "failed to preserve ownership for"..., 48failed to preserve > ownership for /media/sda2/aaa) = 48 > write(2, ": Function not implemented", 26

[PATCH] tests: don't assume expr was built with GMP

2013-05-18 Thread Paul Eggert
* tests/misc/cut-huge-range.sh (subtract_one): New string. (CUT_MAX): Don't pass a too-large integer to 'expr'. --- tests/misc/cut-huge-range.sh | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/tests/misc/cut-huge-range.sh b/tests/misc/cut-huge-range.sh i

[PATCH 1/2] build: update gnulib submodule to latest

2013-05-18 Thread Paul Eggert
--- gnulib | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gnulib b/gnulib index cda5c90..1233589 16 --- a/gnulib +++ b/gnulib @@ -1 +1 @@ -Subproject commit cda5c90820d55b4b1f52d6a6f5329a10668bd720 +Subproject commit 12335899d0089131d854aa1b074f0c4d841dff42 -- 1.7.11.7

[PATCH 2/2] maint: port --enable-gcc-warnings to clang

2013-05-18 Thread Paul Eggert
* configure.ac: If clang, add -Wno-format-extra-args and -Wno-tautological-constant-out-of-range-compare. * gl/lib/rand-isaac.c (ind): * gl/lib/randread.c (readisaac): * src/ls.c (dev_ino_push, dev_ino_pop): * src/sort.c (buffer_linelim): * src/system.h (is_nul): * src/tail.c (tail_forever_inotify)

Re: bug#13530: head: memory exhausted when printing all from stdin but last P/E bytes

2013-05-27 Thread Paul Eggert
On 05/27/2013 05:07 PM, Jim Meyering wrote: > +max_BUFSIZ=$(expr 256 '*' 1024) > +lim=$(expr $SIZE_MAX - $max_BUFSIZ) Can't this code fail, due to overflow, on non-GMP hosts? See: http://lists.gnu.org/archive/html/coreutils/2013-05/msg00060.html and look for "$SIZE_MAX".

Re: bug#13530: head: memory exhausted when printing all from stdin but last P/E bytes

2013-05-27 Thread Paul Eggert
On 05/27/2013 06:04 PM, Jim Meyering wrote: > +lim=$(echo $SIZE_MAX | subtract_one_) > +lim=$(expr $lim - $max_BUFSIZ) Sorry, I don't see how this will work either. It's common for a GMP-less expr to handle values only up to SIZE_MAX / 2, and subtracting just 1 won't work around that problem. May

bug#15828: behavior of ls -f

2013-11-07 Thread Paul Eggert
On 11/07/2013 11:57 AM, Pádraig Brady wrote: > I don't see a need for -f to ignore any -l One could argue that -f should disable -l if _POSIX2_VERSION is 200112 or earlier. I'm not sure it's worth the hassle to implement that, though.

Re: [PATCH] use libcrypto routines in gnulib

2013-12-02 Thread Paul Eggert
On 12/02/2013 06:12 AM, Pádraig Brady wrote: > To use this from coreutils I configure with --with-openssl > and add in the appropriate libs as follows. > Note since the new libs are required, then is one of the reasons > I didn't enable this by default. A related question though > is I'd like core

Re: [PATCH] use libcrypto routines in gnulib

2013-12-02 Thread Paul Eggert
On 12/02/2013 01:05 PM, Pádraig Brady wrote: > each project would have > to add LIB_CRYPTO_MD5 etc. to their list of libs similarly > to the coreutils patch I had inline in my previous mail. Thanks for explaining. I tried that for Emacs and came up with the patch appended to this message. Unfort

Re: [PATCH] use libcrypto routines in gnulib

2013-12-02 Thread Paul Eggert
Thanks, that works for me; please install it into gnulib when you have the time. I do have some minor stylistic suggestions, but they're not crucial: +GL_OPENSSL_INLINE void +GL_CRYPTO_FN(_init_ctx) (struct _gl_ctx *ctx) + { (void) OPENSSL_FN(_Init) ((_gl_CTX *) ctx); } Space before pa

Re: [PATCH] use libcrypto routines in gnulib

2013-12-02 Thread Paul Eggert
Paul Eggert wrote: > + { (void) OPENSSL_FN(_Init) ((_gl_CTX *) ctx); } Also, please put that open curly brace at the start of the line, as that is part of the GNU coding standards.

Re: [PATCH] use libcrypto routines in gnulib

2013-12-02 Thread Paul Eggert
Pádraig Brady wrote: > Seems the handiest way to do this is to do the following > in configure.ac before gl_INIT: > > dnl Enable use of libcrypto by default > AC_ARG_WITH([openssl], > [AS_HELP_STRING([--with-openssl], > [use libcrypto hash routines if available: default=yes])], > [], > [

Re: [PATCH] use libcrypto routines in gnulib

2013-12-03 Thread Paul Eggert
On 12/03/2013 05:45 AM, Pádraig Brady wrote: > I'll probably do this in coreutils configure.ac before gl_INIT, > so as to at least set the default as coreutils wants and caters for, > and allowing users to --without-openssl if they want. > > dnl Enable use of libcrypto by default > AS_VAR_SET_IF([w

Re: [PATCH] md5sum, sha*sum: use libcrypto where available

2013-12-07 Thread Paul Eggert
m4/gl-openssl.m4 | 13 ++--- 2 files changed, 19 insertions(+), 3 deletions(-) diff --git a/ChangeLog b/ChangeLog index 5d935ad..9688c32 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,14 @@ 2013-12-07 Paul Eggert + md5, sha1, sha256, sha512: add gl_SET_CRYPTO_CHECK_D

Re: [PATCH] use libcrypto routines in gnulib

2013-12-08 Thread Paul Eggert
Pádraig Brady wrote: > * m4/gl-openssl.m4 (gl_CRYPTO_CHECK): Don't empty LIB_CRYPTO That would inherit LIB_CRYPTO from the environment, no? It might be better to move the LIB_CRYPTO= into the initialization code. Also, is libgcrypt compatible with libcrypt with respect to MD5, SHA512, etc.?

Re: [PATCH] use libcrypto routines in gnulib

2013-12-08 Thread Paul Eggert
Pádraig Brady wrote: > Where would be best to initialize this? Maybe m4_divert_once([DEFAULTS], [LIB_CRYPTO=])? > The libgcrypt replacement calling out to libcrypto seems to work. > tests pass anyway. I assume you're preparing a gnulib patch that would prefer libgcrypt to libcrypto, or somethin

Re: [heads-up] patch re. savedir() in src/copy.c needed when updating gnulib

2014-02-26 Thread Paul Eggert
Thanks, I pushed the attached coreutils patches to fix that. I figure if sorting by inode helps tar it'll help cp too, so I used that. From a22cd3f3514aa9d0c03a55627e0b79aa45bf8ac3 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Wed, 26 Feb 2014 11:22:16 -0800 Subject: [PATCH 1/

Re: [heads-up] patch re. savedir() in src/copy.c needed when updating gnulib

2014-02-27 Thread Paul Eggert
UI change -- Sergey, what do you think? From 5bdd09b4b5246d852b63455f1d629f38be115bf9 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Wed, 26 Feb 2014 23:57:26 -0800 Subject: [PATCH 1/2] savedir: new symbol for fast-read version * lib/savedir.h (SAVEDIR_SORT_FASTREAD): New symbol, for program

[PATCH] stat: port birthtime to Solaris 11

2014-03-18 Thread Paul Eggert
Problem reported by Rich Burridge. * src/stat.c [HAVE_GETATTRAT]: Include , . (print_statfs, print_stat, print_it): Pass fd, too, for the benefit of get_birthtime. All uses changed. (get_birthtime): New function, for porting to Solaris 11. (print_stat): Use it. * configure.ac (getattrat, LIB_NVPAI

Re: [PATCH] shred: overwrite inode storage used by some file systems

2014-04-04 Thread Paul Eggert
+ else if (S_ISREG (st.st_mode)) +{ + off_t fsize = st.st_size; + if (fsize > 0 && fsize < ST_BLKSIZE (st) && size > fsize) +i_size = fsize; +} This can be simplified. There's no need to worry about checking whether st.st_size == 0 since the code would do the right t

[PATCH] doc: use nicer quotes

2014-05-24 Thread Paul Eggert
* doc/coreutils.texi: Add "@documentencoding UTF-8". --- doc/coreutils.texi | 1 + 1 file changed, 1 insertion(+) diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 592f4a6..a6dd075 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -2,6 +2,7 @@ @c %**start of header @setfilena

Re: [PATCH] doc: use nicer quotes

2014-05-25 Thread Paul Eggert
Pádraig Brady wrote: I didn't notice any quoting changes in the generated info or pdf file at least. I think it depends on the Texinfo version. I'm using texinfo 5.2, the latest stable version. UTF-8 should be used where possible these days. Should 'make dist' check that the documentatio

Re: [PATCH] gettext: update macros to version 0.19

2014-07-09 Thread Paul Eggert
On 07/09/2014 04:49 PM, Pádraig Brady wrote: So I'm proposing the attached update for coreutils Looks good to me; thanks.

Re: printf-safe checks of invalid long double values

2014-11-27 Thread Paul Eggert
Pádraig Brady wrote: Are these checks backed up by corresponding replacement code? Are these checks correct? Why has glibc not been updated in the 7 years since the checks were added? As I recall, this comes from an old dispute about what glibc should do when asked to print floating-p

Re: printf-safe checks of invalid long double values

2014-11-28 Thread Paul Eggert
Pádraig Brady wrote: 3. Since glibc no longer crashes, and no-one has complained about these edge cases of invalid numbers, just avoid this replacement altogether but push for the improvement to output "nan" in these cases in glibc. Thanks, I like this option the best.

Re: Z and Y suffixes in xstrtol.c

2014-12-16 Thread Paul Eggert
On 12/16/2014 05:46 AM, Pádraig Brady wrote: if (xstrtoul (optarg, NULL, 10, &val, "") != LONGINT_OK -|| MIN (INT_MAX, SIZE_MAX) < val) - error (EXIT_FAILURE, 0, _("%s: invalid number"), optarg); +|| (MIN (INT_MAX, SIZE_MAX) < val && (errno = EOVERFLOW))

[bug #19546] mkdir -p should use default ACL for parent directories

2015-03-09 Thread Paul Eggert
Update of bug #19546 (project coreutils): Status:None => Fixed Open/Closed:Open => Closed ___ Follow-up Comment #4: This was fixed as

Re: coreutils + maint.mk + public-submodule-commit

2015-04-09 Thread Paul Eggert
On 04/09/2015 11:20 AM, Andreas Grünbacher wrote: Sounds like the Savannah server should be checking that for all its projects that use gnulib. No, because some projects deliberately don't want this check. Emacs is one (gnulib imports are copied into the Emacs repository), Tar is another

Re: coreutils + maint.mk + public-submodule-commit

2015-04-09 Thread Paul Eggert
Andreas Grünbacher wrote: This behavior is discouraging testing; I find that quite annoying. Me too. It's bitten me several times and I can never remember how to shut it off. Perhaps we should remove that test from 'make check' and have a further rule 'make checker' that is stronger than '

bug#20667: [GNULIB v2 1/2] file-has-acl: Split feature tests again

2015-05-27 Thread Paul Eggert
I found one nit: +AC_CHECK_HEADERS([linux/xattr.h]) +AC_CHECK_HEADERS([sys/xattr.h], + [AC_CHECK_FUNCS_ONCE([getxattr])]) This is mixing _ONCE and non-ONCE calls, which doesn't work as expected. Simplest fix is to replace AC_CHECK_FUNCS_ONCE with AC_CHECK_FUNCS.

bug#20666: [GNULIB v2 2/2] qacl: Reimplement qset_acl and qcopy_acl

2015-05-27 Thread Paul Eggert
On 05/26/2015 01:53 PM, Andreas Gruenbacher wrote: --- lib/acl-internal.c| 30 ++ This one is missing a patch to ChangeLog. Please put the commit message into the ChangeLog. Also, please put the string "Bug#20666" somewhere into the commit message body and the ChangeLog (they should

[coreutils] Re: [PATCH]: ls: do not show long iso time format for en_* locales

2010-06-30 Thread Paul Eggert
[Sorry if this is a duplicate; my first attempt to send this flopped I think.] >>> * There are more users in non-English locales than in non-"C" English >>> locales, and the harm in the non-English case (incomprehensible >>> dates) is much greater than the harm in the English case >>> (compr

bug#6524: du now uses less than half as much memory, sometimes

2010-07-02 Thread Paul Eggert
copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +/* written by Paul Eggert and Jim Meyering */ + +#include +#include "di-set.h" + +#include "hash.h" +#include "ino-map.h" + +#include +#include + +/

[coreutils] Re: suggested cmp optimization for sparse files

2010-07-23 Thread Paul Eggert
On 07/23/10 09:13, Eric Blake wrote: > http://lists.gnu.org/archive/html/bug-coreutils/2010-07/msg00100.html > > I think porting cmp to use this would be a nice optimization Thanks, good suggestion! Also, "diff" should do it.

[coreutils] [PATCH] sort: don't assume ASCII when parsing K, M, G suffixes

2010-07-26 Thread Paul Eggert
* src/sort.c (find_unit_order): Don't assume ASCII. --- src/sort.c |6 +- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/src/sort.c b/src/sort.c index 577521d..1fd4ce7 100644 --- a/src/sort.c +++ b/src/sort.c @@ -1818,7 +1818,11 @@ find_unit_order (char const *number, struc

[coreutils] [PATCH] sort: fix bug with EOF at buffer refill

2010-07-26 Thread Paul Eggert
* src/sort.c (fillbuf): Don't append eol unless the line is nonempty. This fixes a bug that was partly but not completely fixed by the aadc67dfdb47f28bb8d1fa5e0fe0f52e2a8c51bf commit (dated July 15). * tests/misc/sort (realloc-buf-2): New test, which catches this bug on 64-bit hosts. --- src/sort.

[coreutils] [PATCH] sort: fix --debug display with very large offsets

2010-07-27 Thread Paul Eggert
* src/sort.c (mark_key): Don't assume offset <= INT_MAX. Make the code a bit clearer when width != 0. --- src/sort.c |7 +-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/src/sort.c b/src/sort.c index 588bae8..f552d21 100644 --- a/src/sort.c +++ b/src/sort.c @@ -2162,14 +2

Re: [coreutils] [PATCH] sort: fix --debug display with very large offsets

2010-07-27 Thread Paul Eggert
On 07/27/10 13:29, Eric Blake wrote: > I'd rather see something along the lines of: > > while (INT_MAX < offset) > { > printf ("%*s", INT_MAX, ""); > offset -= INT_MAX; > } > printf ("%*s", (int) offset, ""); That'd be fine too. It's only used during debugging, and there is a simil
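The chunked-padding loop quoted in this message can be sketched as a standalone helper. This is a minimal illustration, not the code in sort.c: the name `pad_spaces` and its `chunk` parameter are hypothetical, introduced so the loop can be exercised with small values, whereas the quoted loop uses INT_MAX as the chunk size because the "%*s" width argument is an int.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write OFFSET spaces to OUT using at most CHUNK
   spaces per fprintf call.  Returns the number of characters written,
   or a smaller number if an output error occurs.  */
uintmax_t
pad_spaces (FILE *out, uintmax_t offset, int chunk)
{
  uintmax_t written = 0;
  assert (out != NULL && 0 < chunk);

  /* Emit full chunks while more than one chunk remains.  */
  while ((uintmax_t) chunk < offset)
    {
      int n = fprintf (out, "%*s", chunk, "");
      if (n < 0)
        return written;
      written += n;
      offset -= chunk;
    }

  /* The remainder now fits in an int, so the cast is safe.  */
  if (offset != 0)
    {
      int n = fprintf (out, "%*s", (int) offset, "");
      if (n < 0)
        return written;
      written += n;
    }
  return written;
}
```

Calling the helper with an offset of 10 and a chunk of 4 issues three writes of 4, 4, and 2 spaces.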

[coreutils] [PATCH] sort: -h now handles comparisons such as 6000K vs 5M and 5MiB vs 5MB

2010-07-30 Thread Paul Eggert
installed the following: From ab94b1fda7a994e97fed8f4c90872f508be5cd73 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Fri, 30 Jul 2010 01:52:59 -0600 Subject: [PATCH] sort: -h now handles comparisons such as 6000K vs 5M and 5MiB vs 5MB * NEWS: Document changes to sort -h. *

Re: [coreutils] [PATCH] sort: -h now handles comparisons such as 6000K vs 5M and 5MiB vs 5MB

2010-07-30 Thread Paul Eggert
On 07/30/10 05:06, Pádraig Brady wrote: > Perhaps since strtold() is so heavy weight anyway, > we could strip commas first? Yes, that's easily doable; I'll look into that soon. By the way, my ulterior motive for doing this patch is to fix up the rat's nest of code buried deeply inside the core co

Re: [coreutils] [PATCH] sort: -h now handles comparisons such as 6000K vs 5M and 5MiB vs 5MB

2010-08-02 Thread Paul Eggert
ady treated. Here's what I installed: From 90feb6380b581f81935d66ac21f2c889e1a5ac8b Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Mon, 2 Aug 2010 19:18:01 -0700 Subject: [PATCH] sort: revert recent -h changes and use a more-conservative approach * NEWS: Document changes to sort -h,

[coreutils] [PATCH] sort: fix bug in --debug when \0 is followed by \t

2010-08-03 Thread Paul Eggert
* src/sort.c (debug_width): New function, which does not stop counting tabs at \0, and also invokes mbsnwidth. Stamp out strnlen! (count_tabs): Remove. (debug_key): Use debug_width instead of mbsnwidth and count_tabs. * tests/misc/sort-debug-keys: Check that \0 and \t intermix. --- src/sort.c

[coreutils] [PATCH] init.sh: work around trap limitation of some shells

2010-08-03 Thread Paul Eggert
Hmm, maybe bootstrap should copy gnulib-tests/init.sh to tests/init.sh so that we don't have to do this stuff by hand? From 13fbe90cf4008ae30625bcfd857201963ff273d5 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Tue, 3 Aug 2010 13:01:16 -0700 Subject: [PATCH] init.sh: work aro

bug#6789: propose renaming gnulib memxfrm to amemxfrm (naming collision with coreutils)

2010-08-03 Thread Paul Eggert
On 2009-03-07 Bruno Haible wrote: > Paul Eggert has written the module 'memcoll', which generalizes the 'strcoll' > function to work on strings with embedded NULs. > Here is the generalization of 'strxfrm' to strings with embedded NUL bytes. Sorry, I did

[coreutils] [PATCH] sort: -R now uses less memory on long lines with internal NULs

2010-08-04 Thread Paul Eggert
. See the - GNU General Public License for more details. - - You should have received a copy of the GNU General Public License - along with this program. If not, see <http://www.gnu.org/licenses/>. */ - -/* Written by Paul Eggert . */ - -#include - -#include "memxfrm.h" - -#inc

[coreutils] [PATCH] sort: tune and refactor --debug code, and fix minor underlining bug

2010-08-05 Thread Paul Eggert
Formerly, the 'compare' function and some of its subroutines had a debugging flag, which caused them to output underlines. This change refactors the code so that debugging output is more-separated from the actual sorting. In the process, the change fixes a minor error in the debugging output. Th

Re: [coreutils] [PATCH] sort: tune and refactor --debug code, and fix minor underlining bug

2010-08-05 Thread Paul Eggert
On 08/05/10 13:46, Eric Blake wrote: > Hmm - we document that '1M' and 'M' are synonyms in the 'Block size' > node of the manual They are synonymous in the operand to --block-size, but that's a different syntax. For example, --block-size="'1kB" is also documented there, but "'1kB" is not a valid

[coreutils] [PATCH] sort: support all combinations of -d, -f, -i, -R, and -V

2010-08-06 Thread Paul Eggert
* NEWS: Document this. * src/sort.c (getmonth): Omit LEN arg, as MONTH is now null-terminated. (compare_random): Don't null-terminate keys, as caller now does that. (compare_version): Remove. (debug_key): Null-terminate string for getmonth. (keycompare): Support combining -R with any of -d, -f, -i,

Re: [coreutils] [PATCH] sort: fix bug in --debug when \0 is followed by \t

2010-08-08 Thread Paul Eggert
On 08/08/10 17:41, Pádraig Brady wrote: > Are there other reasons for not using strnlen() > apart from it not dealing with NULs in the input? Generally speaking, a data structure where strnlen makes sense is a data structure that is probably poorly designed. strnlen was originally designed for th

[coreutils] [PATCH] sort, who: prefer free+malloc to realloc when contents are irrelevant

2010-08-10 Thread Paul Eggert
This change was prompted by the previous one: I audited the code looking for similar examples. Too bad valgrind doesn't catch this. * src/sort.c (check, mergefps): xrealloc -> free + xmalloc * src/who.c (print_user): Likewise. --- src/sort.c |6 -- src/who.c |9 ++--- 2 files cha
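The idea behind this change can be sketched in standalone C. When the old contents of a buffer are not needed, freeing it and allocating afresh avoids the copy that realloc may perform when it has to move the block. The helper name `regrow_discard` is hypothetical, and the sketch uses plain malloc with an error return rather than the xmalloc wrapper named in the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: replace *BUF with a fresh block of NEW_SIZE
   bytes, discarding whatever *BUF held before.  Returns 0 on success,
   or -1 on allocation failure, in which case *BUF is NULL.  */
int
regrow_discard (void **buf, size_t new_size)
{
  assert (buf != NULL);

  /* The old contents are irrelevant, so release them first; unlike
     realloc, nothing is ever copied into the new block.  */
  free (*buf);

  /* Request at least one byte so that a NULL result always means
     failure rather than a zero-size allocation.  */
  *buf = malloc (new_size ? new_size : 1);
  return *buf ? 0 : -1;
}
```

A caller that is about to overwrite the whole buffer would call this in place of realloc and then fill the new block from scratch.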

[coreutils] ignore-value.h considered harmful

2010-08-10 Thread Paul Eggert
Re that recent change to sort.c to insert a call to ignore_value. This ignore_value business is ugly, and runs against the spirit of the GNU coding standards: "Don't make the program ugly to placate lint. Please don't insert any casts to void."

Re: [coreutils] [PATCH] sort: -R now uses less memory on long lines with internal NULs

2010-08-11 Thread Paul Eggert
On 08/12/10 00:49, Pádraig Brady wrote: > Is it uint32_t for alignment (speed)? From sort's point of view, it's uint32_t because that's what the md5 library specifies. I haven't looked into md5 and don't know if it could be sped up by assuming 64-bit integers. > Would it be worth doing a memcm

Re: [coreutils] [PATCH] sort: tune and refactor --debug code, and fix minor underlining bug

2010-08-11 Thread Paul Eggert
On 08/12/10 01:06, Pádraig Brady wrote: > The disadvantage of separating the debugging code from the > actual sorting code is that one now has to maintain the > extent matching in 2 places, which means we're less sure that > the debug output matches what's actually being done. True, but there is a

[coreutils] [PATCH] * tests/misc/sort (use-nl): Fix comment to match the test case.

2010-08-13 Thread Paul Eggert
7;, {IN=>"\0b\n\0a\n"}, {OUT=>"\0a\n\0b\n"}], # Paul Eggert wrote: -# I tested the revised `sort' against Solaris `sort', and found a -# discrepancy that turns out to be a longstanding bug in GNU sort. -# POSIX.2 specifies that a newline is part of the input l

[coreutils] Re: [Bug-tar] [PATCH] improved sparse file detection

2010-08-24 Thread Paul Eggert
aught that corner case to recognize an entirely sparse file >> as a single hole. That's a good suggestion; I'll look into that. Here's the patch I installed. Here I used "diff -b" to avoid unimportant indentation changes. From: Paul Eggert Date: Tue, 24 Aug 2010 1

[coreutils] [PATCH] sort: destroy spin locks portably

2010-09-20 Thread Paul Eggert
* src/sort.c (sortlines, sort): Use pthread_spin_destroy when a spin lock is no longer used. This isn't needed on GNU/Linux or Solaris, but POSIX says it may free up resources on some platforms. --- src/sort.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/src/sort.c b

[coreutils] How do you copy 60 million files?

2010-09-24 Thread Paul Eggert
OK, OK, so I shouldn't waste time reading The Register, but I do, so I can't resist sharing a pointer to this story: Pott T. How do you copy 60m files? The Register (2010-09-24) Basically, Pott's problem was that he had to copy 60 mil

bug#7241: Possible bug on split ?

2010-10-18 Thread Paul Eggert
On 10/18/10 10:19, Ulf Zibis wrote: > IMO this is a bug, or should be documented more explicit. I'd say fix the doc. Do you have a suggestion for improving the wording?

[coreutils] [PATCH] du: don't print junk when diagnosing out-of-range time stamps

2010-10-23 Thread Paul Eggert
t's safe to get back into the water after the recent coreutils release. From afb834402d639936977bed7db35cda48b609e46f Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sat, 23 Oct 2010 11:54:55 -0700 Subject: [PATCH] du: don't print junk when diagnosing out-of-range time stamps * src/d

[coreutils] Re: [PATCH] du: don't print junk when diagnosing out-of-range time stamps

2010-10-23 Thread Paul Eggert
oseconds. From ff50010c65f0eb4ad4fade6b65ce8f349b13e31f Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sat, 23 Oct 2010 17:20:01 -0700 Subject: [PATCH] du: don't print junk when diagnosing out-of-range time stamps * src/du.c (show_date): Fix call to fputs with a buffer that contains some uni

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-11-26 Thread Paul Eggert
Thanks for the bug report. Unfortunately, I cannot reproduce the problem with coreutils 8.7, either on RHEL 5.5 x86-64 or on Ubuntu 10.10 x86. Which version of coreutils are you running? And on what platform? How did you build it? Can you reproduce it with --parallel=2? If not, which value of

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-11-27 Thread Paul Eggert
On 11/26/2010 06:52 PM, Pádraig Brady wrote: > Hmm, seems like multiple threads are racing to update the > static "saved" variable in write_unique() ? I don't think it's as simple as that. write_unique is generating output, and when it is run it is supposed to have exclusive access to the output

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-11-27 Thread Paul Eggert
Following up on my previous email, it appears to me that the following line in mergelines_node is weird: node->dest -= lo_orig - node->lo + hi_orig - node->hi; Surely there should be a "*" in front of that line? (This does not fix the bug; perhaps it is a different bug?)

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-11-27 Thread Paul Eggert
Could you please try this little patch? It should fix your problem. I came up with this fix in my sleep (literally! I woke up this morning and the patch was in my head), but haven't had time to look at the code in this area to see if it's the best fix. Clearly there's at least one more bug as no

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-11-29 Thread Paul Eggert
On 11/28/10 23:14, DJ Lucas wrote: > http://lists.gnu.org/archive/html/coreutils/2010-11/msg00124.html Ah, sorry, I didn't understand that message and thought Pádraig had handled it. On an 8-core RHEL 5.5 x86-64 host I reproduced the problem with the stated test case: (for i in $(seq 12); do r

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-11-29 Thread Paul Eggert
On 11/29/10 12:14, Jim Meyering wrote: > I haven't tried to trigger that one yet. > Have you? No, sorry, haven't had time. My current guess, by the way, is that it's not a bug that can be triggered: it's merely useless code that is harmless and can safely be removed. (This is a guess that I also

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-11-29 Thread Paul Eggert
On 11/29/10 16:34, Chen Guo wrote: > The only way this would work is if, when a struct is locked via mutex the only > threads trying to acquire the struct are trying to do so via mutex, > and no threads > are looking to lock via spinlock. Yes, that's definitely the idea. Under either of my propos

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-11-30 Thread Paul Eggert
On 11/30/10 13:41, Jim Meyering wrote: > Is there anything you'd like to add? No, thanks, that looks good. I have some other patches to clean things up in this area, but they can wait. I hate to tease, so here is a draft of the cleanup patches. Most of this stuff is cleanup, but the first line of

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-12-01 Thread Paul Eggert
On 11/30/2010 04:19 PM, Chen Guo wrote: > could you detail how you can trigger the divide-by-zero bug? Invoke MAX_MERGE(total, level) with level == 15. 2 << level yields 65536, and 65536 * 65536 overflows to zero.
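The arithmetic in this reply can be checked with a small sketch. It uses explicit 32-bit unsigned arithmetic so that the wraparound is well defined; the MAX_MERGE macro in sort.c is not reproduced here, and the function name `merge_divisor_32` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration: compute (2 << level) * (2 << level) in
   32-bit unsigned arithmetic, where overflow wraps modulo 2^32.  */
uint32_t
merge_divisor_32 (unsigned int level)
{
  assert (level < 31);              /* Keep the shift itself in range.  */
  uint32_t factor = (uint32_t) 2 << level;
  return factor * factor;           /* Result is reduced modulo 2^32.  */
}
```

With level 15 the factor is 65536 and the product 2^32 wraps to 0, so using the result as a divisor would divide by zero. With level 14 the product is 2^30, which still fits.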

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-12-01 Thread Paul Eggert
On 11/29/2010 08:32 PM, Chen Guo wrote: > Hi guys, > Is something up with Savannah? I just tried a git clone and got > connection time out; I cant even reach git.sv.gnu.org via ping. There was a breakin, which led to leaking of encrypted account passwords, some of them discovered via a brute-f

[coreutils] [PATCH] sort: fix bug on 64-bit hosts with at least 32768 processors

2010-12-01 Thread Paul Eggert
On 11/30/2010 10:16 PM, Paul Eggert wrote: > Invoke MAX_MERGE(total, level) with level == 15. > 2 << level yields 65536, and 65536 * 65536 overflows to zero. I managed to reproduce this bug on a (faked) host with 32768 processors, using a command like this: seq 10 | sor

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-12-02 Thread Paul Eggert
On 12/02/10 02:22, Chen Guo wrote: > On Mon, Nov 29, 2010 at 11:16 AM, Paul Eggert wrote: >> (for i in $(seq 12); do read line; echo $i; sleep .1; done >> cat > /dev/null) < fifo & >> (ulimit -t 1; ./sort in > fifo \ >> || echo killed via $(env kill

[coreutils] [PATCH] tests: cleanup rm -rf fails under NFS

2010-12-03 Thread Paul Eggert
(I pushed this.) This problem was observed on RHEL 5.5 x86-64 when running as a client of a NetApp FAS2050. * tests/cp/cp-mv-backup: Don't leave a file descriptor open to a file in a directory that will be cleaned up with "rm -rf". Under NFS, when the rm unlinks that file, it is instead renamed to

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-12-03 Thread Paul Eggert
On 12/03/10 12:18, Chen Guo wrote: > I'll try out Professor Eggert's suggestion, of switching to mutexes > only at the top level merge. I'm having second thoughts about that. Yes, that'll prevent the top-level merge (which is generating the actual output) from chewing up CPU time. But it already

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-12-04 Thread Paul Eggert
On 11/29/2010 02:46 PM, Paul Eggert wrote: > My current guess, by the way, > is that it's not a bug that can be triggered: it's merely > useless code that is harmless and can safely be removed. I removed it as part of the following series of cleanup patches. These are intended

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-12-05 Thread Paul Eggert
On 12/05/2010 09:16 PM, Chen Guo wrote: > Before saying anything else, I should note that for mutexes, on 4 > threads 20% of the time there's a segfault on a seemingly innocuous > line in queue_insert (): > node->queued = true It does sound like mutexes are the way to go, and that this bug needs

Re: bug#7489: [coreutils] over aggressive threads in sort

2010-12-06 Thread Paul Eggert
On 12/05/10 03:21, Jim Meyering wrote: > seq -w 20 > exp && tac exp > in > PATH=.:$PATH ./sort --compress-program=dzip -S 1k in > out > > That gets stuck in waitpid (from sort.c's reap), waiting for a > dzip invocation that appears will never terminate. This is also > on that same 4-core

[coreutils] Re: multi-threaded sort can segfault (unrelated to the sort -u segfault)

2010-12-10 Thread Paul Eggert
On 12/09/10 03:31, Jim Meyering wrote: > The segfault (and other strangeness we've witnessed) > arises because each "node" struct is stored on the stack, > and its address ends up being used by another thread after > the thread that owns the stack in question has been "joined". Ah, of *course*! >

[coreutils] Re: bug#7597: multi-threaded sort can segfault (unrelated to the sort -u segfault)

2010-12-11 Thread Paul Eggert
ndex d52f677..b573061 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -656,7 +656,4 @@ pr_data = \ pr/ttb3-FF \ pr/w72l24f-ll -XFAIL_TESTS = \ - misc/sort-spinlock-abuse - include $(srcdir)/check.mk -- 1.7.2

[coreutils] draft [PATCH] sort: explicit --parallel=N now overrides environment

2010-12-11 Thread Paul Eggert
explicit --parallel=N flag? Something like the following, say? This would let the user override the environment in the command line, which is normally what people would expect. From bbc60da9222e38bb7983464cec35c42ad41f2eb8 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sat, 11 Dec 2010 01:
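
The precedence described in that draft (command line first, environment second) can be sketched as follows. This is a minimal illustration, not the patch itself: the function name, its parameters, and the convention that 0 means "option not given" are all assumptions, and the caller is assumed to pass in whatever environment value is relevant.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: choose a thread count.
   CLI_PARALLEL is the value of an explicit --parallel=N, or 0 if the
   option was not given.  ENV_VALUE is the text of an environment
   setting, or NULL if unset.  FALLBACK is used when neither applies.  */
static unsigned long
pick_thread_count (unsigned long cli_parallel, char const *env_value,
                   unsigned long fallback)
{
  /* An explicit command-line value always wins.  */
  if (cli_parallel != 0)
    return cli_parallel;

  /* Otherwise accept the environment value only if it is a plain
     positive decimal number.  */
  if (env_value && isdigit ((unsigned char) env_value[0]))
    {
      char *end;
      errno = 0;
      unsigned long n = strtoul (env_value, &end, 10);
      if (errno == 0 && *end == '\0' && n != 0)
        return n;
    }

  return fallback;
}
```

For example, pick_thread_count (4, "8", 2) yields 4, while pick_thread_count (0, "8", 2) yields 8.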

[coreutils] Re: bug#7597: multi-threaded sort can segfault (unrelated to the sort -u segfault)

2010-12-11 Thread Paul Eggert
Sorry for botching the NEWS and the change log. To help make amends, how about if I add a test case for that? I'm thinking of the 2nd test case in , namely this one: gensort -a 1 > gensort-10k for i in $(seq 2000); do prin

[coreutils] Re: bug#7597: multi-threaded sort can segfault (unrelated to the sort -u segfault)

2010-12-12 Thread Paul Eggert
lable, but falls back on seq+shuf if not. From 63d1b425976ccc0b89159d743e33eb5da634de3c Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sun, 12 Dec 2010 13:38:19 -0800 Subject: [PATCH] tests: test for access to stale thread memory * tests/misc/sort-stale-thread-mem: New tests. * tests/

[coreutils] Re: bug#7597: multi-threaded sort can segfault (unrelated to the sort -u segfault)

2010-12-13 Thread Paul Eggert
My recent patch had a typo in a comment, which I fixed as follows: From 7e9599422e85be01dfceecf1f38ff2c2952a3f61 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Mon, 13 Dec 2010 10:02:06 -0800 Subject: [PATCH] tests: typo fix * tests/misc/sort-stale-thread-mem: Fix typo in comment. --- te

[coreutils] [PATCH] sort: fix some --compress reaper bugs

2010-12-13 Thread Paul Eggert
I found and fixed some bugs and simplified some code while trying (and so far, failing) to fix the hang reported at the end of . I installed the following, and plan to look into that hang some more. * src/sort.c (uintptr): New
