Pádraig Brady wrote:
> This looks good. In the attached version I've tweaked the
> commit message, updated THANKS and changed the test to a+x
> (which is required for one of the checker scripts).
>
> cheers,
> Pádraig.
>
> From f7d704a7b17bfab3df91f57cef0c715e642f54af Mon Sep 17 00:00:00 2001
> From: Paul Eggert <egg...@cs.ucla.edu>
> Date: Fri, 13 Mar 2009 15:48:30 -0700
> Subject: [PATCH] sort: handle fd exhaustion better when merging
>
> This is an alternative to my 9 March patch labeled "Silently lower
> nmerge; don't (sometimes incorrectly) range-check"
> <http://lists.gnu.org/archive/html/bug-coreutils/2009-03/msg00070.html>.
> It differs by not using 'dup' to probe for extra file descriptors;
> instead, it simply calls 'open' (and 'pipe') to open files and pipes,
> until one of these calls fails due to file descriptor exhaustion; it
> then backs off by 1, does a merge with the files that it has opened,
> and then retries with the (now-smaller) number of files.
>
> This patch requires quite a few more changes to the source code than
> the earlier patch, but it is in some sense "better" because it doesn't
> need to call "dup" ahead of time in order to decide whether "open" or
> "pipe" will fail. Also, it's more robust in the case where "open" or
> "pipe" fails with errno==EMFILE because some system-wide limit is
> exhausted. It may be useful to apply all of this patch, along with
> the other patch's test case (which is somewhat different).
Thanks Pádraig,
And thanks to Paul and students.

I've taken Paul's advice from the log about adding the test from the
other patch (and then removed that sentence from the log).
I left the original patch intact for now. Of the following,

  Subject: [PATCH 2/4] tweak test
  Subject: [PATCH 3/4] tests: add another sort/nmerge test
  Subject: [PATCH 4/4] tests: adjust sort-continue not to fail under valgrind

I expect to merge 2/4 into the original and leave 3 and 4 separate.

Patch 3/4 is derived from the earlier patch. I removed the sort.c and
tests/misc/sort-merge changes, but kept the new test and documentation
improvements. Also, I've added the five names to THANKS and changed
the log to list Alexander Nguyen's last name.

I'll wait a day or so, in case there are comments.

-------------------

From e5bb3274c9f76fe47a6b2178ce1748dd7ff58b26 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Fri, 13 Mar 2009 15:48:30 -0700
Subject: [PATCH 1/4] sort: handle fd exhaustion better when merging

This is an alternative to my 9 March patch labeled "Silently lower
nmerge; don't (sometimes incorrectly) range-check"
<http://lists.gnu.org/archive/html/bug-coreutils/2009-03/msg00070.html>.
It differs by not using 'dup' to probe for extra file descriptors;
instead, it simply calls 'open' (and 'pipe') to open files and pipes,
until one of these calls fails due to file descriptor exhaustion; it
then backs off by 1, does a merge with the files that it has opened,
and then retries with the (now-smaller) number of files.

This patch requires quite a few more changes to the source code than
the earlier patch, but it is in some sense "better" because it doesn't
need to call "dup" ahead of time in order to decide whether "open" or
"pipe" will fail. Also, it's more robust in the case where "open" or
"pipe" fails with errno==EMFILE because some system-wide limit is
exhausted.

* src/sort.c (create_temp_file): New arg SURVIVE_FD_EXHAUSTION.
(stream_open): New function, containing guts of xfopen. (xfopen): Use it. (pipe_fork): Set errno on failure. (maybe_create_temp): New function, containing guts of create_temp. (create_temp): Use it. (open_temp): Distinguish failures due to file descriptor exhaustion from other failures, and on fd exhaustion return a notice to caller rather than dying. Don't test execlp's return value; when it returns, it *always* returns -1. (open_input_files): New function. (mergefps): New arg FPS. It's now the caller's responsibility to open the input and output files. All callers changed. (mergefiles): New function. (avoid_trashing_input, merge): Handle the case where a single merge can't merge as much as we wanted due to file descriptor exhaustion, by merging as much as we can and then retrying. * tests/Makefile.am (TESTS): Add misc/sort-continue. * tests/misc/sort-continue: New file. * THANKS: Add Glen Lenker and Matt Pham who coauthored this patch. --- THANKS | 2 + src/sort.c | 292 ++++++++++++++++++++++++++++++++++------------ tests/Makefile.am | 1 + tests/misc/sort-continue | 47 ++++++++ 4 files changed, 268 insertions(+), 74 deletions(-) create mode 100755 tests/misc/sort-continue diff --git a/THANKS b/THANKS index f894e1d..3b933ff 100644 --- a/THANKS +++ b/THANKS @@ -204,6 +204,7 @@ Geoff Whale geo...@cse.unsw.edu.au Gerald Pfeifer ger...@pfeifer.com Gerhard Poul gp...@gnu.org Germano Leichsenring germ...@jedi.cs.kobe-u.ac.jp +Glen Lenker glen.len...@gmail.com Göran Uddeborg goe...@uddeborg.se Guochun Shi g...@ncsa.uiuc.edu GOTO Masanori go...@debian.or.jp @@ -365,6 +366,7 @@ Mate Wierdl m...@moni.msci.memphis.edu Matej Vela mv...@public.srce.hr Matt Kraai kr...@ftbfs.org Matt Perry m...@primefactor.com +Matt Pham mattvp...@gmail.com Matt Schalit mscha...@pacbell.net Matt Swift sw...@alum.mit.edu Matthew Arnison maf...@cat.org.au diff --git a/src/sort.c b/src/sort.c index 7b0b064..ced0f2d 100644 --- a/src/sort.c +++ b/src/sort.c @@ -732,10 +732,13 @@ exit_cleanup (void) } 
/* Create a new temporary file, returning its newly allocated tempnode. - Store into *PFD the file descriptor open for writing. */ + Store into *PFD the file descriptor open for writing. + If the creation fails, return NULL and store -1 into *PFD if the + failure is due to file descriptor exhaustion and + SURVIVE_FD_EXHAUSTION; otherwise, die. */ static struct tempnode * -create_temp_file (int *pfd) +create_temp_file (int *pfd, bool survive_fd_exhaustion) { static char const slashbase[] = "/sortXXXXXX"; static size_t temp_dir_index; @@ -768,8 +771,13 @@ create_temp_file (int *pfd) errno = saved_errno; if (fd < 0) - error (SORT_FAILURE, errno, _("cannot create temporary file in %s"), - quote (temp_dir)); + { + if (! (survive_fd_exhaustion && errno == EMFILE)) + error (SORT_FAILURE, errno, _("cannot create temporary file in %s"), + quote (temp_dir)); + free (node); + node = NULL; + } *pfd = fd; return node; @@ -779,27 +787,30 @@ create_temp_file (int *pfd) standard output; HOW should be "w". When opening for input, "-" means standard input. To avoid confusion, do not return file descriptors STDIN_FILENO, STDOUT_FILENO, or STDERR_FILENO when - opening an ordinary FILE. */ + opening an ordinary FILE. Return NULL if unsuccessful. */ static FILE * -xfopen (const char *file, const char *how) +stream_open (const char *file, const char *how) { - FILE *fp; - if (!file) - fp = stdout; - else if (STREQ (file, "-") && *how == 'r') + return stdout; + if (STREQ (file, "-") && *how == 'r') { have_read_stdin = true; - fp = stdin; - } - else - { - fp = fopen (file, how); - if (! fp) - die (_("open failed"), file); + return stdin; } + return fopen (file, how); +} + +/* Same as stream_open, except always return a non-null value; die on + failure. 
*/ +static FILE * +xfopen (const char *file, const char *how) + { + FILE *fp = stream_open (file, how); + if (!fp) + die (_("open failed"), file); return fp; } @@ -838,7 +849,8 @@ dup2_or_die (int oldfd, int newfd) /* Fork a child process for piping to and do common cleanup. The TRIES parameter tells us how many times to try to fork before - giving up. Return the PID of the child or -1 if fork failed. */ + giving up. Return the PID of the child, or -1 (setting errno) + on failure. */ static pid_t pipe_fork (int pipefds[2], size_t tries) @@ -881,8 +893,10 @@ pipe_fork (int pipefds[2], size_t tries) if (pid < 0) { + saved_errno = errno; close (pipefds[0]); close (pipefds[1]); + errno = saved_errno; } else if (pid == 0) { @@ -900,15 +914,22 @@ pipe_fork (int pipefds[2], size_t tries) } /* Create a temporary file and start a compression program to filter output - to that file. Set *PFP to the file handle and if *PPID is non-NULL, - set it to the PID of the newly-created process. */ + to that file. Set *PFP to the file handle and if PPID is non-NULL, + set *PPID to the PID of the newly-created process. If the creation + fails, return NULL if the failure is due to file descriptor + exhaustion and SURVIVE_FD_EXHAUSTION; otherwise, die. */ static char * -create_temp (FILE **pfp, pid_t *ppid) +maybe_create_temp (FILE **pfp, pid_t *ppid, bool survive_fd_exhaustion) { int tempfd; - struct tempnode *node = create_temp_file (&tempfd); - char *name = node->name; + struct tempnode *node = create_temp_file (&tempfd, survive_fd_exhaustion); + char *name; + + if (! node) + return NULL; + + name = node->name; if (compress_program) { @@ -949,48 +970,68 @@ create_temp (FILE **pfp, pid_t *ppid) return name; } +/* Create a temporary file and start a compression program to filter output + to that file. Set *PFP to the file handle and if *PPID is non-NULL, + set it to the PID of the newly-created process. Die on failure. 
*/ + +static char * +create_temp (FILE **pfp, pid_t *ppid) +{ + return maybe_create_temp (pfp, ppid, false); +} + /* Open a compressed temp file and start a decompression process through which to filter the input. PID must be the valid processes ID of the - process used to compress the file. */ + process used to compress the file. Return NULL (setting errno to + EMFILE) if we ran out of file descriptors, and die on any other + kind of failure. */ static FILE * open_temp (const char *name, pid_t pid) { int tempfd, pipefds[2]; - pid_t child_pid; - FILE *fp; + FILE *fp = NULL; wait_proc (pid); tempfd = open (name, O_RDONLY); if (tempfd < 0) - die (_("couldn't open temporary file"), name); + return NULL; - child_pid = pipe_fork (pipefds, MAX_FORK_TRIES_DECOMPRESS); - if (0 < child_pid) + switch (pipe_fork (pipefds, MAX_FORK_TRIES_DECOMPRESS)) { + case -1: + if (errno != EMFILE) + error (SORT_FAILURE, errno, _("couldn't create process for %s -d"), + compress_program); close (tempfd); - close (pipefds[1]); - } - else if (child_pid == 0) - { + errno = EMFILE; + break; + + case 0: close (pipefds[0]); dup2_or_die (tempfd, STDIN_FILENO); close (tempfd); dup2_or_die (pipefds[1], STDOUT_FILENO); close (pipefds[1]); - if (execlp (compress_program, compress_program, "-d", (char *) NULL) < 0) - error (SORT_FAILURE, errno, _("couldn't execute %s -d"), - compress_program); - } - else - error (SORT_FAILURE, errno, _("couldn't create process for %s -d"), - compress_program); + execlp (compress_program, compress_program, "-d", (char *) NULL); + error (SORT_FAILURE, errno, _("couldn't execute %s -d"), + compress_program); - fp = fdopen (pipefds[0], "r"); - if (! fp) - die (_("couldn't create temporary file"), name); + default: + close (tempfd); + close (pipefds[1]); + + fp = fdopen (pipefds[0], "r"); + if (! 
fp) + { + int saved_errno = errno; + close (pipefds[0]); + errno = saved_errno; + } + break; + } return fp; } @@ -2148,31 +2189,53 @@ check (char const *file_name, char checkonly) return ordered; } +/* Open FILES (there are NFILES of them) and store the resulting array + of stream pointers into (*PFPS). Allocate the array. Return the + number of successfully opened files, setting errno if this value is + less than NFILES. */ + +static size_t +open_input_files (struct sortfile *files, size_t nfiles, FILE ***pfps) +{ + FILE **fps = *pfps = xnmalloc (nfiles, sizeof *fps); + int i; + + /* Open as many input files as we can. */ + for (i = 0; i < nfiles; i++) + { + fps[i] = (files[i].pid + ? open_temp (files[i].name, files[i].pid) + : stream_open (files[i].name, "r")); + if (!fps[i]) + break; + } + + return i; +} + /* Merge lines from FILES onto OFP. NTEMPS is the number of temporary files (all of which are at the start of the FILES array), and NFILES is the number of files; 0 <= NTEMPS <= NFILES <= NMERGE. - Close input and output files before returning. + FPS is the vector of open stream corresponding to the files. + Close input and output streams before returning. OUTPUT_FILE gives the name of the output file. If it is NULL, - the output file is standard output. If OFP is NULL, the output - file has not been opened yet (or written to, if standard output). */ + the output file is standard output. */ static void mergefps (struct sortfile *files, size_t ntemps, size_t nfiles, - FILE *ofp, char const *output_file) + FILE *ofp, char const *output_file, FILE **fps) { - FILE **fps = xnmalloc (nmerge, sizeof *fps); - /* Input streams for each file. */ - struct buffer *buffer = xnmalloc (nmerge, sizeof *buffer); + struct buffer *buffer = xnmalloc (nfiles, sizeof *buffer); /* Input buffers for each file. */ struct line saved; /* Saved line storage for unique check. */ struct line const *savedline = NULL; /* &saved if there is a saved line. 
*/ size_t savealloc = 0; /* Size allocated for the saved line. */ - struct line const **cur = xnmalloc (nmerge, sizeof *cur); + struct line const **cur = xnmalloc (nfiles, sizeof *cur); /* Current line in each line table. */ - struct line const **base = xnmalloc (nmerge, sizeof *base); + struct line const **base = xnmalloc (nfiles, sizeof *base); /* Base of each line table. */ - size_t *ord = xnmalloc (nmerge, sizeof *ord); + size_t *ord = xnmalloc (nfiles, sizeof *ord); /* Table representing a permutation of fps, such that cur[ord[0]] is the smallest line and will be next output. */ @@ -2185,9 +2248,6 @@ mergefps (struct sortfile *files, size_t ntemps, size_t nfiles, /* Read initial lines from each input file. */ for (i = 0; i < nfiles; ) { - fps[i] = (files[i].pid - ? open_temp (files[i].name, files[i].pid) - : xfopen (files[i].name, "r")); initbuf (&buffer[i], sizeof (struct line), MAX (merge_buffer_size, sort_size / nfiles)); if (fillbuf (&buffer[i], fps[i], files[i].name)) @@ -2209,13 +2269,13 @@ mergefps (struct sortfile *files, size_t ntemps, size_t nfiles, free (buffer[i].buf); --nfiles; for (j = i; j < nfiles; ++j) - files[j] = files[j + 1]; + { + files[j] = files[j + 1]; + fps[j] = fps[j + 1]; + } } } - if (! ofp) - ofp = xfopen (output_file, "w"); - /* Set up the ord table according to comparisons among input lines. Since this only reorders two items if one is strictly greater than the other, it is stable. */ @@ -2353,6 +2413,28 @@ mergefps (struct sortfile *files, size_t ntemps, size_t nfiles, free(cur); } +/* Merge lines from FILES onto OFP. NTEMPS is the number of temporary + files (all of which are at the start of the FILES array), and + NFILES is the number of files; 0 <= NTEMPS <= NFILES <= NMERGE. + Close input and output files before returning. + OUTPUT_FILE gives the name of the output file. + + Return the number of files successfully merged. 
This number can be + less than NFILES if we ran low on file descriptors, but in this + case it is never less than 2. */ + +static size_t +mergefiles (struct sortfile *files, size_t ntemps, size_t nfiles, + FILE *ofp, char const *output_file) +{ + FILE **fps; + size_t nopened = open_input_files (files, nfiles, &fps); + if (nopened < nfiles && nopened < 2) + die (_("open failed"), files[nopened].name); + mergefps (files, ntemps, nopened, ofp, output_file, fps); + return nopened; +} + /* Merge into T the two sorted arrays of lines LO (with NLO members) and HI (with NHI members). T, LO, and HI point just past their respective arrays, and the arrays are in reverse order. NLO and @@ -2519,10 +2601,19 @@ avoid_trashing_input (struct sortfile *files, size_t ntemps, FILE *tftp; pid_t pid; char *temp = create_temp (&tftp, &pid); - mergefps (&files[i],0, nfiles - i, tftp, temp); - files[i].name = temp; - files[i].pid = pid; - return i + 1; + size_t num_merged = 0; + while (i + num_merged < nfiles) + { + num_merged += mergefiles (&files[i], 0, nfiles - i, tftp, temp); + files[i].name = temp; + files[i].pid = pid; + + memmove(&files[i], &files[i + num_merged], + num_merged * sizeof *files); + ntemps += 1; + nfiles -= num_merged - 1;; + i += num_merged; + } } } @@ -2553,17 +2644,20 @@ merge (struct sortfile *files, size_t ntemps, size_t nfiles, /* Number of easily-available slots at the next loop iteration. */ size_t cheap_slots; - /* Do as many NMERGE-size merges as possible. */ - for (out = in = 0; out < nfiles / nmerge; out++, in += nmerge) + /* Do as many NMERGE-size merges as possible. In the case that + nmerge is bogus, increment by the maximum number of file + descriptors allowed. 
*/ + for (out = in = 0; nmerge <= nfiles - in; out++) { FILE *tfp; pid_t pid; char *temp = create_temp (&tfp, &pid); - size_t nt = MIN (ntemps, nmerge); - ntemps -= nt; - mergefps (&files[in], nt, nmerge, tfp, temp); + size_t num_merged = mergefiles (&files[in], MIN (ntemps, nmerge), + nmerge, tfp, temp); + ntemps -= MIN (ntemps, num_merged); files[out].name = temp; files[out].pid = pid; + in += num_merged; } remainder = nfiles - in; @@ -2578,12 +2672,12 @@ merge (struct sortfile *files, size_t ntemps, size_t nfiles, FILE *tfp; pid_t pid; char *temp = create_temp (&tfp, &pid); - size_t nt = MIN (ntemps, nshortmerge); - ntemps -= nt; - mergefps (&files[in], nt, nshortmerge, tfp, temp); + size_t num_merged = mergefiles (&files[in], MIN (ntemps, nshortmerge), + nshortmerge, tfp, temp); + ntemps -= MIN (ntemps, num_merged); files[out].name = temp; files[out++].pid = pid; - in += nshortmerge; + in += num_merged; } /* Put the remaining input files into the last NMERGE-sized output @@ -2594,7 +2688,57 @@ merge (struct sortfile *files, size_t ntemps, size_t nfiles, } nfiles = avoid_trashing_input (files, ntemps, nfiles, output_file); - mergefps (files, ntemps, nfiles, NULL, output_file); + + /* We aren't guaranteed that this final mergefiles will work, therefore we + try to merge into the output, and then merge as much as we can into a + temp file if we can't. Repeat. */ + + for (;;) + { + /* Merge directly into the output file if possible. */ + FILE **fps; + size_t nopened = open_input_files (files, nfiles, &fps); + + if (nopened == nfiles) + { + FILE *ofp = stream_open (output_file, "w"); + if (ofp) + { + mergefps (files, ntemps, nfiles, ofp, output_file, fps); + break; + } + if (errno != EMFILE || nopened <= 2) + die (_("open failed"), output_file); + } + else if (nopened <= 2) + die (_("open failed"), files[nopened].name); + + /* We ran out of file descriptors. Close one of the input + files, to gain a file descriptor. 
Then create a temporary + file with our spare file descriptor. Retry if that failed + (e.g., some other process could open a file between the time + we closed and tried to create). */ + FILE *tfp; + pid_t pid; + char *temp; + do + { + nopened--; + xfclose (fps[nopened], files[nopened].name); + temp = maybe_create_temp (&tfp, &pid, ! (nopened <= 2)); + } + while (!temp); + + /* Merge into the newly allocated temporary. */ + mergefps (&files[0], MIN (ntemps, nopened), nopened, tfp, temp, fps); + ntemps -= MIN (ntemps, nopened); + files[0].name = temp; + files[0].pid = pid; + + memmove (&files[1], &files[nopened], (nfiles - nopened) * sizeof *files); + ntemps++; + nfiles -= nopened - 1; + } } /* Sort NFILES FILES onto OUTPUT_FILE. */ diff --git a/tests/Makefile.am b/tests/Makefile.am index 10be0c6..2fb01c4 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -201,6 +201,7 @@ TESTS = \ misc/shuf \ misc/sort \ misc/sort-compress \ + misc/sort-continue \ misc/sort-files0-from \ misc/sort-merge \ misc/sort-rand \ diff --git a/tests/misc/sort-continue b/tests/misc/sort-continue new file mode 100755 index 0000000..b94cb2a --- /dev/null +++ b/tests/misc/sort-continue @@ -0,0 +1,47 @@ +#!/bin/bash +# Tests for file descriptor exhaustion. + +# Copyright (C) 2009 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. 
+ +if test "$VERBOSE" = yes; then + set -x + sort --version +fi + +. $srcdir/test-lib.sh + +for i in `seq 31`; do + echo $i | tee -a in > __test.$i || framework_failure +done + +fail=0 + +( + ulimit -n 6 + exec 0</dev/null 3<&- 4<&- 5<&- + sort -n -m __test.* > out +) && +compare in out || { fail=1; echo "file descriptor exhaustion not handled" 1>&2; } + +echo "32" | tee -a in > in1 +( + ulimit -n 6 + exec 3<&- 4<&- 5<&- + cat in1 | sort -n -m __test.* - > out +) && +compare in out || { fail=1; echo "stdin not handled properly" 1>&2; } + +Exit $fail -- 1.6.2.rc1.285.gc5f54 >From dbc8690bc5cbba66737a8c88b4a643f3e6645948 Mon Sep 17 00:00:00 2001 From: Jim Meyering <meyer...@redhat.com> Date: Tue, 17 Mar 2009 18:17:09 +0100 Subject: [PATCH 2/4] tweak test --- tests/misc/sort-continue | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/tests/misc/sort-continue b/tests/misc/sort-continue index b94cb2a..b1031fa 100755 --- a/tests/misc/sort-continue +++ b/tests/misc/sort-continue @@ -23,7 +23,7 @@ fi . $srcdir/test-lib.sh -for i in `seq 31`; do +for i in $(seq 31); do echo $i | tee -a in > __test.$i || framework_failure done @@ -34,14 +34,14 @@ fail=0 exec 0</dev/null 3<&- 4<&- 5<&- sort -n -m __test.* > out ) && -compare in out || { fail=1; echo "file descriptor exhaustion not handled" 1>&2; } +compare in out || { fail=1; echo 'file descriptor exhaustion not handled' 1>&2; } -echo "32" | tee -a in > in1 +echo 32 | tee -a in > in1 ( ulimit -n 6 exec 3<&- 4<&- 5<&- cat in1 | sort -n -m __test.* - > out ) && -compare in out || { fail=1; echo "stdin not handled properly" 1>&2; } +compare in out || { fail=1; echo 'stdin not handled properly' 1>&2; } Exit $fail -- 1.6.2.rc1.285.gc5f54 >From 526a803a4cebfe097a517a91fdc7ac40cd2b0168 Mon Sep 17 00:00:00 2001 From: Paul Eggert <egg...@cs.ucla.edu> Date: Mon, 9 Mar 2009 14:56:13 -0700 Subject: [PATCH 3/4] tests: add another sort/nmerge test * tests/Makefile.am (TESTS): Add sort-merge-fdlimit. 
* tests/misc/sort-merge-fdlimit: New file. * doc/coreutils.texi (sort invocation): Document that we now silently lower nmerge if necessary. Patch by Paul Eggert, Nima Nikzad, Max Chang, Alexander Nguyen, Sahil Amoli, and Nick Graham. --- THANKS | 5 ++++ doc/coreutils.texi | 14 +++++++---- tests/Makefile.am | 1 + tests/misc/sort-merge-fdlimit | 54 +++++++++++++++++++++++++++++++++++++++++ 4 files changed, 69 insertions(+), 5 deletions(-) create mode 100755 tests/misc/sort-merge-fdlimit diff --git a/THANKS b/THANKS index 3b933ff..d4b2d61 100644 --- a/THANKS +++ b/THANKS @@ -22,6 +22,7 @@ Albert Hopkins ahopk...@dynacare.com Alberto Accomazzi albe...@cfa0.harvard.edu aldomel aldo...@ix.netcom.com Alen Muzinic zv...@fly.cc.fer.hr +Alexander Nguyen v...@seas.ucla.edu Alexander V. Lukyanov l...@netis.ru Allen Hewes al...@decisiv.net Axel Dörfler ax...@pinc-software.de @@ -380,6 +381,7 @@ Matthew Woehlke mw_tr...@users.sourceforge.net Matthias Urlichs sm...@noris.de Matti Aarnio matti.aar...@zmailer.org Mattias Wadenstein mas...@acc.umu.se +Max Chang maxch...@ucla.edu Meelis Roos mr...@tartu.cyber.ee Michael mich...@aplatform.com Michael ??? mich...@roka.net @@ -417,11 +419,13 @@ Neal H Walfield n...@cs.uml.edu Neil Brown ne...@cse.unsw.edu.au Nelson H. F. 
Beebe be...@math.utah.edu Nick Estes deb...@nickstoys.com +Nick Graham nick.d.gra...@gmail.com Nick Lawes nla...@silverplatter.com Nickolai Zeldovich nicko...@cs.stanford.edu Nicolas François nicolas.franc...@centraliens.net Niklas Edmundsson ni...@acc.umu.se Nikola Milutinovic nikola.milutino...@ev.co.yu +Nima Nikzad nnik...@ucla.edu Noah Friedman fried...@splode.com Noel Cragg n...@red-bean.com Norbert Kiesel nkie...@tbdnetworks.com @@ -494,6 +498,7 @@ Ross Alexander r.alexan...@auckland.ac.nz Ross Paterson r...@doc.ic.ac.uk Ross Ridge rri...@calum.csclub.uwaterloo.ca Rudolf Kastl rka...@redhat.com +Sahil Amoli sahilam...@gmail.com Sami Farin sfa...@ratol.fi Samuel Tardieu s...@rfc1149.net Samuel Thibault samuel.thiba...@ens-lyon.org diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 6f8c197..04db676 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -3937,13 +3937,17 @@ sort invocation requirements and I/0 at the expense of temporary storage consumption and merge performance. -The value of @var{nmerge} must be at least 2. +The value of @var{nmerge} must be at least 2. The default value is +currently 16, but this is implementation-dependent and may change in +the future. The value of @var{nmerge} may be bounded by a resource limit for open -file descriptors. Try @samp{ulimit -n} or @samp{getconf OPEN_MAX} to -to display the limit for a particular system. -If the value of @var{nmerge} exceeds this limit, then @command{sort} will -issue a warning to standard error and exit with a nonzero status. +file descriptors. The commands @samp{ulimit -n} or @samp{getconf +OPEN_MAX} may display limits for your systems; these limits may be +modified further if your program already has some files open, or if +the operating system has other limits on the number of open files. If +the value of @var{nmerge} exceeds the resource limit, @command{sort} +silently uses a smaller value. 
@item -o @var{output-file} @itemx --outp...@var{output-file} diff --git a/tests/Makefile.am b/tests/Makefile.am index 2fb01c4..3a15a87 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -204,6 +204,7 @@ TESTS = \ misc/sort-continue \ misc/sort-files0-from \ misc/sort-merge \ + misc/sort-merge-fdlimit \ misc/sort-rand \ misc/sort-version \ misc/split-a \ diff --git a/tests/misc/sort-merge-fdlimit b/tests/misc/sort-merge-fdlimit new file mode 100755 index 0000000..82305be --- /dev/null +++ b/tests/misc/sort-merge-fdlimit @@ -0,0 +1,54 @@ +#!/bin/sh +# Test whether sort avoids opening more file descriptors than it is +# allowed when merging files. + +# Copyright (C) 2009 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +if test "$VERBOSE" = yes; then + set -x + sort --version +fi + +. $srcdir/test-lib.sh +require_ulimit_ + +mkdir in err || framework_failure + +fail=0 + +for i in `seq 17`; do + echo $i >in/$i +done + +# When these tests are run inside the automated testing framework, they +# have one less available file descriptor than when run outside the +# automated testing framework. If a test with a batch size of b fails +# inside the ATF, then the same test with batch size b+1 may pass outside +# the ATF but fail inside it. + +# The default batch size (nmerge) is 16. 
+(ulimit -n 19 \ + && sort -m --batch-size=16 in/* 2>err/merge-default-err \ + || ! grep "open failed" err/merge-default-err) || fail=1 + +# If sort opens a file (/dev/urandom) to sort by random hashes of keys, +# it needs to consider this file against its limit on open file +# descriptors. +(ulimit -n 20 \ + && sort -mR --batch-size=16 in/* 2>err/merge-random-err \ + || ! grep "open failed" err/merge-random-err) || fail=1 + +Exit $fail -- 1.6.2.rc1.285.gc5f54 >From 4456212bceb21e1b22c750ea6d9beabb6e82b6bc Mon Sep 17 00:00:00 2001 From: Jim Meyering <meyer...@redhat.com> Date: Tue, 17 Mar 2009 21:23:51 +0100 Subject: [PATCH 4/4] tests: adjust sort-continue not to fail under valgrind * tests/misc/sort-continue: Don't run cat inside fd-limited shell. If sort fails to run in an fd-limited shell, skip the test. --- tests/misc/sort-continue | 6 +++++- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/tests/misc/sort-continue b/tests/misc/sort-continue index b1031fa..1b0ef43 100755 --- a/tests/misc/sort-continue +++ b/tests/misc/sort-continue @@ -23,6 +23,10 @@ fi . $srcdir/test-lib.sh +# Skip the test when running under valgrind. +( ulimit -n 6; sort < /dev/null ) \ + || skip_test_ 'fd-limited sort failed; are you running under valgrind?' + for i in $(seq 31); do echo $i | tee -a in > __test.$i || framework_failure done @@ -40,7 +44,7 @@ echo 32 | tee -a in > in1 ( ulimit -n 6 exec 3<&- 4<&- 5<&- - cat in1 | sort -n -m __test.* - > out + sort -n -m __test.* - < in1 > out ) && compare in out || { fail=1; echo 'stdin not handled properly' 1>&2; } -- 1.6.2.rc1.285.gc5f54 _______________________________________________ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils