[coreutils-announce] coreutils-8.21 released [stable]
This is to announce coreutils-8.21, a stable release.

There have been 121 commits by 18 people in the 16 weeks since 8.20.

Executive summary:
8.21 is mainly a bug fix release, including fixes for recent
regressions in cp, factor and seq. cut has received fixes for many
long-standing issues. df is updated to better handle newer systems
that link /etc/mtab to /proc/mounts, and also provides a new --output
option to control which fields to display. A new numfmt utility was
included to provide various number formatting and conversion functions.

See the NEWS below for a brief summary.

Thanks to everyone who has contributed!
The following people contributed changes to this release:

  Assaf Gordon (7)
  Benno Schulenberg (6)
  Bernhard Voelker (24)
  Cojocaru Alexandru (2)
  Colin Watson (1)
  Daniel Schepler (1)
  Jakob Truelsen (1)
  Jim Meyering (13)
  Karl Berry (2)
  Mike Frysinger (2)
  Ondrej Oprala (2)
  Ondřej Vašík (1)
  Paul Eggert (12)
  Pádraig Brady (46)
  Stefano Lattarini (1)
  Stephan Krempel (1)
  Zartaj Majeed (1)
  Ángel González (1)

Pádraig [on behalf of the coreutils maintainers]

==

Here is the GNU coreutils home page:
    http://gnu.org/s/coreutils/

For a summary of changes and contributors, see:
    http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=shortlog;h=v8.21
or run this command from a git-cloned coreutils directory:
    git shortlog v8.20..v8.21

To summarize the 173 gnulib-related changes, run these commands
from a git-cloned coreutils directory:
    git checkout v8.21
    git submodule summary v8.20

==

Here are the compressed sources and a GPG detached signature[*]:
    http://ftp.gnu.org/gnu/coreutils/coreutils-8.21.tar.xz
    http://ftp.gnu.org/gnu/coreutils/coreutils-8.21.tar.xz.sig

Use a mirror for higher download bandwidth:
    http://ftpmirror.gnu.org/coreutils/coreutils-8.21.tar.xz
    http://ftpmirror.gnu.org/coreutils/coreutils-8.21.tar.xz.sig

[*] Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact. First, be sure to download both the .sig file
and the corresponding tarball. Then, run a command like this:

    gpg --verify coreutils-8.21.tar.xz.sig

If that command fails because you don't have the required public key,
then run this command to import it:

    gpg --keyserver keys.gnupg.net --recv-keys DF6FD971306037D9

and rerun the 'gpg --verify' command.

This release was bootstrapped with the following tools:
    Autoconf 2.68
    Automake 1.11.6
    Gnulib v0.0-7848-g4a82904
    Bison 2.4.3

NEWS

* Noteworthy changes in release 8.21 (2013-02-14) [stable]

** New programs

  numfmt: reformat numbers

** New features

  df now accepts the --output[=FIELD_LIST] option to define the list of
  columns to include in the output, or all available columns if the
  FIELD_LIST is omitted. Note this enables df to output both block and
  inode fields together.

  du now accepts the --threshold=SIZE option to restrict the output to
  entries with such a minimum SIZE (or a maximum SIZE if it is negative).
  du recognizes -t SIZE as equivalent, for compatibility with FreeBSD.

** Bug fixes

  cp --no-preserve=mode now no longer exits non-zero.
  [bug introduced in coreutils-8.20]

  cut with a range like N- no longer allocates N/8 bytes. That buffer
  would never be used, and allocation failure could cause cut to fail.
  [bug introduced in coreutils-8.10]

  cut no longer accepts the invalid range 0-, which made it print empty
  lines. Instead, cut now fails and emits an appropriate diagnostic.
  [This bug was present in the beginning.]

  cut now handles overlapping to-EOL ranges properly. Before, it would
  interpret -b2-,3- like -b3-. Now it's treated like -b2-.
  [This bug was present in the beginning.]

  cut no longer prints extraneous delimiters when a to-EOL range subsumes
  another range. Before, echo 123|cut --output-delim=: -b2-,3 would print
  2:3. Now it prints 23. [bug introduced in 5.3.0]

  cut -f no longer inspects input line N+1 before fully outputting line N,
  which avoids delayed output for intermittent input.
  [bug introduced in TEXTUTILS-1_8b]

  factor no longer loops infinitely on 32 bit powerpc or sparc systems.
  [bug introduced in coreutils-8.20]

  install -m M SOURCE DEST no longer has a race condition where DEST's
  permissions are temporarily derived from SOURCE instead of from M.

  pr -n no longer crashes when passed values >= 32. Also, line numbers are
  consistently padded with spaces, rather than with zeros for certain
  widths. [bug introduced in TEXTUTILS-1_22i]

  seq -w ensures that for numbers input in scientific notation,
  the output numbers are properly aligned and of the correct width.
  [This bug was present in the beginning.]

  seq -w ensures correct alignment when the step value includes a precision
  while the start value does not, and the number sequence
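
A quick way to check a few of the NEWS items above on an installed system is a short script like the one below. This is only a sketch: it assumes GNU coreutils 8.21 or later is first on PATH, and the temporary directory and variable names are made up for the illustration. The df and du output depends on the host, so only the cut and numfmt results are captured in variables.

```shell
#!/bin/sh
# Sketch only: exercises some behaviours listed in the 8.21 NEWS.
# Assumption: GNU coreutils >= 8.21 is first on PATH.

# cut: a to-EOL range that subsumes another range prints no stray delimiter.
subsumed=$(echo 123 | cut --output-delimiter=: -b2-,3)
echo "cut -b2-,3  -> $subsumed"

# cut: overlapping to-EOL ranges are treated like the widest one (-b2-).
overlap=$(echo 12345 | cut -b2-,3-)
echo "cut -b2-,3- -> $overlap"

# cut: the invalid range 0- is rejected with a diagnostic and non-zero status.
if echo abc | cut -b0- >/dev/null 2>&1; then
  zero_range=accepted
else
  zero_range=rejected
fi
echo "cut -b0-    -> $zero_range"

# numfmt: convert a plain number to an IEC-suffixed one.
human=$(numfmt --to=iec 2048)
echo "numfmt      -> $human"

# df: block and inode fields in one report (values depend on the host).
df --output=source,size,itotal,target / || echo "df report not available here"

# du: a negative threshold keeps only entries of at most that size.
tmp=$(mktemp -d)                 # hypothetical scratch directory
printf 'hello\n' > "$tmp/small"
du -a --threshold=-1M "$tmp"
rm -rf "$tmp"
```

On a pre-8.21 cut the first line would show 2:3 rather than 23, which makes the script a handy before/after comparison when upgrading.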
[PATCH] join: Add -z option
Hello,

This patch adds -z to join, supporting joining zero-terminated lines.

The patch is heavily based on James Youngman's patch of adding -z to uniq
(commit e062524).

-gordon

P.S.
This patch is independent of the key-comparison patches discussed recently,
though I'm also adding it there.

From 525eb72b150ed34d3bfcfe453d1494fe28a824b7 Mon Sep 17 00:00:00 2001
From: Assaf Gordon assafgor...@gmail.com
Date: Thu, 14 Feb 2013 15:29:08 -0500
Subject: [PATCH] join: Add -z option

* NEWS: Mention join's new option: --zero-terminated (-z).
* src/join.c: Add new option, --zero-terminated (-z), to make
join use the NUL byte as separator/delimiter rather than newline.
(get_line): Use readlinebuffer_delim in place of readlinebuffer.
(main): Handle the new option.
(usage): Describe new option the same way sort does.
* doc/coreutils.texi (join invocation): Describe the new option.
* tests/misc/join.pl: add tests for -z option.
---
 NEWS               |  6 ++
 doc/coreutils.texi | 17 +
 src/join.c         | 19 +++
 tests/misc/join.pl | 20
 4 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 37bcdf7..618c1da 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,12 @@ GNU coreutils NEWS                                    -*- outline -*-

 * Noteworthy changes in release ?.? (????-??-??) [?]

+** New features
+
+  join accepts a new option: --zero-terminated (-z). As with the sort,uniq
+  option of the same name, this makes join consume and produce NUL-terminated
+  lines rather than newline-terminated lines.
+

 * Noteworthy changes in release 8.21 (2013-02-14) [stable]

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 2c16dc4..a72d9ce 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -6059,6 +6059,10 @@ available; the sort order can be any order that considers two
 fields to be equal if and only if the sort comparison described above
 considers them to be equal.  For example:

+Input and output lines are terminated with a newline character unless the
+@option{--zero-terminated} (@option{-z}) is used, in which case lines are
+@sc{nul} terminated.
+
 @example
 $ cat file1
 a a1
@@ -6181,6 +6185,19 @@ character is used to delimit the fields.
 Print a line for each unpairable line in file @var{file-number}
 (either @samp{1} or @samp{2}), instead of the normal output.

+@item -z
+@itemx --zero-terminated
+@opindex -z
+@opindex --zero-terminated
+@cindex join zero-terminated lines
+Treat the input as a set of lines, each terminated by a null character
+(ASCII @sc{nul}) instead of a line feed
+(ASCII @sc{lf}).
+This option can be useful in conjunction with @samp{sort -z}, @samp{uniq -z},
+@samp{perl -0} or @samp{find -print0} and @samp{xargs -0} which do the same in
+order to reliably handle arbitrary file names (even those containing blanks
+or other special characters).
+
 @end table

 @exitstatus
diff --git a/src/join.c b/src/join.c
index 11e647c..1810ac2 100644
--- a/src/join.c
+++ b/src/join.c
@@ -161,6 +161,7 @@ static struct option const longopts[] =
   {"ignore-case", no_argument, NULL, 'i'},
   {"check-order", no_argument, NULL, CHECK_ORDER_OPTION},
   {"nocheck-order", no_argument, NULL, NOCHECK_ORDER_OPTION},
+  {"zero-terminated", no_argument, NULL, 'z'},
   {"header", no_argument, NULL, HEADER_LINE_OPTION},
   {GETOPT_HELP_OPTION_DECL},
   {GETOPT_VERSION_OPTION_DECL},
@@ -177,6 +178,9 @@ static bool ignore_case;
    join them without checking for ordering */
 static bool join_header_lines;

+/* The character marking end of line. Default to \n. */
+static char eolchar = '\n';
+
 void
 usage (int status)
 {
@@ -213,6 +217,9 @@ by whitespace.  When FILE1 or FILE2 (not both) is -, read standard input.\n\
   --header          treat the first line in each file as field headers,\n\
                     print them without trying to pair them\n\
 "), stdout);
+  fputs (_("\
+  -z, --zero-terminated     end lines with 0 byte, not newline\n\
+"), stdout);
   fputs (HELP_OPTION_DESCRIPTION, stdout);
   fputs (VERSION_OPTION_DESCRIPTION, stdout);
   fputs (_("\
@@ -445,7 +452,7 @@ get_line (FILE *fp, struct line **linep, int which)
   else
     line = init_linep (linep);

-  if (! readlinebuffer (&line->buf, fp))
+  if (! readlinebuffer_delim (&line->buf, fp, eolchar))
     {
       if (ferror (fp))
         error (EXIT_FAILURE, errno, _("read error"));
@@ -614,7 +621,7 @@ prjoin (struct line const *line1, struct line const *line2)
             break;
           putchar (output_separator);
         }
-      putchar ('\n');
+      putchar (eolchar);
     }
   else
     {
@@ -636,7 +643,7 @@ prjoin (struct line const *line1, struct line const *line2)
       prfields (line1, join_field_1, autocount_1);
       prfields (line2, join_field_2, autocount_2);
-      putchar ('\n');
+      putchar (eolchar);
     }
 }
@@ -1017,7 +1024,7 @@ main (int argc, char **argv)
   issued_disorder_warning[0] =
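
For anyone wanting to try the option, something along these lines should show the intended behaviour. It is a sketch, not part of the patch or its test suite: it assumes a join binary built with the -z patch applied, and the scratch files f1 and f2 are invented for the example. The NUL terminators are turned back into newlines with tr only so that the result is readable on a terminal.

```shell
#!/bin/sh
# Sketch only. Assumption: "join" on PATH accepts -z/--zero-terminated,
# i.e. it was built with the patch above applied.
tmp=$(mktemp -d)                      # hypothetical scratch directory

# Two inputs whose records end in NUL rather than newline,
# already sorted on the join field (field 1).
printf 'a 1\0b 2\0' > "$tmp/f1"
printf 'a x\0b y\0' > "$tmp/f2"

# join reads and writes NUL-terminated records; tr makes them printable.
joined=$(join -z "$tmp/f1" "$tmp/f2" | tr '\0' '\n')
printf '%s\n' "$joined"

rm -rf "$tmp"
```

In a real pipeline the inputs would more likely come from find -print0 or sort -z, as the texinfo text in the patch suggests.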
sort/uniq/join: key-comparison code consolidation
Hello,

(New thread for the previous topic:
http://lists.gnu.org/archive/html/coreutils/2013-02/msg00082.html )

The attached patch contains:

1. src/key-spec-parsing.{h,c} - key comparison code, previously in sort.c

2. uniq - now supports --key (multiple keys, too).
   Same as before, but rebased against 8.21.
   Supported orders:
     -k1,1  = ascii
     -k1b,1 = ignore-blanks
     -k1d,1 = dictionary
     -k1i,1 = non-printing
     -k1f,1 = ignore-case
     -k1n,1 = fast-numeric
     -k1g,1 = general-numeric
     -k1M,1 = month
   Also supports a user-specified delimiter (default: white-space).
   Related discussions:
     http://debbugs.gnu.org/cgi/bugreport.cgi?bug=5832
     http://debbugs.gnu.org/cgi/bugreport.cgi?bug=7068
     http://lists.gnu.org/archive/html/bug-coreutils/2006-06/msg00211.html

3. sort - same functionality as before, but key-comparison code extracted
   to a different file.

4. join - internally uses the key-comparison code.
   Does not support the --key parameter (uses the standard -j/-1/-2),
   but accepts new arguments that affect joining order:
     -r --reverse
     -n --numeric-sort
     -d --dictionary-order
     -g --general-numeric
   Related discussions:
     http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6903
     http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6366

   As an option, perhaps we can support new -k that will be like -j but
   allow specificity options (e.g. -k1nr will be equivalent to
   -j 1 --numeric --reverse).

It'll be easy to add human-numeric-sort/version-sort to join/uniq,
but I'm not sure if they make sense.

Regards,
-gordon

key_compare7.patch.xz
Description: application/xz
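
To illustrate why the ordering flag in a key spec matters, here is a small sketch using only sort, whose -k syntax already exists today. The uniq --key and join -n/-r/-d/-g options described above are proposals from the attached patch, so they are deliberately not invoked here. The sample data and variable names are made up; LC_ALL=C is set so the byte-wise order is predictable.

```shell
#!/bin/sh
# Sketch only: the same input under three of the key orderings listed above.
# Uses stock GNU sort; the proposed uniq/join options are not exercised.
LC_ALL=C
export LC_ALL

# -k1,1: plain byte comparison, so "10" sorts before "2".
ascii=$(printf '10\n9\n2\n' | sort -k1,1 | paste -sd' ' -)
echo "ascii:   $ascii"

# -k1n,1: numeric comparison of the same key.
numeric=$(printf '10\n9\n2\n' | sort -k1n,1 | paste -sd' ' -)
echo "numeric: $numeric"

# -k1M,1: month-name comparison.
month=$(printf 'Mar\nJan\nFeb\n' | sort -k1M,1 | paste -sd' ' -)
echo "month:   $month"
```

Since join expects its inputs ordered the same way it compares keys, input sorted with -k1n,1 is not in the byte order that an unpatched join assumes, which is the gap the new join ordering arguments are meant to close.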
Re: [PATCH] join: Add -z option
On 02/14/2013 08:51 PM, Assaf Gordon wrote:
> Hello,
>
> This patch adds -z to join, supporting joining zero-terminated lines.
>
> The patch is heavily based on James Youngman's patch of adding -z to uniq
> (commit e062524).
>
> -gordon
>
> P.S.
> This patch is independent of the key-comparison patches discussed recently,
> though I'm also adding it there.

This makes sense under the general theme
of consolidating sort,uniq,join options.

thanks,
Pádraig.