[coreutils-announce] coreutils-8.21 released [stable]

2013-02-14 Thread Pádraig Brady

This is to announce coreutils-8.21, a stable release.

There have been 121 commits by 18 people in the 16 weeks since 8.20.

Executive summary: 8.21 is mainly a bug fix release, including fixes
for recent regressions in cp, factor and seq.  cut has received fixes for
many long standing issues.  df is updated to better handle newer systems
that link /etc/mtab to /proc/mounts, and also provides a new --output
option to control which fields to display.  A new numfmt utility was
included to provide various number formatting and conversion functions.

See the NEWS below for a brief summary.

Thanks to everyone who has contributed!
The following people contributed changes to this release:

  Assaf Gordon (7):
  Benno Schulenberg (6):
  Bernhard Voelker (24):
  Cojocaru Alexandru (2):
  Colin Watson (1):
  Daniel Schepler (1):
  Jakob Truelsen (1):
  Jim Meyering (13):
  Karl Berry (2):
  Mike Frysinger (2):
  Ondrej Oprala (2):
  Ondřej Vašík (1):
  Paul Eggert (12):
  Pádraig Brady (46):
  Stefano Lattarini (1):
  Stephan Krempel (1):
  Zartaj Majeed (1):
  Ángel González (1):

Pádraig [on behalf of the coreutils maintainers]

==

Here is the GNU coreutils home page:
http://gnu.org/s/coreutils/

For a summary of changes and contributors, see:
  http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=shortlog;h=v8.21
or run this command from a git-cloned coreutils directory:
  git shortlog v8.20..v8.21

To summarize the 173 gnulib-related changes, run these commands
from a git-cloned coreutils directory:
  git checkout v8.21
  git submodule summary v8.20

==

Here are the compressed sources and a GPG detached signature[*]:
  http://ftp.gnu.org/gnu/coreutils/coreutils-8.21.tar.xz
  http://ftp.gnu.org/gnu/coreutils/coreutils-8.21.tar.xz.sig

Use a mirror for higher download bandwidth:
  http://ftpmirror.gnu.org/coreutils/coreutils-8.21.tar.xz
  http://ftpmirror.gnu.org/coreutils/coreutils-8.21.tar.xz.sig

[*] Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact.  First, be sure to download both the .sig file
and the corresponding tarball.  Then, run a command like this:

  gpg --verify coreutils-8.21.tar.xz.sig

If that command fails because you don't have the required public key,
then run this command to import it:

  gpg --keyserver keys.gnupg.net --recv-keys DF6FD971306037D9

and rerun the 'gpg --verify' command.

This release was bootstrapped with the following tools:
  Autoconf 2.68
  Automake 1.11.6
  Gnulib v0.0-7848-g4a82904
  Bison 2.4.3

NEWS

* Noteworthy changes in release 8.21 (2013-02-14) [stable]

** New programs

  numfmt: reformat numbers
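
A minimal sketch of the kind of conversion numfmt performs. The --to and --from options used here come from the numfmt documentation, not from this announcement, so treat them as an assumption about the installed version:

```shell
# Force the C locale so the decimal point is always '.'.
export LC_ALL=C

# Scale a plain number to IEC (power-of-1024) units.
to_iec=$(numfmt --to=iec 2048)

# Expand an IEC-suffixed value back to a plain number.
from_iec=$(numfmt --from=iec 1K)

echo "$to_iec $from_iec"
```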

** New features

  df now accepts the --output[=FIELD_LIST] option to define the list of columns
  to include in the output, or all available columns if the FIELD_LIST is
  omitted.  Note this enables df to output both block and inode fields together.

  du now accepts the --threshold=SIZE option to restrict the output to entries
  with such a minimum SIZE (or a maximum SIZE if it is negative).
  du recognizes -t SIZE as equivalent, for compatibility with FreeBSD.
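
A rough illustration of both options, assuming a df and du from 8.21 or later. The scratch directory, the file names, and the df field names (source, itotal, iavail, size, avail, taken from the df manual) are choices made for this sketch, not part of the announcement:

```shell
export LC_ALL=C

# Hypothetical scratch data: one 1 MiB file and one 1-byte file.
tmp=$(mktemp -d)
head -c 1048576 /dev/zero > "$tmp/big"
printf 'x' > "$tmp/small"

# Inode and block columns together in a single report.
df --output=source,itotal,iavail,size,avail "$tmp" \
  || echo "df could not report on $tmp"

# Entries of at least 512 KiB.  --apparent-size keeps the result
# independent of how the file system allocates blocks.
at_least=$(du -a --apparent-size --threshold=512K "$tmp" | cut -f2)

# A negative SIZE turns the threshold into a maximum.
at_most=$(du -a --apparent-size --threshold=-512K "$tmp" | cut -f2)

printf '%s\n' "$at_least" "$at_most"
rm -rf "$tmp"
```

With the positive threshold only the large file and the directory itself should be listed; with the negative one, only the small file.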

** Bug fixes

  cp --no-preserve=mode now no longer exits non-zero.
  [bug introduced in coreutils-8.20]

  cut with a range like N- no longer allocates N/8 bytes.  That buffer
  would never be used, and allocation failure could cause cut to fail.
  [bug introduced in coreutils-8.10]

  cut no longer accepts the invalid range 0-, which made it print empty lines.
  Instead, cut now fails and emits an appropriate diagnostic.
  [This bug was present in the beginning.]

  cut now handles overlapping to-EOL ranges properly.  Before, it would
  interpret -b2-,3- like -b3-.  Now it's treated like -b2-.
  [This bug was present in the beginning.]

  cut no longer prints extraneous delimiters when a to-EOL range subsumes
  another range.  Before, echo 123|cut --output-delim=: -b2-,3 would print
  2:3.  Now it prints 23.  [bug introduced in 5.3.0]
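
The range fixes above can be checked with a few one-liners. This is only a sketch, assuming a cut from 8.21 or later; the variable names are illustrative:

```shell
export LC_ALL=C

# A to-EOL range that subsumes another range: no stray delimiter.
subsumed=$(echo 123 | cut --output-delimiter=: -b2-,3)

# Overlapping to-EOL ranges: the earliest start wins, as with -b2-.
overlap=$(echo 12345 | cut -b2-,3-)

# The invalid range 0- is rejected with a non-zero exit status.
if echo 123 | cut -b0- >/dev/null 2>&1; then
  zero_range=accepted
else
  zero_range=rejected
fi

echo "$subsumed $overlap $zero_range"
```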

  cut -f no longer inspects input line N+1 before fully outputting line N,
  which avoids delayed output for intermittent input.
  [bug introduced in TEXTUTILS-1_8b]

  factor no longer loops infinitely on 32 bit powerpc or sparc systems.
  [bug introduced in coreutils-8.20]

  install -m M SOURCE DEST no longer has a race condition where DEST's
  permissions are temporarily derived from SOURCE instead of from M.

  pr -n no longer crashes when passed values >= 32.  Also, line numbers are
  consistently padded with spaces, rather than with zeros for certain widths.
  [bug introduced in TEXTUTILS-1_22i]

  seq -w ensures that for numbers input in scientific notation,
  the output numbers are properly aligned and of the correct width.
  [This bug was present in the beginning.]

  seq -w ensures correct alignment when the step value includes a precision
  while the start value does not, and the number sequence 

[PATCH] join: Add -z option

2013-02-14 Thread Assaf Gordon
Hello,

This patch adds -z to join, so that it can join zero-terminated lines.
The patch is heavily based on James Youngman's patch that added -z to uniq
(commit e062524).

-gordon

P.S.
This patch is independent of the key-comparison patches discussed recently, 
though I'm also adding it there.
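
A small sketch of the intended usage, assuming a join binary that includes the -z option proposed in this patch. The file names and contents are made up for illustration:

```shell
export LC_ALL=C

# Two hypothetical inputs whose records end in NUL instead of newline.
tmp=$(mktemp -d)
printf 'a 1\0b 2\0' > "$tmp/f1"
printf 'a x\0b y\0' > "$tmp/f2"

# join -z reads and writes NUL-terminated records; tr makes the
# result readable on a terminal.
joined=$(join -z "$tmp/f1" "$tmp/f2" | tr '\0' '\n')

printf '%s\n' "$joined"
rm -rf "$tmp"
```
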
From 525eb72b150ed34d3bfcfe453d1494fe28a824b7 Mon Sep 17 00:00:00 2001
From: Assaf Gordon assafgor...@gmail.com
Date: Thu, 14 Feb 2013 15:29:08 -0500
Subject: [PATCH] join: Add -z option

* NEWS: Mention join's new option: --zero-terminated (-z).
* src/join.c: Add new option, --zero-terminated (-z), to make
join use the NUL byte as separator/delimiter rather than newline.
(get_line): Use readlinebuffer_delim in place of readlinebuffer.
(main): Handle the new option.
(usage): Describe new option the same way sort does.
* doc/coreutils.texi (join invocation): Describe the new option.
* tests/misc/join.pl: add tests for -z option.
---
 NEWS               |    6 ++++++
 doc/coreutils.texi |   17 +++++++++++++++++
 src/join.c         |   19 +++++++++++++++----
 tests/misc/join.pl |   20 ++++++++++++++++++++
 4 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 37bcdf7..618c1da 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,12 @@ GNU coreutils NEWS-*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** New features
+
+  join accepts a new option: --zero-terminated (-z). As with the sort,uniq
+  option of the same name, this makes join consume and produce NUL-terminated
+  lines rather than newline-terminated lines.
+
 
 * Noteworthy changes in release 8.21 (2013-02-14) [stable]
 
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 2c16dc4..a72d9ce 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -6059,6 +6059,10 @@ available; the sort order can be any order that considers two fields
 to be equal if and only if the sort comparison described above
 considers them to be equal.  For example:
 
+Input and output lines are terminated with a newline character unless the
+@option{--zero-terminated} (@option{-z}) is used, in which case lines are
+@sc{nul} terminated.
+
 @example
 $ cat file1
 a a1
@@ -6181,6 +6185,19 @@ character is used to delimit the fields.
 Print a line for each unpairable line in file @var{file-number}
 (either @samp{1} or @samp{2}), instead of the normal output.
 
+@item -z
+@itemx --zero-terminated
+@opindex -z
+@opindex --zero-terminated
+@cindex join zero-terminated lines
+Treat the input as a set of lines, each terminated by a null character
+(ASCII @sc{nul}) instead of a line feed
+(ASCII @sc{lf}).
+This option can be useful in conjunction with @samp{sort -z}, @samp{uniq -z},
+@samp{perl -0} or @samp{find -print0} and @samp{xargs -0} which do the same in
+order to reliably handle arbitrary file names (even those containing blanks
+or other special characters).
+
 @end table
 
 @exitstatus
diff --git a/src/join.c b/src/join.c
index 11e647c..1810ac2 100644
--- a/src/join.c
+++ b/src/join.c
@@ -161,6 +161,7 @@ static struct option const longopts[] =
   {"ignore-case", no_argument, NULL, 'i'},
   {"check-order", no_argument, NULL, CHECK_ORDER_OPTION},
   {"nocheck-order", no_argument, NULL, NOCHECK_ORDER_OPTION},
+  {"zero-terminated", no_argument, NULL, 'z'},
   {"header", no_argument, NULL, HEADER_LINE_OPTION},
   {GETOPT_HELP_OPTION_DECL},
   {GETOPT_VERSION_OPTION_DECL},
@@ -177,6 +178,9 @@ static bool ignore_case;
    join them without checking for ordering */
 static bool join_header_lines;
 
+/* The character marking end of line. Default to \n. */
+static char eolchar = '\n';
+
 void
 usage (int status)
 {
@@ -213,6 +217,9 @@ by whitespace.  When FILE1 or FILE2 (not both) is -, read standard input.\n\
   --header  treat the first line in each file as field headers,\n\
   print them without trying to pair them\n\
 "), stdout);
+  fputs (_("\
+  -z, --zero-terminated end lines with 0 byte, not newline\n\
+"), stdout);
   fputs (HELP_OPTION_DESCRIPTION, stdout);
   fputs (VERSION_OPTION_DESCRIPTION, stdout);
   fputs (_("\
@@ -445,7 +452,7 @@ get_line (FILE *fp, struct line **linep, int which)
   else
     line = init_linep (linep);
 
-  if (! readlinebuffer (&line->buf, fp))
+  if (! readlinebuffer_delim (&line->buf, fp, eolchar))
     {
       if (ferror (fp))
         error (EXIT_FAILURE, errno, _("read error"));
@@ -614,7 +621,7 @@ prjoin (struct line const *line1, struct line const *line2)
             break;
           putchar (output_separator);
         }
-      putchar ('\n');
+      putchar (eolchar);
     }
   else
     {
@@ -636,7 +643,7 @@ prjoin (struct line const *line1, struct line const *line2)
       prfields (line1, join_field_1, autocount_1);
       prfields (line2, join_field_2, autocount_2);
 
-      putchar ('\n');
+      putchar (eolchar);
     }
 }
 }
 
@@ -1017,7 +1024,7 @@ main (int argc, char **argv)
   issued_disorder_warning[0] = 

sort/uniq/join: key-comparison code consolidation

2013-02-14 Thread Assaf Gordon
Hello,

(New thread for the previous topic:
http://lists.gnu.org/archive/html/coreutils/2013-02/msg00082.html)

The attached patch contains:

1. src/key-spec-parsing.{h,c} - key comparison code, previously in sort.c

2. uniq - now supports --key (multiple keys, too).
Same as before, but rebased against 8.21.
Supported orders:
  -k1,1  = ascii
  -k1b,1 = ignore-blanks
  -k1d,1 = dictionary
  -k1i,1 = non-printing
  -k1f,1 = ignore-case
  -k1n,1 = fast-numeric
  -k1g,1 = general-numeric
  -k1M,1 = month
Also supports a user-specified delimiter (default: whitespace).
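
For reference, the key orderings listed above already exist in sort, so their effect can be sketched with sort alone. The uniq --key option exists only in the attached patch, so it is not used here; the sample data is made up:

```shell
export LC_ALL=C

# -k1,1: plain byte comparison, so "10" sorts before "9".
ascii=$(printf '10 x\n9 y\n' | sort -k1,1)

# -k1n,1: numeric comparison, so 9 sorts before 10.
numeric=$(printf '10 x\n9 y\n' | sort -k1n,1)

# -k1f,1: case is folded, so "a" sorts before "B".
folded=$(printf 'B 1\na 2\n' | sort -k1f,1)

# -k1M,1: month names compare in calendar order.
month=$(printf 'Feb 1\nJan 2\n' | sort -k1M,1)

printf '%s\n' "$ascii" "$numeric" "$folded" "$month"
```
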

Related discussions:
  http://debbugs.gnu.org/cgi/bugreport.cgi?bug=5832
  http://debbugs.gnu.org/cgi/bugreport.cgi?bug=7068
  http://lists.gnu.org/archive/html/bug-coreutils/2006-06/msg00211.html

3. sort - same functionality as before, but with the key-comparison code
extracted to a separate file.

4. join - internally uses the key-comparison code.
Does not support the --key parameter (uses the standard -j/-1/-2),
but accepts new arguments that affect joining order:
 -r --reverse
 -n --numeric-sort
 -d --dictionary-order
 -g --general-numeric

Related discussions:
 http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6903
 http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6366

Optionally, we could support a new -k that works like -j but also accepts
ordering options
(e.g. -k1nr would be equivalent to -j 1 --numeric --reverse).

It'll be easy to add human-numeric-sort/version-sort to join/uniq, but I'm not 
sure if they make sense.


Regards,
 -gordon




key_compare7.patch.xz
Description: application/xz


Re: [PATCH] join: Add -z option

2013-02-14 Thread Pádraig Brady

On 02/14/2013 08:51 PM, Assaf Gordon wrote:
> Hello,
>
> This patch add -z to join, supporting joining zero-terminated lines.
> The patch is heavily based on James Youngman's patch of adding -z to uniq
> (commit e062524).
>
> -gordon
>
> P.S.
> This patch is independent of the key-comparison patches discussed recently,
> though I'm also adding it there.

This makes sense under the general theme of consolidating the sort, uniq,
and join options.

thanks,
Pádraig.