bug#9157: [PATCH] dd: sparse conv flag

2011-12-12 Thread Roman Rybalko (devel)
On 10.12.2011 14:09, Jim Meyering wrote:
 Here are some things we'll have to consider before
 adding a new hole-punching option to dd:

 Your patch may create a hole in the destination for each sequence of
 length seek_size or greater of zero bytes in the input.
 As you may have seen in the cp-related discussion, one may
 want different options:
   - preserve a file's hole/non-hole structure
   - efficiently detect existing holes and fill them with explicit zeros in 
 dest
   - efficiently detect existing holes and seek-in-dest for each sequence of
 zeros (longer than some minimum) in non-hole input
Okay, I'll think about that.
That's a clear task.
 diff --git a/doc/coreutils.texi b/doc/coreutils.texi
 index 424446c..761c698 100644
 --- a/doc/coreutils.texi
 +++ b/doc/coreutils.texi
 @@ -8127,6 +8127,10 @@ Pad every input block to size of @samp{ibs} with 
 trailing zero bytes.
  When used with @samp{block} or @samp{unblock}, pad with spaces instead of
  zero bytes.

 +@item sparse
 +@opindex sparse
 +Make sparse output file.
 Please say a little more here.
 I.e., when might a hole be introduced?
 When is this option useful?
Okay.
 @@ -985,6 +990,21 @@ iwrite (int fd, char const *buf, size_t size)
  {
ssize_t nwritten;
process_signals ();
 +  if (conversions_mask & C_SPARSE)
 +{
 +  off_t seek_size = 0;
 +  while (total_written + seek_size < size && buf[total_written + seek_size] == 0)
 +++seek_size;
 +  if (seek_size)
 +{
 +  off_t cur_off = 0;
 +  cur_off = lseek(fd, seek_size, SEEK_CUR);
 +  if (cur_off < 0)
 +break;
 dd must not ignore lseek failure.
That's a problem for me.
What would be a suitable way to handle an lseek failure?
Perhaps a new kernel API will make this code obsolete.
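
One possible answer, offered only as a rough sketch and not as code from dd: treat ESPIPE (the output is not seekable) as a cue to write the zeros explicitly, and report every other lseek failure to the caller instead of silently breaking out of the loop. The names sparse_write and write_fully are hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all SIZE bytes of BUF, retrying after short writes and EINTR.
   Return 0 on success, -1 (with errno set) on failure.  */
static int
write_fully (int fd, char const *buf, size_t size)
{
  while (size > 0)
    {
      ssize_t n = write (fd, buf, size);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      buf += n;
      size -= (size_t) n;
    }
  return 0;
}

/* Hypothetical helper, not part of dd: write BUF to FD, seeking over
   runs of zero bytes instead of writing them.  If FD is not seekable
   (lseek fails with ESPIPE), fall back to writing the zeros; any other
   lseek failure is returned to the caller.  Return 0 or -1.  */
int
sparse_write (int fd, char const *buf, size_t size)
{
  assert (buf != NULL || size == 0);

  size_t pos = 0;
  while (pos < size)
    {
      /* Find the extent of the current run of zero or nonzero bytes.  */
      int zero = buf[pos] == 0;
      size_t run = 0;
      while (pos + run < size && (buf[pos + run] == 0) == zero)
        run++;

      if (zero && lseek (fd, (off_t) run, SEEK_CUR) >= 0)
        ;                       /* The seek skipped the zeros.  */
      else if (zero && errno != ESPIPE)
        return -1;              /* Unexpected seek failure: report it.  */
      else if (write_fully (fd, buf + pos, run) < 0)
        return -1;

      pos += run;
    }
  return 0;
}
```

Note that a trailing run of zeros only moves the file offset here; a real implementation would still need a final write or an ftruncate call so that the output ends up with the right length.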
 @@ -0,0 +1,70 @@
 +#!/bin/sh
 +# Ensure that dd conv=sparse works.
 +
 +# Copyright (C) 2003, 2005-2011 Free Software Foundation, Inc.
 Use only 2011 as the copyright year.
Okay.
 +# sometimes we may read less than 1M
 +dd if=/dev/zero of=sample0 count=1 bs=1M 2> /dev/null || fail=1
 +[ `stat -c %s sample0` = 1048576 ] || fail=1
 We'd write that like this instead:
 (note use of test, not [...], use of $(...), not `...`)

 test $(stat -c %s sample0) = 1048576 || fail=1
Okay.

-- 
WBR,
Roman Rybalko






bug#10253: mention +FORMAT in ls time style reminder help blurb

2011-12-12 Thread Jim Meyering
Paul Eggert wrote:
 I like the change, thanks.  A couple of nits:

 On 12/11/11 03:07, Jim Meyering wrote:
 +  fprintf (stderr,
 +   _("  - `+' followed by a date format string\n"));

 I suggest supplying an example and quoting date so that it's clearer
 that it's talking about the `date' command.  Something like this, perhaps?

_("  - +FORMAT (e.g., +%H:%M) for a `date'-style format\n")

Thanks.  That is better.

 +fprintf (stderr, "  - `[posix-]%s'\n", *p++);

 I suggest removing the ` and ' since they are locale-dependent
 and aren't needed here (plus, that works better with the above
 suggestion).

Good point.  Besides, I'd say that using quotes around syntax including
the likes of `[posix-]...' is misleading in that it might encourage
someone to use the []'s.

 +  fprintf (stderr, _("Valid arguments are:"));

 Isn't the usual style to use fputs when there's no directive
 in the format?  There's one other example of this.

Yes, that is my preference, too.
Thanks for pointing it out.

I copied both that format-less fprintf and the `' mark-up from argmatch.c.
The fix there was easy: just use quote (...), since argmatch.c already
includes quote.h, so I've just fixed that in gnulib.

Here's a new version of the patch:
[slightly risky for translators and fuzzy string matchers:
 now there are two very similar strings:

  "Valid arguments are:\n" (here in ls.c)
  "Valid arguments are:"   (in gnulib's argmatch.c)

 It'd be easy to rework argmatch.c to include the \n.
 ]

Now it prints this:

$ src/ls -l --time-style=x
src/ls: invalid argument `x' for `time style'
Valid arguments are:
  - [posix-]full-iso
  - [posix-]long-iso
  - [posix-]iso
  - [posix-]locale
  - +FORMAT (e.g., +%H:%M) for a `date'-style format
Try `src/ls --help' for more information.
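
As a self-contained illustration of the points above (fputs for strings without directives, no locale-dependent quotes around the [posix-] forms), here is a minimal sketch that prints the same list. It is not the ls.c code: print_time_style_help is a hypothetical name, and the _() macro is a stand-in for gettext so that the example compiles on its own. Since the +FORMAT line contains %H and %M, passing it to fprintf as a format would be wrong, which is one more reason to use fputs there.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for gettext's _() so that this sketch is self-contained.  */
#define _(msgid) (msgid)

/* Hypothetical helper: print the accepted --time-style arguments to OUT.
   ARGS is a NULL-terminated array of style names.  */
void
print_time_style_help (FILE *out, char const *const *args)
{
  assert (out != NULL && args != NULL);

  /* No conversion directive in these strings, so use fputs.  */
  fputs (_("Valid arguments are:\n"), out);

  /* One line per style name, with the optional prefix shown unquoted.  */
  for (char const *const *p = args; *p; p++)
    fprintf (out, "  - [posix-]%s\n", *p);

  /* The '%' characters below are literal text, not directives.  */
  fputs (_("  - +FORMAT (e.g., +%H:%M) for a `date'-style format\n"), out);
}
```

Called with the array {"full-iso", "long-iso", "iso", "locale", NULL} and stderr, this should produce the list shown above, minus the first and last lines, which come from elsewhere in ls.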



From 79a14f0481df2bd45a20f92ce4c156f42fdae660 Mon Sep 17 00:00:00 2001
From: Jim Meyering meyer...@redhat.com
Date: Sun, 11 Dec 2011 11:59:31 +0100
Subject: [PATCH] ls: give a more useful diagnostic for a bogus --time-style
 arg

* src/ls.c (decode_switches): Replace our use of XARGMATCH
with open-coded version so that we can give a better diagnostic.
Reported by Dan Jacobson in http://bugs.gnu.org/10253
with suggestions from Eric Blake and Paul Eggert.
---
 src/ls.c |   72 +
 1 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/src/ls.c b/src/ls.c
index 0d64bab..672237a 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -2039,33 +2039,57 @@ decode_switches (int argc, char **argv)
   long_time_format[1] = p1;
 }
   else
-switch (XARGMATCH ("time style", style,
-   time_style_args,
-   time_style_types))
-  {
-  case full_iso_time_style:
-long_time_format[0] = long_time_format[1] =
-  "%Y-%m-%d %H:%M:%S.%N %z";
-break;
+{
+  ptrdiff_t res = argmatch (style, time_style_args,
+(char const *) time_style_types,
+sizeof (*time_style_types));
+  if (res < 0)
+{
+  /* This whole block used to be a simple use of XARGMATCH.
+ but that didn't print the "posix-"-prefixed variants or
+ the "+"-prefixed format string option upon failure.  */
+  argmatch_invalid ("time style", style, res);
+
+  /* The following is a manual expansion of argmatch_valid,
+ but with the added "+ ..." description and the [posix-]
+ prefixes prepended.  Note that this simplification works
+ only because all four existing time_style_types values
+ are distinct.  */
+  fputs (_("Valid arguments are:\n"), stderr);
+  char const *const *p = time_style_args;
+  while (*p)
+fprintf (stderr, "  - [posix-]%s\n", *p++);
+  fputs (_("  - +FORMAT (e.g., +%H:%M) for a `date'-style"
+" format\n"), stderr);
+  usage (LS_FAILURE);
+}
+  switch (res)
+{
+case full_iso_time_style:
+  long_time_format[0] = long_time_format[1] =
+"%Y-%m-%d %H:%M:%S.%N %z";
+  break;

-  case long_iso_time_style:
-long_time_format[0] = long_time_format[1] = "%Y-%m-%d %H:%M";
-break;
+case long_iso_time_style:
+  long_time_format[0] = long_time_format[1] = "%Y-%m-%d %H:%M";
+  break;

-  case iso_time_style:
-long_time_format[0] = "%Y-%m-%d ";
-long_time_format[1] = "%m-%d %H:%M";
-break;
+case iso_time_style:
+  long_time_format[0] = "%Y-%m-%d ";
+  long_time_format[1] = "%m-%d %H:%M";
+  break;
+
+case 

bug#10253: mention +FORMAT in ls time style reminder help blurb

2011-12-12 Thread Jim Meyering
Jim Meyering wrote:

 Paul Eggert wrote:
 I like the change, thanks.  A couple of nits:

 On 12/11/11 03:07, Jim Meyering wrote:
 +  fprintf (stderr,
 +   _("  - `+' followed by a date format string\n"));

 I suggest supplying an example and quoting date so that it's clearer
 that it's talking about the `date' command.  Something like this, perhaps?

_("  - +FORMAT (e.g., +%H:%M) for a `date'-style format\n")

 Thanks.  That is better.

 +fprintf (stderr, "  - `[posix-]%s'\n", *p++);

 I suggest removing the ` and ' since they are locale-dependent
 and aren't needed here (plus, that works better with the above
 suggestion).

 Good point.  Besides, I'd say that using quotes around syntax including
 the likes of `[posix-]...' is misleading in that it might encourage
 someone to use the []'s.

 +  fprintf (stderr, _("Valid arguments are:"));

 Isn't the usual style to use fputs when there's no directive
 in the format?  There's one other example of this.

 Yes, that is my preference, too.
 Thanks for pointing it out.

 I copied both that format-less fprintf and the `' mark-up from argmatch.c.
 The fix there was easy: just use quote (...), since argmatch.c already
 includes quote.h, so I've just fixed that in gnulib.

 Here's a new version of the patch:
 [slightly risky for translators and fuzzy string matchers:
  now there are two very similar strings:

   "Valid arguments are:\n" (here in ls.c)
   "Valid arguments are:"   (in gnulib's argmatch.c)

  It'd be easy to rework argmatch.c to include the \n.
  ]

 Now it prints this:

 $ src/ls -l --time-style=x
 src/ls: invalid argument `x' for `time style'
 Valid arguments are:
   - [posix-]full-iso
   - [posix-]long-iso
   - [posix-]iso
   - [posix-]locale
   - +FORMAT (e.g., +%H:%M) for a `date'-style format
 Try `src/ls --help' for more information.

I wrote a test, amended the preceding to include it
and pushed this result:

From a3fee8b6afdbb70317d2124d5a3bb0d2887ab31b Mon Sep 17 00:00:00 2001
From: Jim Meyering meyer...@redhat.com
Date: Sun, 11 Dec 2011 11:59:31 +0100
Subject: [PATCH] ls: give a more useful diagnostic for a bogus --time-style
 arg

* src/ls.c (decode_switches): Replace our use of XARGMATCH
with open-coded version so that we can give a better diagnostic.
* tests/ls/time-style-diag: New file.
* tests/Makefile.am (TESTS): Add it.
Reported by Dan Jacobson in http://bugs.gnu.org/10253
with suggestions from Eric Blake and Paul Eggert.
---
 gnulib   |2 +-
 src/ls.c |   72 ++---
 tests/Makefile.am|1 +
 tests/ls/time-style-diag |   39 +
 4 files changed, 89 insertions(+), 25 deletions(-)
 create mode 100755 tests/ls/time-style-diag

diff --git a/gnulib b/gnulib
index a5f6df2..f5c2e2a 16
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit a5f6df2b1f3f0fdc73635de3ad285d21703dab18
+Subproject commit f5c2e2ac7d4ca2f6ba15e56a245f348899360a00
diff --git a/src/ls.c b/src/ls.c
index 0d64bab..672237a 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -2039,33 +2039,57 @@ decode_switches (int argc, char **argv)
   long_time_format[1] = p1;
 }
   else
-switch (XARGMATCH ("time style", style,
-   time_style_args,
-   time_style_types))
-  {
-  case full_iso_time_style:
-long_time_format[0] = long_time_format[1] =
-  "%Y-%m-%d %H:%M:%S.%N %z";
-break;
+{
+  ptrdiff_t res = argmatch (style, time_style_args,
+(char const *) time_style_types,
+sizeof (*time_style_types));
+  if (res < 0)
+{
+  /* This whole block used to be a simple use of XARGMATCH.
+ but that didn't print the "posix-"-prefixed variants or
+ the "+"-prefixed format string option upon failure.  */
+  argmatch_invalid ("time style", style, res);
+
+  /* The following is a manual expansion of argmatch_valid,
+ but with the added "+ ..." description and the [posix-]
+ prefixes prepended.  Note that this simplification works
+ only because all four existing time_style_types values
+ are distinct.  */
+  fputs (_("Valid arguments are:\n"), stderr);
+  char const *const *p = time_style_args;
+  while (*p)
+fprintf (stderr, "  - [posix-]%s\n", *p++);
+  fputs (_("  - +FORMAT (e.g., +%H:%M) for a `date'-style"
+" format\n"), stderr);
+  usage (LS_FAILURE);
+}
+  switch (res)
+{
+case full_iso_time_style:
+  long_time_format[0] = long_time_format[1] =
+"%Y-%m-%d %H:%M:%S.%N %z";
+  break;

-   

bug#10253: mention +FORMAT in ls time style reminder help blurb

2011-12-12 Thread jidanni
 JM == Jim Meyering j...@meyering.net writes:
JM is misleading in that it might encourage someone to use the []'s.
Like You Know Who :-), who also recommends you figure out a way to get
rid of the
JM $ src/ls -l --time-style=x
src/ even in these e-mails/commits, as it is bad for the eye, even though it
yes surely disappears in production.
Having walked from the airport... I'll be dead soon
http://www.youtube.com/watch?v=Tp8XcAKYsKo&list=PL6E40919035151385





bug#10253: mention +FORMAT in ls time style reminder help blurb

2011-12-12 Thread Jim Meyering
jida...@jidanni.org wrote:
 JM $ src/ls -l --time-style=x
 src/ even in these e-mails/commits, as it is bad for the eye, even though it
 yes surely disappears in production.

Actually, using the src/ prefix (or some prefix, like ./)
is important.  Otherwise, I'm not testing what I've just built.
Hence, including that prefix shows what I've done.  Without it,
the reader would wonder if I'd simply used whatever ls is in my path,
which could easily represent a mistake.





bug#10253: mention +FORMAT in ls time style reminder help blurb

2011-12-12 Thread jidanni
 JM == Jim Meyering j...@meyering.net writes:
JM Hence, including that prefix shows what I've done.  Without it,
JM the reader would wonder if I'd simply used whatever ls is in my path,
JM which could easily represent a mistake.
Well OK then.





bug#10281: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Paul Eggert
On 12/12/11 04:50, Kamil Dudka wrote:
 Was such a change in behavior intended?  I am asking as I was not able to
 find it documented anywhere.

It was intended, as it provides useful functionality that
can't be done if hard links aren't tracked across arguments,
whereas the reverse isn't true.  It's documented in
http://www.gnu.org/software/coreutils/manual/coreutils.html#du-invocation,
which says:

  If two or more hard links point to the same file,
  only one of the hard links is counted. The file
  argument order affects which links are counted,
  and changing the argument order may change the
  numbers that du outputs.

Perhaps this isn't sufficiently clear, and if so,
suggestions for improvements are welcome.





bug#10281: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Jim Meyering
Paul Eggert wrote:
 On 12/12/11 04:50, Kamil Dudka wrote:
 Was such a change in behavior intended?  I am asking as I was not able to
 find it documented anywhere.

 It was intended, as it provides useful functionality that
 can't be done if hard links aren't tracked across arguments,
 whereas the reverse isn't true.  It's documented in
 http://www.gnu.org/software/coreutils/manual/coreutils.html#du-invocation,
 which says:

   If two or more hard links point to the same file,
   only one of the hard links is counted. The file
   argument order affects which links are counted,
   and changing the argument order may change the
   numbers that du outputs.

 Perhaps this isn't sufficiently clear, and if so,
 suggestions for improvements are welcome.

FYI, Kamil's original mail appears never to have reached the mailing list[*],
in spite of reaching debbugs and acquiring a bug number and then going
on to reach Paul (the Cc'd recipient).

Kamil or Paul, would you please post the original, for the record?

Jim

[*] Ward Vandewege confirmed that the message reached debbugs but
somehow was not passed on to eggs.gnu.org.





bug#10281: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Paul Eggert
On 12/12/11 10:09, Jim Meyering wrote:

 Kamil or Paul, would you please post the original, for the record?

Sure, here's a copy:

From: Kamil Dudka kdu...@redhat.com
To: bug-coreutils@gnu.org
Subject: change in behavior of du with multiple arguments (commit efe53cc)
Date: Mon, 12 Dec 2011 13:50:30 +0100
Cc: Paul Eggert egg...@cs.ucla.edu
Message-Id: 201112121350.30539.kdu...@redhat.com

Hi,

the following upstream commit introduces a major change in behavior of du
when multiple arguments are specified:

http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=efe53cc

... and the issue has landed as a bug in our Bugzilla:

https://bugzilla.redhat.com/747075#c3

Was such a change in behavior intended?  I am asking as I was not able to
find it documented anywhere.  The up2date man page states:

Summarize disk usage of each FILE, recursively for directories.

..., where FILE refers to a single argument given to du.  The info 
documentation states:

The FILE argument order affects which links are counted, and changing the
argument order may change the numbers that `du' outputs.

However, changing the numbers is one thing and missing lines in the output
of du is quite another thing.

Could anybody please clarify the current behavior of du?  Thanks in advance!

Kamil





bug#10282: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Kamil Dudka
Hi,

the following upstream commit introduces a major change in behavior of du
when multiple arguments are specified:

http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=efe53cc

... and the issue has landed as a bug in our Bugzilla:

https://bugzilla.redhat.com/747075#c3

Was such a change in behavior intended?  I am asking as I was not able to
find it documented anywhere.  The up2date man page states:

Summarize disk usage of each FILE, recursively for directories.

..., where FILE refers to a single argument given to du.  The info 
documentation states:

The FILE argument order affects which links are counted, and changing the
argument order may change the numbers that `du' outputs.

However, changing the numbers is one thing and missing lines in the output
of du is quite another thing.

Could anybody please clarify the current behavior of du?  Thanks in advance!

Kamil





bug#10281: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Bob Proulx
Jim Meyering wrote:
 FYI, Kamil's original mail appears never to have reached the mailing list[*],

It was sitting in the debbugs-submit queue waiting for a human.  I
reviewed the queues a few minutes ago and sent it through.  At least I
am pretty sure it was the same message I saw there.  I hadn't realized
it was something to note until after I read the thread here just now.
Here is the interesting part of the trail.

  Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
  by debbugs.gnu.org with esmtp (Exim 4.69)
  (envelope-from debbugs-submit-boun...@debbugs.gnu.org)
  id 1RaC73-0002OB-Uh
  for sub...@debbugs.gnu.org; Mon, 12 Dec 2011 15:04:40 -0500
  Received: from eggs.gnu.org ([140.186.70.92])
  by debbugs.gnu.org with esmtp (Exim 4.69)
  (envelope-from kdu...@redhat.com) id 1Ra5Mj-gC-LJ
  for sub...@debbugs.gnu.org; Mon, 12 Dec 2011 07:52:23 -0500

 in spite of reaching debbugs and acquiring a bug number

That does seem strange since I didn't think it got a bug number until
after it went through debbugs.  It had a bug number and so it must
have already gone through debbugs.  It must work differently from that
somehow.

 and then going on to reach Paul (the Cc'd recipient).

Of course the CC would be a direct message outside of any of the bug
tracking and mailing lists.

Bob





bug#10281: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Jim Meyering
Bob Proulx wrote:
 Jim Meyering wrote:
 FYI, Kamil's original mail appears never to have reached the mailing list[*],

 It was sitting in the debbugs-submit queue waiting for a human.  I
 reviewed the queues a few minutes ago and sent it through.  At least I
 am pretty sure it was the same message I saw there.  I hadn't realized
 it was something to note until after I read the thread here just now.
 Here is the interesting part of the trail.

   Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
   by debbugs.gnu.org with esmtp (Exim 4.69)
   (envelope-from debbugs-submit-boun...@debbugs.gnu.org)
   id 1RaC73-0002OB-Uh
   for sub...@debbugs.gnu.org; Mon, 12 Dec 2011 15:04:40 -0500
   Received: from eggs.gnu.org ([140.186.70.92])
   by debbugs.gnu.org with esmtp (Exim 4.69)
   (envelope-from kdu...@redhat.com) id 1Ra5Mj-gC-LJ
   for sub...@debbugs.gnu.org; Mon, 12 Dec 2011 07:52:23 -0500

 in spite of reaching debbugs and acquiring a bug number

 That does seem strange since I didn't think it got a bug number until
 after it went through debbugs.  It had a bug number and so it must
 have already gone through debbugs.  It must work differently from that
 somehow.

 and then going on to reach Paul (the Cc'd recipient).

 Of course the CC would be a direct message outside of any of the bug
 tracking and mailing lists.

Thanks, Bob.
I forgot about the debbugs queue.
I checked only the bug-coreutils mailman admin queue.





bug#10282: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Eric Blake
On 12/12/2011 05:50 AM, Kamil Dudka wrote:
 Hi,
 
 the following upstream commit introduces a major change in behavior of du
 when multiple arguments are specified:
 
 http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=efe53cc
 
 ... and the issue has landed as a bug in our Bugzilla:
 
 https://bugzilla.redhat.com/747075#c3
 
 Was such a change in behavior intended?

A change in behavior was intended, but I think we ended up introducing a
bug in its place.

 The info 
 documentation states:
 
 The FILE argument order affects which links are counted, and changing the
 argument order may change the numbers that `du' outputs.

And this is intended.  The end goal is that if a directory appears both
on the command line and as a child of another directory on the command
line, that it gets counted only once.

 
 However, changing the numbers is one thing and missing lines in the output
 of du is quite another thing.

Yes, that's the bug I think we introduced - we are mistakenly eliding
lines of output, rather than listing those directories with 0 attributed
additional size.

More importantly, POSIX says of -s:

−s Instead of the default output, report only the total sum for each of
the specified files.

But we fail that:

$ mkdir -p /tmp/a/b
$ cd /tmp/a
$ du -s . b
8   .
$ du -s b .
4   b
4   .

We correctly deduced that only 8 units were occupied (that is, b was not
double-counted in either approach), but we _failed_ to list b in the
first approach.  I think POSIX requires the output to have been:

$ du -s . b
8   .
0   b

as an indication that we did visit b, but that there were no additional
contributions to the disk usage encountered during our visit there.

Meanwhile, without -s, I still think we elided too much data:

$ du . b
4   ./b
8   .
$ du b .
4   b
4   .

In the first case, we recursed into ./b, then back out to ., but elided
any notion that we ever directly visited b.  In the second case, we
visited b, then recursed into ./b but had nothing to output, then back
out to '.'.  I think that a saner output would be:

$ du . b
4   ./b
8   .
0   b
$ du b .
4   b
0   ./b
4   .

to make it obvious that we pruned recursion at points where we
encountered duplicates, and that the sum of the first columns shows an
accurate disk usage.
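
The accounting rule behind that proposal can be sketched in a few lines: each file, identified by its (device, inode) pair, is charged its size the first time it is seen and 0 on every later encounter, so the first column always sums to the real usage. This is only an illustration with hypothetical names (seen_set, charge_once, MAX_SEEN) and a fixed-size linear-scan set; it is not du's actual data structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

enum { MAX_SEEN = 1024 };   /* Fixed capacity keeps the sketch short.  */

/* Set of files already counted, keyed by device and inode number.  */
struct seen_set
{
  struct { dev_t dev; ino_t ino; } id[MAX_SEEN];
  size_t count;
};

/* Record (DEV, INO) in SET.  Return true if it was newly added,
   false if it was already present.  */
static bool
seen_insert (struct seen_set *set, dev_t dev, ino_t ino)
{
  for (size_t i = 0; i < set->count; i++)
    if (set->id[i].dev == dev && set->id[i].ino == ino)
      return false;

  assert (set->count < MAX_SEEN);   /* A real tool would grow the set.  */
  set->id[set->count].dev = dev;
  set->id[set->count].ino = ino;
  set->count++;
  return true;
}

/* Return the usage to attribute to this encounter of a file:
   BLOCKS the first time it is seen, 0 for every repeat.  */
uintmax_t
charge_once (struct seen_set *set, dev_t dev, ino_t ino, uintmax_t blocks)
{
  return seen_insert (set, dev, ino) ? blocks : 0;
}
```

With this rule, a command-line argument that was already visited would be reported with 0 rather than being omitted, which is the behavior suggested in the sample output above.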

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





bug#10282: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Eric Blake
On 12/12/2011 03:33 PM, Eric Blake wrote:
 However, changing the numbers is one thing and missing lines in the output
 of du is quite another thing.
 
 Yes, that's the bug I think we introduced - we are mistakenly eliding
 lines of output, rather than listing those directories with 0 attributed
 additional size.
 
 More importantly, POSIX says of -s:
 
 −s Instead of the default output, report only the total sum for each of
 the specified files.
 
 But we fail that:
 
 $ mkdir -p /tmp/a/b
 $ cd /tmp/a
 $ du -s . b
 8 .
 $ du -s b .
 4 b
 4 .
 
 We correctly deduced that only 8 units were occupied (that is, b was not
 double-counted in either approach), but we _failed_ to list b in the
 first approach.  I think POSIX requires the output to have been:
 
 $ du -s . b
 8   .
 0   b

POSIX also says:

Files with multiple links shall be counted and written for only one
entry. The directory entry that is selected in the report is unspecified.

But even historically, command line arguments were always listed, even
if they are otherwise multiple links.  On Solaris 10, for example,

$ touch a
$ ln a b
$ /bin/du a b
1   a
1   b

instead of omitting one of the two entries.  The omission only occurs
during recursion of a directory on the command line:

$ /bin/du -a .
1   ./b
4   .

 I think that a saner output would be:
 
 $ du . b
 4   ./b
 8   .
 0   b

So this would be okay (even though we encountered b via two different
links, the second encounter was a command line, so it should not be
elided entirely, but listing 0 would make it obvious that there is no
further disk usage to count),

 $ du b .
 4   b
 0   ./b
 4   .

whereas this proposed line of '0 ./b' is questionable (we could argue
that ./b should not be elided because no other link to b was printed
during recursion, or we could argue that elision should trump recursion
once the command line arguments have been printed).

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





bug#10282: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Paul Eggert
On 12/12/11 14:58, Eric Blake wrote:
 Files with multiple links shall be counted and written for only one
 entry. The directory entry that is selected in the report is unspecified.

Yes, that's partly what motivates the current GNU du behavior:
the idea is to implement this notion consistently (historical
'du' implementations do not).

 But even historically, command line arguments were always listed, even
 if they are otherwise multiple links.

I suppose we could change GNU 'du' to output 0 X for a command-line
argument X that's already been seen.  This wouldn't address the problem
perceived by the original poster, though.  And it's a glitch from the
point of view of consistency.

Perhaps 'du' needs a new option to control what to do with
files that 'du' has already seen before, something that
generalizes --count-links.





bug#7999: [coreutils-8.x] documentation of touch command needs clarification

2011-12-12 Thread Paul Eggert
I installed the following patch to try to document this issue better
and am taking the liberty of marking this as done.  Further comments
are welcome (and we can reopen the bug as needed).

doc: document 'touch' and timestamps better
* doc/coreutils.texi (touch invocation): Explain file timestamps
better.  Problem reported by Nelson H.F. Beebe (Bug#7999).
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 369fad2..c26a53d 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -7199,6 +7199,7 @@ a date like @samp{Mar 30@ @ 2002} for non-recent 
timestamps, and a
 date-without-year and time like @samp{Mar 30 23:45} for recent timestamps.
 This format can change depending on the current locale as detailed below.

+@cindex clock skew
 A timestamp is considered to be @dfn{recent} if it is less than six
 months old, and is not dated in the future.  If a timestamp dated
 today is not listed in recent form, the timestamp is in the future,
@@ -10261,11 +10262,39 @@ A @var{file} argument string of @samp{-} is handled 
specially and
 causes @command{touch} to change the times of the file associated with
 standard output.

+@cindex clock skew
+By default, @command{touch} sets file timestamps to the current time.
+Because @command{touch} acts on its operands left to right, the
+resulting timestamps of earlier and later operands may disagree.
+Also, the determination of what time is ``current'' depends on the
+platform.  Platforms with network file systems often use different
+clocks for the operating system and for file systems; because
+@command{touch} typically uses file systems' clocks by default, clock
+skew can cause the resulting file timestamps to appear to be in a
+program's ``future'' or ``past''.
+
+@cindex file timestamp resolution
+The @command{touch} command sets the file's timestamp to the greatest
+representable value that is not greater than the requested time.  This
+can differ from the requested time for several reasons.  First, the
+requested time may have a higher resolution than supported.  Second, a
+file system may use different resolutions for different types of
+times.  Third, file timestamps may use a different resolution than
+operating system timestamps.  Fourth, the operating system primitives
+used to update timestamps may employ yet a different resolution.  For
+example, in theory a file system might use 10-microsecond resolution
+for access time and 100-nanosecond resolution for modification time,
+and the operating system might use nanosecond resolution for the
+current time and microsecond resolution for the primitive that
+@command{touch} uses to set a file's timestamp to an arbitrary value.
+
 @cindex permissions, for changing file timestamps
-If changing both the access and modification times to the current
-time, @command{touch} can change the timestamps for files that the user
-running it does not own but has write permission for.  Otherwise, the
-user must own the files.
+When setting file timestamps to the current time, @command{touch} can
+change the timestamps for files that the user does not own but has
+write permission for.  Otherwise, the user must own the files.  Some
+older systems have a further restriction: the user must own the files
+unless both the access and modification times are being set to the
+current time.

 Although @command{touch} provides options for changing two of the times---the
 times of last access and modification---of a file, there is actually
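
To make the timestamp-resolution paragraph in the patch above concrete, here is a minimal sketch of "the greatest representable value that is not greater than the requested time" for a given resolution. It is an illustration only, not code from touch: truncate_timestamp is a hypothetical name, and the sketch assumes the resolution is given in nanoseconds and divides one second evenly.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: return the greatest timestamp representable at
   a resolution of RES_NS nanoseconds that is not greater than TS.
   Because tv_nsec is always in [0, 1000000000), truncating it rounds
   toward minus infinity even when tv_sec is negative.  */
struct timespec
truncate_timestamp (struct timespec ts, long res_ns)
{
  assert (0 < res_ns && res_ns <= 1000000000 && 1000000000 % res_ns == 0);
  assert (0 <= ts.tv_nsec && ts.tv_nsec < 1000000000);

  ts.tv_nsec -= ts.tv_nsec % res_ns;
  return ts;
}
```

For the hypothetical file system in the documentation text, a requested time of 10.123456789 s would be stored as 10.123450000 s for the access time (10-microsecond resolution) and as 10.123456700 s for the modification time (100-nanosecond resolution).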





bug#10287: [wishlist] uniq can remove non adjacent lines

2011-12-12 Thread Stéphane Blondon
Tool: uniq
Priority: wishlist

Hello,

I think `uniq` should have an additional option (for example -a,
--all) to remove identical lines that are not adjacent.

The man page explains a workaround based on `sort` but it can be
complex to use. A few weeks ago, I had to `uniq`-ize random numbers and
the sort couldn't really work. Fortunately, the order was not
important so using `sort | uniq | sort --random-sort` was an
acceptable solution. I imagine cases based on other tools like `top`
could be a problem too.

If you are interested, I could try to provide a patch. (I have learnt
C but I don't use it today.)

I don't think the increase of memory use is a problem today, so a
warning in the manpage should be enough.


Thanks for everything,
-- 
Stéphane





bug#10287: [wishlist] uniq can remove non adjacent lines

2011-12-12 Thread Bob Proulx
Stéphane Blondon wrote:
 I think `uniq` should have an additional option (for example -a,
 --all) to remove identical lines that are not adjacent.
 
 The man page explains a workaround based on `sort` but it can be
 complex to use. A few weeks ago, I had to `uniq`-ize random numbers and
 the sort couldn't really work. Fortunately, the order was not
 important so using `sort | uniq | sort --random-sort` was an
 acceptable solution. I imagine cases based on other tools like `top`
 could be a problem too.

If you want to print only the first occurrence of each line, then this
perl one-liner will do it.

  perl -lne 'print $_ if ! defined $a{$_}; $a{$_}=$_;'
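
For comparison, the same first-occurrence filter can be sketched in C. This is only an illustration of the memory trade-off discussed in the request, not a proposed uniq patch: unique_lines, read_line and the fixed bucket count are all hypothetical choices, every distinct line stays in memory until the end, and lines containing NUL bytes are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { NBUCKETS = 4096 };

struct node
{
  struct node *next;
  char *line;
};

/* Simple multiplicative string hash, reduced to a bucket index.  */
static size_t
hash_line (char const *s)
{
  size_t h = 5381;
  for (; *s; s++)
    h = h * 33 + (unsigned char) *s;
  return h % NBUCKETS;
}

/* Read one line, including its newline if any, into a malloc'd string.
   Return NULL at end of input; in this sketch an allocation failure
   is also reported as NULL.  */
static char *
read_line (FILE *in)
{
  size_t len = 0, cap = 128;
  char *buf = malloc (cap);
  if (!buf)
    return NULL;
  for (int c; (c = fgetc (in)) != EOF; )
    {
      if (len + 2 > cap)
        {
          char *bigger = realloc (buf, cap * 2);
          if (!bigger)
            {
              free (buf);
              return NULL;
            }
          buf = bigger;
          cap *= 2;
        }
      buf[len++] = (char) c;
      if (c == '\n')
        break;
    }
  if (len == 0)
    {
      free (buf);
      return NULL;
    }
  buf[len] = '\0';
  return buf;
}

/* Copy IN to OUT, dropping every line that already appeared earlier,
   adjacent or not.  Return 0 on success, -1 on failure.  */
int
unique_lines (FILE *in, FILE *out)
{
  assert (in != NULL && out != NULL);
  struct node **table = calloc (NBUCKETS, sizeof *table);
  if (!table)
    return -1;

  int status = 0;
  char *line;
  while (status == 0 && (line = read_line (in)) != NULL)
    {
      struct node **slot = &table[hash_line (line)];
      struct node *p = *slot;
      while (p && strcmp (p->line, line) != 0)
        p = p->next;

      struct node *n = p ? NULL : malloc (sizeof *n);
      if (n && fputs (line, out) != EOF)
        {
          n->line = line;       /* First occurrence: print and remember.  */
          n->next = *slot;
          *slot = n;
        }
      else
        {
          if (!p)
            status = -1;        /* malloc or fputs failed.  */
          free (n);
          free (line);
        }
    }
  if (ferror (in))
    status = -1;

  /* Release every remembered line.  */
  for (size_t i = 0; i < NBUCKETS; i++)
    for (struct node *p = table[i], *next; p; p = next)
      {
        next = p->next;
        free (p->line);
        free (p);
      }
  free (table);
  return status;
}
```

Unlike the `sort | uniq` workaround, this keeps the input order, at the cost of memory proportional to the number of distinct lines.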

Bob





bug#7999: [coreutils-8.x] documentation of touch command needs clarification

2011-12-12 Thread Jim Meyering
Paul Eggert wrote:
 I installed the following patch to try to document this issue better
 and am taking the liberty of marking this as done.  Further comments
 are welcome (and we can reopen the bug as needed).

Nice.  Thanks!





bug#10282: change in behavior of du with multiple arguments (commit efe53cc)

2011-12-12 Thread Jim Meyering
Paul Eggert wrote:

 On 12/12/11 14:58, Eric Blake wrote:
 Files with multiple links shall be counted and written for only one
 entry. The directory entry that is selected in the report is unspecified.

 Yes, that's partly what motivates the current GNU du behavior:
 the idea is to implement this notion consistently (historical
 'du' implementations do not).

 But even historically, command line arguments were always listed, even
 if they are otherwise multiple links.

 I suppose we could change GNU 'du' to output 0 X for a command-line
 argument X that's already been seen.

This seems sensible.

 This wouldn't address the problem
 perceived by the original poster, though.  And it's a glitch from the
 point of view of consistency.

I agree that printing 0 X for these seems inconsistent with the
elision mandated for the second and subsequent encounter of a file,
but I suppose command line arguments are intrinsically different
enough that handling them specially makes sense.  Maybe even as
the default.

 Perhaps 'du' needs a new option to control what to do with
 files that 'du' has already seen before, something that
 generalizes --count-links.

That sounds like a good way to do it.
Anyone interested?