bug#20511: split : does not account for --numeric-suffixes=FROM in calculation of suffix length?

2015-05-06 Thread Pádraig Brady
On 06/05/15 18:37, Ben Rusholme wrote:
 Hi,
 
 4. Auto set the suffix len based on FROM + CHUNK.
 That would support use case 1 (single run),
 but _silently_ break subsequent processing order
 of outputs from multiple split runs
 (as FROM is increased in multiples of CHUNK size).
 We could mitigate the _silent_ breakage though
 by limiting this change to when FROM < CHUNK.

 5. Document in man page and with more detail in info docs
 that -a is recommended when specifying FROM

 So I'll do 4 and 5 I think.
 
 Thanks, that would solve the problem I was having.
 
 Please feel free to end this conversation here, but if you can spare the time 
 I’d be very interested in an example of a multiple split run for my own 
 education/understanding/curiosity? I assume you mean processing subsets of 
 the input, but can’t see how to do that (after experimenting on the command 
 line and searching the documentation) except —number=l/k/n which does know 
 the size of the total set?

Well you could process subsets but even more simply
consider splitting a set of input files in 2,
to a set of output files.

  i=0
  for f in *.dat; do
    split -a4 --numeric=$i "$f" -n2; i=$(($i+2))
  done

(to be truly generic you would set the -a parameter
 based on the number of files and -n).
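
A rough sketch of how that -a value could be derived, assuming numbering
starts at 0 and every input is split into the same number of chunks.
The suffix_len helper name is made up for illustration; it is not part
of split:

```shell
# suffix_len NFILES CHUNKS
# Print the number of decimal digits needed for the highest suffix,
# assuming suffixes start at 0 and each file yields CHUNKS outputs.
suffix_len() {
  total=$(( $1 * $2 ))
  last=$(( total > 0 ? total - 1 : 0 ))
  printf '%s\n' "${#last}"
}

# Hypothetical use with the loop above:
#   set -- *.dat
#   a=$(suffix_len $# 2)
#   i=0
#   for f in "$@"; do
#     split -a"$a" --numeric-suffixes="$i" -n2 "$f"; i=$((i+2))
#   done
```

Computing the width once, before the loop, keeps every run on the same
suffix length, which is what keeps the combined outputs in order.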

cheers,
Pádraig.





bug#20511: split : does not account for --numeric-suffixes=FROM in calculation of suffix length?

2015-05-05 Thread Pádraig Brady
On 05/05/15 21:42, Ben Rusholme wrote:
 Hi,
 
 “split” (in the current GNU coreutils 8.23 release) does not account for the 
 optional start index (“split --numeric-suffixes=FROM”) when calculating 
 suffix length.
 
 I couldn’t find any prior reference to this problem in either the bug tracker 
 or mailing list archive.
 
 Thanks, Ben
 
 
 
 $ seq 100 > input.txt
 $ split --numeric-suffixes --number=l/100 input.txt
 $ ls
 input.txt  x06  x13  x20  x27  x34  x41  x48  x55  x62  x69  x76  x83  x90  x97
 x00        x07  x14  x21  x28  x35  x42  x49  x56  x63  x70  x77  x84  x91  x98
 x01        x08  x15  x22  x29  x36  x43  x50  x57  x64  x71  x78  x85  x92  x99
 x02        x09  x16  x23  x30  x37  x44  x51  x58  x65  x72  x79  x86  x93
 x03        x10  x17  x24  x31  x38  x45  x52  x59  x66  x73  x80  x87  x94
 x04        x11  x18  x25  x32  x39  x46  x53  x60  x67  x74  x81  x88  x95
 x05        x12  x19  x26  x33  x40  x47  x54  x61  x68  x75  x82  x89  x96
 
 
 $ rm x*
 $ split --numeric-suffixes=1 --number=l/100 input.txt
 split: output file suffixes exhausted
 $ ls
 input.txt  x07  x14  x21  x28  x35  x42  x49  x56  x63  x70  x77  x84  x91  x98
 x01        x08  x15  x22  x29  x36  x43  x50  x57  x64  x71  x78  x85  x92  x99
 x02        x09  x16  x23  x30  x37  x44  x51  x58  x65  x72  x79  x86  x93
 x03        x10  x17  x24  x31  x38  x45  x52  x59  x66  x73  x80  x87  x94
 x04        x11  x18  x25  x32  x39  x46  x53  x60  x67  x74  x81  x88  x95
 x05        x12  x19  x26  x33  x40  x47  x54  x61  x68  x75  x82  x89  x96
 x06        x13  x20  x27  x34  x41  x48  x55  x62  x69  x76  x83  x90  x97
 $ # Should run from x001 to x100!
 
 
 $ rm x*
 $ split --numeric-suffixes=1 --number=l/101 input.txt
 $ ls
 input.txt  x008  x016  x024  x032  x040  x048  x056  x064  x072  x080  x088  x096
 x001       x009  x017  x025  x033  x041  x049  x057  x065  x073  x081  x089  x097
 x002       x010  x018  x026  x034  x042  x050  x058  x066  x074  x082  x090  x098
 x003       x011  x019  x027  x035  x043  x051  x059  x067  x075  x083  x091  x099
 x004       x012  x020  x028  x036  x044  x052  x060  x068  x076  x084  x092  x100
 x005       x013  x021  x029  x037  x045  x053  x061  x069  x077  x085  x093  x101
 x006       x014  x022  x030  x038  x046  x054  x062  x070  x078  x086  x094
 x007       x015  x023  x031  x039  x047  x055  x063  x071  x079  x087  x095

The info docs say about the --numeric-suffixes option:

  Note specifying a FROM value also disables the default auto suffix
  length expansion described above, and so you may also want to
  specify ‘-a’ to allow suffixes beyond ‘99’.

Also, specifying a fixed number of files with --number
automatically sets the suffix length based on that number, i.e. when
you specified -nl/101 it bumped the suffix length to 3.

Now you could bump the suffix length based on the start number,
though I don't think we should as that would impact on future
processing (ordering) of the resultant files.  I.E. specifying
a FROM value to --numeric-suffixes should only impact the
start value, rather than the width.

In other words if you were to split 2 files into 200 parts like:
  split --number=l/100 input1.txt
  split --numeric-suffixes=100 --number=l/100 input2.txt
Then you really need to be specifying -a3 to set
the suffix length appropriately.
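
A small sketch of that two-file case with -a3 given explicitly on both
runs. It assumes numeric suffixes are wanted for both inputs; the file
names and the scratch directory are only for illustration:

```shell
# Work in a scratch directory so the x* outputs don't clash with anything.
cd "$(mktemp -d)" || exit 1
seq 1 100 > input1.txt
seq 101 200 > input2.txt

# -a3 is given on both runs so every output name has the same width.
split -a3 --numeric-suffixes     --number=l/100 input1.txt
split -a3 --numeric-suffixes=100 --number=l/100 input2.txt

# Names now run x000..x199, so a plain glob visits them in the original order.
cat x[0-9]* > joined.txt
seq 1 200 | cmp - joined.txt && echo "order preserved"
```

With mixed widths (say x00..x99 from the first run and x100..x199 from
the second) the glob would no longer sort the pieces in sequence, which
is the ordering concern described above.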

We might be able to give an earlier error in this case,
and we should probably clarify the info docs a bit more.
I'll think about it.

cheers,
Pádraig.





bug#20474: tr command

2015-04-30 Thread Pádraig Brady
tag 20474 notabug
close 20474
stop

On 30/04/15 17:31, Joseph Piette wrote:
 Hello:
 
 When transferring files from the Windows environment to the Linux environment 
 we execute a script to remove the \cr characters. The script performs a simple
 
 tr -d '\r' < input > output
 
 Recently we were testing with files that contained a string with a single 
 quote – “Paym’t”
 What the tr command is doing is not only removing the “\cr” characters but 
 also the single quote. What we ended up with was “Paymt”

I'm guessing that you're using a unibyte locale
and that your tr command is using curly quotes rather than single quotes.
That in turn is passed by the shell to tr which will then delete such curly 
quotes?
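
To illustrate that guess, here is a sketch that assumes Windows-1252
data, where the curly quotes are the single bytes 0x91 and 0x92 (octal
221 and 222). The file names are arbitrary:

```shell
cd "$(mktemp -d)" || exit 1

# Assumed encoding: Windows-1252, so the curly quotes are bytes 0221 and 0222.
open=$(printf '\221')
close=$(printf '\222')

# A line containing a curly apostrophe, with a CRLF line ending.
printf 'Paym\222t\r\n' > input

# What a script effectively runs if its quotes were pasted as curly quotes:
# the shell passes them through literally, so tr deletes them as well as \r.
LC_ALL=C tr -d "${open}\r${close}" < input > output

# With real single quotes only the carriage return is removed.
LC_ALL=C tr -d '\r' < input > fixed

wc -c input output fixed
```

The first tr call drops two bytes (the apostrophe and the carriage
return), the second drops only the carriage return.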

Pádraig.





bug#20450: coreutils cannot built with clang

2015-04-28 Thread Pádraig Brady
On 28/04/15 20:38, Yunlian Jiang wrote:
 Hi, 
 
When I try to use clang to build coreutils, I got some thing like
 
 src/coreutils.c:81:3:   AR   src/libsinglebin_printenv.a
 error: embedding a #include directive within macro arguments is not supported
 # include "coreutils.h"
   ^
 
 And I have the following ugly patch to make it work.

I can't reproduce with clang 3.5 on Fedora 22 here,
but yes this will be an issue anywhere printf is a macro.
I'll apply something like this fix in your name.

thanks!
Pádraig.






bug#20437: ls links too many dynamic libraries

2015-04-27 Thread Pádraig Brady
tag 20437 notabug
close 20437
stop

On 27/04/15 06:30, Paul Eggert wrote:
 Currently GNU 'ls' dynamically links a whole bunch of libraries, libraries 
 like 
 libpcre and liblzma.  Can we figure out some way to remove the runtime 
 dependencies on these libraries?  It's better if a core utility like 'ls' 
 avoids 
 libthis and libthat unless the libraries are vital to its function, which 
 these 
 shouldn't be.
 
 I installed the attached patches to get rid of one unnecessary library, 
 libacl, 
 on GNU/Linux.  Can we do better and get rid of more dependencies?  Perhaps 
 using 
 techniques similar to what was used to get rid of libacl?

As was discussed recently¹ with removing the libmount dependency for df,
these dependencies are coming from libselinux, and one of the primary
authors of libselinux was made aware of that issue, so I'm closing
this here.

coreutils linking with libselinux are:

 chcon cp dir install id ls mkdir mkfifo mknod mv runcon stat vdir

BTW I noticed ldd -v doesn't give complete info,
and that `readelf -d $(which ls) | grep NEEDED` is better.
This is wrapped in the lddot² visualizer, and I've attached
the output from `lddot git/coreutils/src/ls | graph-easy --as png`

cheers,
Pádraig.

¹ http://lists.gnu.org/archive/html/coreutils/2015-04/msg00011.html
² http://jwilk.net/software/lddot


bug#20437: ls links too many dynamic libraries

2015-04-27 Thread Pádraig Brady
On 27/04/15 14:12, Pádraig Brady wrote:
 tag 20437 notabug
 close 20437
 stop
 
 On 27/04/15 06:30, Paul Eggert wrote:
 Currently GNU 'ls' dynamically links a whole bunch of libraries, libraries 
 like 
 libpcre and liblzma.  Can we figure out some way to remove the runtime 
 dependencies on these libraries?  It's better if a core utility like 'ls' 
 avoids 
 libthis and libthat unless the libraries are vital to its function, which 
 these 
 shouldn't be.

 I installed the attached patches to get rid of one unnecessary library, 
 libacl, 
 on GNU/Linux.  Can we do better and get rid of more dependencies?  Perhaps 
 using 
 techniques similar to what was used to get rid of libacl?
 
 As was discussed recently¹ with removing the libmount dependency for df,
 these dependencies are coming from libselinux, and one of the primary
 authors of libselinux was made aware of that issue, so I'm closing
 this here.
 
 coreutils linking with libselinux are:
 
  chcon cp dir install id ls mkdir mkfifo mknod mv runcon stat vdir
 
 BTW I noticed ldd -v doesn't give complete info,
 and that `readelf -d $(which ls) | grep NEEDED` is better.
 This is wrapped in the lddot² visualizer, and I've attached
 the output from `lddot git/coreutils/src/ls | graph-easy --as png`
 
 cheers,
 Pádraig.
 
 ¹ http://lists.gnu.org/archive/html/coreutils/2015-04/msg00011.html
 ² http://jwilk.net/software/lddot

BTW I had a look at whether we could duplicate some getfilecon() logic
internally, but it doesn't seem practical due to mcstransd which
is a daemon that's used to translate the MCS / MLS internal policy
levels into user friendly labels.

I.E. we could duplicate the logic, and remove the lzma and pcre dependencies,
but it seems like too much to be duplicating. Further reductions in deps
might be possible by splitting up libselinux, allowing apps like ls
that just read contexts, to link with the appropriate lib.

cheers,
Pádraig.





bug#20442: bug+patch: du output misaligned on different terminals

2015-04-27 Thread Pádraig Brady
tag 20442 wontfix
close 20442
stop

On 27/04/15 20:11, L. A. Walsh wrote:
 
 
 
 This is a fix/work-around for (RFE#19849 (bug#19849) which was 
 about adding options to expand tabs and/or set a tabsize
 for output from 'du' so output would line up as intended.
 
 Without that enhancement, the current output is messed
 up on terminals/consoles that don't use hard-coded-constant
 widths for tabs (like many or most of the Xterm & linux
 consoles).
 
 Adding the switches is more work than I want to chew
 off right now, but the misaligned output made for difficult
 reading (besides looking bad), especially w/a monospace font
 where it is clear that the columns were meant to lineup.
 So I threw together a quick patch against the current 
 git source (changes limited to 'du.c').
 
 If someone would look it over, try it, or such and apply it
 to the current coreutils source tree (it's in patch form
 against 'src/du.c') for some soon future release, (at least
 until such time as the above mentioned RFE can be addressed).
 
 123456789 123456789 123456789 123456789 123456789 123456789 123456789 
 123456789 
 The current du output (example from my tmp dir) on a
 term w/o hard-coded-constant expansion looks like:
 
 Ishtar:tools/coreutils/work/src> /usr/bin/du /tmp/t*
 4 /tmp/t
 1160  /tmp/t1
 680 /tmp/t2
 4 /tmp/tab2.patch
 20  /tmp/tabs
 4 /tmp/tmpf
 4 /tmp/topcmds
 24  /tmp/topcmds-hlps
 24  /tmp/topcmds2
 8 /tmp/topcmds2.txt
 4 /tmp/tq1
 32  /tmp/tt
 32  /tmp/tt

In fairness, this is with the unusual case after running `tabs 2`

 *Without* the assumption of hard-coded or fixed tabs (using 
 a 8-spaces/tab as seems to be the implementors assumption /
 intention), the output columns, again, line-up vertically:
 
 Ishtar:tools/coreutils/work/src> ./du /tmp/t*
 4       /tmp/t
 1160    /tmp/t1
 680     /tmp/t2
 4       /tmp/tab2.patch
 20      /tmp/tabs
 4       /tmp/tmpf
 4       /tmp/topcmds
 24      /tmp/topcmds-hlps
 24      /tmp/topcmds2
 8       /tmp/topcmds2.txt
 4       /tmp/tq1
 32      /tmp/tt
 
 
 While not addressing the RFE, at least the original output format
 should look the same on all terminals

Thanks for the patch, however the same could be achieved
more generally with external tools. For example numbers are
better for human consumption when right aligned, so you
could achieve both with:

  du | numfmt --format %10f

cheers,
Pádraig.





bug#20442: bug+patch: du output misaligned on different terminals

2015-04-27 Thread Pádraig Brady
On 28/04/15 01:13, Linda Walsh wrote:
 reopen 20442
 thanks
 ===
 
 Your more general case doesn't work:
 
 du -sh /tmp/t*|numfmt --format %10f
 numfmt: rejecting suffix in input: ‘4.0K’ (consider using --from)
 du -sh --time /tmp/t*|numfmt --format %10f
 numfmt: rejecting suffix in input: ‘4.0K’ (consider using --from)

You can use:

  du -h --time | numfmt --from=iec --to=iec --format %10f

Though more naturally you can use:

  du -B1 --time | numfmt --to=iec --format %10f

Since these are formatting for human consumption,
there are varying preferences etc, and so
(variants of) the above are appropriate for aliases,
or shell functions if accepting parameters.
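
As a sketch of such shell functions (the names fmt_sizes and duh are
made up here, and the numfmt options are simply the ones shown above):

```shell
# fmt_sizes: right-align the leading size column and convert it to IEC units.
fmt_sizes() {
  numfmt --to=iec --format %10f
}

# duh: hypothetical wrapper; passes its arguments straight through to du.
duh() {
  du -B1 "$@" | fmt_sizes
}

# Feed fmt_sizes one du-style line to see the formatting on its own.
printf '1024\tsome/dir\n' | fmt_sizes
```

Something like `duh --time /tmp/t*` would then accept the usual du
options while keeping the size column aligned.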

 I usually use other arguments with 'du'.  Your external tool solution
 doesn't handle the general case of du's output.
 
 The point was to correct 'du's output, not find a *custom* solution
 to correct assumptions made by 'du'.
 
 Why would you reject something that fixes this problem?

There are backwards compatibility issues to consider.

cheers,
Pádraig.





bug#20438: coreutils 8.23 on OS X : Calling gcp -al fails on symbolic links

2015-04-27 Thread Pádraig Brady
On 27/04/15 12:34, Thomas Baigneres wrote:
 Hello,

 Consider the following directory hierarchy:

   $ ls -lR
   total 0
   drwxr-xr-x  4 user  group  136 Apr 24 13:51 source

   ./source:
   total 16
   -rw-r--r--  1 user  group  0 Apr 24 13:50 file.txt
   lrwxr-xr-x  1 user  group  8 Apr 24 13:51 symbolic-link-to-file -> file.txt


 Trying to copy the `source` directory with the options -a (archive) and -l
 (hard link files instead of copying) fails when using the 8.23 version
 of coreutils:

   $ gcp -al source destination
   gcp: cannot create hard link 'destination/symbolic-link-to-file' to
   'source/symbolic-link-to-file': Operation not supported

 Trying to do the same with the 8.22 version of coreutils works as
 expected :

   $ gcp -al source destination
   $ ls -lR
   total 0
   drwxr-xr-x  4 user  group  136 Apr 24 13:51 destination
   drwxr-xr-x  4 user  group  136 Apr 24 13:51 source

   ./destination:
   total 16
   -rw-r--r--  2 user  group  0 Apr 24 13:50 file.txt
   lrwxr-xr-x  1 user  group  8 Apr 24 13:51 symbolic-link-to-file -> file.txt

   ./source:
   total 16
   -rw-r--r--  2 user  group  0 Apr 24 13:50 file.txt
   lrwxr-xr-x  1 user  group  8 Apr 24 13:51 symbolic-link-to-file -> file.txt


 Searching the internet suggests that the culprit might be the `linkat()`
 function, which exists but does not work as expected by coreutils.


Yes coreutils 8.23 uses linkat() for that where available:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=v8.22-23-g9654b67

BTW this issue on OS X 10.10 is caught by gnulib and coreutils tests.
I.E. `make check` may be being ignored on your build?

There was initial work to fix this at:
http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=commit;h=c8e57ce5

That doesn't fix everything though I think,
and I'll need to look into a complete fix for Mac OS X >= 10.10
(or revert to the previous coreutils behavior).
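
For anyone wanting to check a given build, the reported layout can be
recreated with a sketch like the following. It assumes the GNU cp under
test is installed as plain `cp` (on OS X it would typically be `gcp`):

```shell
cd "$(mktemp -d)" || exit 1
mkdir source
touch source/file.txt
ln -s file.txt source/symbolic-link-to-file

# -a keeps the symlink as a symlink; -l asks for hard links instead of copies.
if cp -al source destination; then
  ls -lR destination
else
  echo "cp -al failed (the behaviour reported for 8.23 on OS X)" >&2
fi
```

On a working build the regular file ends up hard linked and the
symbolic link is still a symbolic link in the destination.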

cheers,
Pádraig.





bug#20310: [COREUTILS 2/2] ls: Don't treat lack of acl support as an error

2015-04-20 Thread Pádraig Brady
On 12/04/15 15:37, Andreas Gruenbacher wrote:
 * src/ls.c (file_has_acl_cache): When a file system doesn't support
 acls, fail with errno set to ENOTSUP.
 (gobble_file): Don't treat lack of acl support as an error.
 ---
  src/ls.c | 11 ++++-------
  1 file changed, 4 insertions(+), 7 deletions(-)
 
 diff --git a/src/ls.c b/src/ls.c
 index b308dd3..884e042 100644
 --- a/src/ls.c
 +++ b/src/ls.c
 @@ -2866,7 +2866,7 @@ getfilecon_cache (char const *file, struct fileinfo *f, 
 bool deref)
  
  /* Cache file_has_acl failure, when it's trivial to do.
 Like file_has_acl, but when F's st_dev says it's on a file
 -   system lacking ACL support, return 0 with ENOTSUP immediately.  */
 +   system lacking ACL support, fail with ENOTSUP immediately.  */
  static int
  file_has_acl_cache (char const *file, struct fileinfo *f)
  {
 @@ -2877,14 +2877,11 @@ file_has_acl_cache (char const *file, struct fileinfo *f)
    if (f->stat.st_dev == unsupported_device)
      {
        errno = ENOTSUP;
 -      return 0;
 +      return -1;
      }
  
 -  /* Zero errno so that we can distinguish between two 0-returning cases:
 -     "has-ACL-support, but only a default ACL" and "no ACL support". */
 -  errno = 0;
    int n = file_has_acl (file, &f->stat);
 -  if (n <= 0 && errno_unsupported (errno))
 +  if (n < 0 && errno_unsupported (errno))
      unsupported_device = f->stat.st_dev;
    return n;
  }
 @@ -3076,7 +3073,7 @@ gobble_file (char const *name, enum filetype type, ino_t inode,
    if (err == 0 && format == long_format)
      {
        int n = file_has_acl_cache (absolute_name, f);
 -      err = (n < 0);
 +      err = (n < 0 && ! errno_unsupported (errno));
        have_acl = (0 < n);
      }

I dislike this change actually.
Or more accurately, the gnulib change that changed the file_has_acl()
interface, requiring this change.

Previously in gnulib we mapped ENOTSUP to return 0 using:
http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/acl-errno-valid.c

Since we've now changed file_has_acl() to return -1 in this case,
all gnulib users may now be printing erroneous errors etc.

Is there any reason not to use the same gnulib acl_errno_valid() logic
in the newly added avoiding libacl path?

thanks,
Pádraig.





bug#20366: Strange behaviour of df (GNU coreutils) 8.23

2015-04-19 Thread Pádraig Brady
On 18/04/15 11:44, Benjamin Beier wrote:
 Hello,
 
 I am running multiple Gentoo/Funtoo servers and yesterday I noticed the 
 output of 'df -a' on the Funtoo systems does not show any information 
 about the root partition, although this works fine on all my Gentoo 
 systems. I compared the installed versions and after a little 
 investigation with upgrades and downgrades on both systems i found out, 
 that the problem is caused by coreutils version 8.23 (Funtoo default), 
 while coreutils version 8.21 (Gentoo default) works fine.
 
 df -a on 8.21:
 rootfs   3997376 1947348   1823932  52% /
 /dev/root3997376 1947348   1823932  52% /
 (...)
 
 df -a on 8.23:
 rootfs -   - -- /
 /dev/root  -   - -- /
 (...)

  /var/lib/ntp   -   - -- 
/chroot/ntpd/var/lib/ntp
  /var/log/ntp 3997376 1947140   1824140  52% 
/chroot/ntpd/var/log/ntp

 I also straced df with both versions and interestingly both versions 
 seem to collect valid usage information about the root partition. While 
 8.21 is showing this information in the output, 8.23 just decides to 
 print dashes for some reason?
 
 I attached the strace and full output of both versions.

With the current assumptions in df, it's awkward for the 'Filesystem',
i.e. the first column, to keep changing, when in fact the corresponding
device ID returned by stat() stays the same.

Does /etc/mtab symlink to /proc/mounts on your system?
I suspect it doesn't and thus bind mounts are represented as above.
Does the issue go away if you temporarily replace /etc/mtab
with a symlink to /proc/mounts?

In the next release of df (after coreutils next syncs with gnulib),
/proc/self/mountinfo will be queried instead, which will probably
avoid this issue.
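
A quick way to answer the /etc/mtab question, as a sketch (the
mtab_kind name is just for illustration):

```shell
# mtab_kind [PATH]: report whether PATH (default /etc/mtab) is a symlink.
mtab_kind() {
  p=${1:-/etc/mtab}
  if [ -L "$p" ]; then
    printf 'symlink to %s\n' "$(readlink "$p")"
  elif [ -e "$p" ]; then
    printf 'not a symlink\n'
  else
    printf 'missing\n'
  fi
}

mtab_kind
```

If this reports "not a symlink", the temporary test suggested above
would be to replace the file with a symlink to /proc/mounts.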

thanks,
Pádraig.





bug#20354: [feature request] ln with command line arguments in reverse order

2015-04-18 Thread Pádraig Brady
tag 20354 wontfix
close 20354
stop

On 18/04/15 07:09, Bernhard Voelker wrote:
 On 04/17/2015 04:52 PM, Erik Auerswald wrote:
 On Fri, Apr 17, 2015 at 01:45:02PM +0100, Pádraig Brady wrote:
 How I think about it is:

   cp [OPTION]  EXISTING NEW
   mv [OPTION]  EXISTING NEW
   ln [OPTIONS] EXISTING NEW

 That's good wording.
 
 IMO there's no gain if the operand names are the same, because
 then the users would have to know the tool even better.  Such
 distinction makes the users help to remember how the tool works.
 So at least for ln(1), the word LINK_NAME is perfect.
 
 FWIW this was Jim's change to improve the wording back in 1998:
 
 http://git.sv.gnu.org/cgit/coreutils.git/commit/?id=519365bb089c

I agree. NEW above is ambiguous for example,
as DEST can already exist. Also EXISTING in the ln
case is not accurate, since the target doesn't
need to exist.

I was just indicating how I summarise the usual
use case for these in my mind, but can't think of
any improvement to the more accurate existing wording.
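
The point about the target not needing to exist can be seen with a
short sketch (file names are arbitrary):

```shell
cd "$(mktemp -d)" || exit 1

# TARGET comes first, LINK_NAME second; TARGET need not exist for -s.
ln -s not-created-yet link
[ -L link ] && echo "link exists"
[ -e link ] || echo "target does not exist (dangling symlink)"

# cp, by contrast, needs its first operand to exist.
cp not-created-yet copy 2>/dev/null || echo "cp refused: source missing"
```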

cheers,
Pádraig.






bug#20354: [feature request] ln with command line arguments in reverse order

2015-04-17 Thread Pádraig Brady
On 17/04/15 12:45, Erik Auerswald wrote:
 On Fri, Apr 17, 2015 at 01:12:01PM +0200, Bernhard Voelker wrote:
 On 04/17/2015 10:39 AM, Ma Jiehong wrote:
 Currently, 'cp', 'mv' and 'ln' share the same basic syntax, that is to say 
 the following:

 cp [OPTION]  SOURCE DEST
 mv [OPTION] SOURCE DEST
 ln [OPTIONS] TARGET LINK_NAME

 Which is the same exact rule, and is consistent.
 [...]
 In this case, the command would act like this:
 ln --reverse-order LINK_NAME TARGET

 Adding an option to reverse the two may have its merits, but I guess this
 extra flexibility would only confuse the users even more.
 
 If you do not know the original order beforehand, you do not know the
 --reverse-order either. IMHO this option does not help.
 
 The situation would be better if the target would be an operand to that
 option, similar to mv's --target-directory=DIRECTORY option.
 
 Careful here, --target-directory specifies a DESTination, while ln's TARGET
 means SOURCE.
 
 However, I think this would just bloat the code for not much new 
 functionality,
 and I'm convinced that a good translation for TARGET and LINK_NAME in --help
 output would be the better way.
 
 I'd say that using TARGET instead of SOURCE creates confusion that would be
 avoided by using SOURCE and DEST as with cp and mv.

Not really, as one could still consider that
DEST was the destination of a symlink.

How I think about it is:

  cp [OPTION]  EXISTING NEW
  mv [OPTION]  EXISTING NEW
  ln [OPTIONS] EXISTING NEW

cheers,
Pádraig.





bug#18748: cp doesn't behaves as mkdir and touch when a default acl exists

2015-04-13 Thread Pádraig Brady
On 13/04/15 00:10, Andreas Grünbacher wrote:
 When a file is copied with cp, the default is to create the new file
 in the target directory, with the file mode of the original file as
 the create mode. This default can be overridden with cp's -p or
 --preserve=mode options.
 
 This has the following effect:
 
  * In the absence of a default acl, the new file will have the original
file's permission bits minus the umask.
 
  * In the presence of a default acl, the default acl replaces the umask.
The new file will inherit the default acl, which results in an imaginary
file mode. The actual file mode is set to the intersection of the original
file's permission bits and this imaginary file mode.
 
 This is not a bug, it is the expected and documented behavior; see the
 POSIX standard, POSIX 1003.1e/2c draft standard [*], and also the
 coreutils info pages (info coreutils 'cp invocation') which could be
 improved.

Your improvement pull request is now pushed at
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=88e4910

thanks!
Pádraig.






bug#19760: [bug] tail -f with inotify fails to follow a file after a rename()

2015-03-31 Thread Pádraig Brady
On 31/03/15 07:30, Bernhard Voelker wrote:
 On 03/31/2015 05:15 AM, Pádraig Brady wrote:
 +  tail -f continues to follow changes to a file even after it's renamed.
 +  [bug introduced in coreutils-7.5]
 +
 
 It is not 100% clear to me by this sentence what was the actual change;
 maybe a little "again" or "now" would help?
 
 --- /dev/null
 +++ b/tests/tail-2/f-vs-rename.sh
 @@ -0,0 +1,51 @@
 +#!/bin/sh
 +# demonstrate that tail -f works when renaming the tailed files
 
 s/^d/D/; s/$/./
 
 +# Before coreutils-8.24, tail -f a would stop tracking additions to b
 +# after mv a b.
 +
 +# Copyright (C) 2015 Free Software Foundation, Inc.
 +
 +# This program is free software: you can redistribute it and/or modify
 +# it under the terms of the GNU General Public License as published by
 +# the Free Software Foundation, either version 3 of the License, or
 +# (at your option) any later version.
 +
 +# This program is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +# GNU General Public License for more details.
 +
 +# You should have received a copy of the GNU General Public License
 +# along with this program.  If not, see http://www.gnu.org/licenses/.
 +
 +. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
 +print_ver_ tail
 +
 +touch a || framework_failure_
 +
 +debug='---disable-inotify'
 +debug=
 +tail $debug -f -s.1 a > out 2>&1 & pid=$!
 
 Shouldn't $debug be removed?  Otherwise maybe a loop over both the
 inotify and the non-inotify mode would make sense?
 
 +
 +check_tail_output()
 +{
 +  local delay=$1
 +  grep $tail_re out > /dev/null ||
 +{ sleep $delay; return 1; }
 +}
 
 Please don't discard grep's output: reading the test's log file is
 easier with this included.
 
 +
 +# Wait up to 12.7s for tail to start
 
 s/$/./
 
 +echo x > a
 +tail_re='^x$' retry_delay_ check_tail_output .1 7 || fail=1
 +
 +mv a b || fail=1
 +
 +echo y > b
 +# Wait up to 12.7s for y to appear in the output:
 +tail_re='^y$' retry_delay_ check_tail_output .1 7 || fail=1
 +
 +kill $pid
 +
 +wait
 +
 +Exit $fail
 
 Otherwise +1 (including the changes in tail.c).

All good suggestions.
Latest attached.

thanks for the review!
Pádraig.

From d313a0b24234d3366ec263111469f219f5b4634f Mon Sep 17 00:00:00 2001
From: Stephane Chazelas stephane.chaze...@gmail.com
Date: Tue, 3 Feb 2015 21:22:06 +
Subject: [PATCH] tail: fix -f to follow changes after a rename

* src/tail.c (tail_forever_inotify): Only monitor write()s and
truncate()s to files in --follow=descriptor mode, thus avoiding
the bug where we removed the watch on renamed files.
Also adjust the inotify event processing code that is
now significant only in --follow=name mode.
* tests/tail-2/F-vs-rename.sh: Improve this existing test by running
in both polling and inotify modes.
* tests/tail-2/f-vs-rename.sh: A new test based on the existing one.
* tests/local.mk: Reference the new test.
* NEWS: Mention the bug.
Fixes http://bugs.gnu.org/19760
---
 NEWS                        |  3 ++
 src/tail.c                  | 31 +++
 tests/local.mk              |  1 +
 tests/tail-2/F-vs-rename.sh | 94 +++--
 tests/tail-2/f-vs-rename.sh | 51
 create mode 100755 tests/tail-2/f-vs-rename.sh

diff --git a/NEWS b/NEWS
index 81031c6..4b12e46 100644
--- a/NEWS
+++ b/NEWS
@@ -39,6 +39,9 @@ GNU coreutils NEWS-*- outline -*-
   resources with many files, or with -F if files were replaced many times.
   [bug introduced in coreutils-7.5]
 
+  tail -f again follows changes to a file after it's renamed.
+  [bug introduced in coreutils-7.5]
+
 ** New features
 
   chroot accepts the new --skip-chdir option to not change the working directory
diff --git a/src/tail.c b/src/tail.c
index c5380cb..f75d7a9 100644
--- a/src/tail.c
+++ b/src/tail.c
@@ -159,13 +159,6 @@ struct File_spec
   uintmax_t n_unchanged_stats;
 };
 
-#if HAVE_INOTIFY
-/* The events mask used with inotify on files.  This mask is not used on
-   directories.  */
-static const uint32_t inotify_wd_mask = (IN_MODIFY | IN_ATTRIB
- | IN_DELETE_SELF | IN_MOVE_SELF);
-#endif
-
 /* Keep trying to open a file even if it is inaccessible when tail starts
or if it becomes inaccessible later -- useful only with -f.  */
 static bool reopen_inaccessible_files;
@@ -1390,6 +1383,13 @@ tail_forever_inotify (int wd, struct File_spec *f, size_t n_files,
   if (! wd_to_name)
 xalloc_die ();
 
+  /* The events mask used with inotify on files (not directories).  */
+  uint32_t inotify_wd_mask = IN_MODIFY;
+  /* TODO: Perhaps monitor these events in Follow_descriptor mode also,
+ to tag reported file names with deleted, moved etc.  */
+  if (follow_mode == Follow_name)
+inotify_wd_mask |= (IN_ATTRIB | IN_DELETE_SELF

bug#19760: [bug] tail -f with inotify fails to follow a file after a rename()

2015-03-31 Thread Pádraig Brady
BTW given that -f was broken for so long (6 years)
it lends more weight to making -f behave like -F by default.

Note POSIX allows (and even implies) this,
and OpenBSD's -f behaves like -F, for example.

Not something appropriate for coreutils 8.x,
and I'd be 60:40 against changing in a later major release,
but it's worth mentioning.

cheers,
Pádraig.





bug#20238: Can these Errors be fixed? Thank you.

2015-03-31 Thread Pádraig Brady
On 31/03/15 21:29, Dake Zhang wrote:

 FAIL: tests/misc/ls-time
 ++ ls -ut a b c
 + set b c a
 + test 'b c a' = 'c b a'
 + fail=1
 + test 1 = 1
 + ls -l --full-time --time=access a b c
 -rw-r--r-- 1 dakez BCM\Domain Users 0 1998-01-14 11:00:00.0 + a
 -rw-r--r-- 1 dakez BCM\Domain Users 0 2015-03-31 20:18:25.0 + b
 -rw-r--r-- 2 dakez BCM\Domain Users 0 2015-03-31 20:18:25.0 + c

Strange. The access time for b and c should have been set to 1998 like
it was for a.  Now we only verify that a's access time was set,
so we could be more robust and test all 3.  Do you know if there
might be any scanners etc. on your system accessing new files?
What file system is this?

 FAIL: tests/cp/link-symlink
 + touch file
 + ln -s file link
 + touch -m -h -d 2011-01-01 link
 + case $(stat --format=%y link) in
 ++ stat --format=%y link
 + cp -al link link.cp
 + case $(stat --format=%y link.cp) in
 ++ stat --format=%y link.cp
 + fail=1

The above is again a timestamp issue.
Now there are extra gating checks in tests/cp/link-deref.sh
which might be appropriate to add to this test,
though perhaps again there is something else interfering with timestamps?

What operating system/version is this exactly?

thanks,
Pádraig.





bug#19760: [bug] tail -f with inotify fails to follow a file after a rename()

2015-03-30 Thread Pádraig Brady
On 03/02/15 23:30, Pádraig Brady wrote:
 On 03/02/15 22:04, Stephane Chazelas wrote:
 Hello,

 On Linux, when inotify is used,

tail -f file

 follows a file only until it's renamed. After it is renamed, the
 inotify watch is removed, which means tail sits there doing
 nothing and any further modifications to the file are ignored.

 To reproduce:

 echo 1 > file
 tail -f file &
 exec 3>> file
 echo 2 >&3
 sleep 1
 mv file file2
 sleep 1
 echo 3 >&3
 sleep 1
 : > file2

 3 is not displayed. No message about the file being truncated
 either.

 Work arounds:

tail ---disable-inotify -f file
   tail -f < file # effectively disables inotify

   or rename the file with a link() followed by an unlink()
   ln file newfile && rm -f file

 Note that the IN_DELETE_SELF event is not reached in
 follow-descriptor mode because tail has the file open preventing
 it from being deleted even after it's unlinked from the last
 directory.

 Patch attached (on the current git head).
 
 Ouch. The patch makes sense on first glance,
 and all existing tests pass with it.
 I'll check some more and add a test.

Sorry for the delay.
I'll apply the attached in your name soon.
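
For reference, a self-contained sketch of the reproduction that records
tail's output in a file. The scratch directory, the `out` file name and
the sleep lengths are arbitrary choices, not taken from the report:

```shell
cd "$(mktemp -d)" || exit 1
echo 1 > file

# Follow by descriptor, capturing whatever tail prints.
tail -f file > out 2>&1 &
pid=$!

# Keep a writable descriptor open so writes still reach the file after mv.
exec 3>> file
sleep 1
echo 2 >&3
sleep 1
mv file file2
sleep 1
echo 3 >&3          # written through fd 3 after the rename
sleep 2

kill "$pid"
wait "$pid" 2>/dev/null || true
exec 3>&-
cat out
```

On an affected tail the line written after the rename is missing from
`out`; with the fix applied it should be there.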

thanks,
Pádraig.

From 9c23049e17a76f4ec8f38c04b088f149a49b4851 Mon Sep 17 00:00:00 2001
From: Stephane Chazelas stephane.chaze...@gmail.com
Date: Tue, 3 Feb 2015 21:22:06 +
Subject: [PATCH] tail: fix -f to follow changes after a rename

* src/tail.c (tail_forever_inotify): Only monitor write()s and
truncate()s to files in --follow=descriptor mode, thus avoiding
the bug where we removed the watch on renamed files.
Also adjust the inotify event processing code that is
now significant only in --follow=name mode.
* tests/tail-2/f-vs-rename.sh: A new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the bug.
Fixes http://bugs.gnu.org/19760
---
 NEWS                        |  3 +++
 src/tail.c                  | 31 +--
 tests/local.mk              |  1 +
 tests/tail-2/f-vs-rename.sh | 51 +
 4 files changed, 69 insertions(+), 17 deletions(-)
 create mode 100755 tests/tail-2/f-vs-rename.sh

diff --git a/NEWS b/NEWS
index 81031c6..214db08 100644
--- a/NEWS
+++ b/NEWS
@@ -39,6 +39,9 @@ GNU coreutils NEWS-*- outline -*-
   resources with many files, or with -F if files were replaced many times.
   [bug introduced in coreutils-7.5]
 
+  tail -f continues to follow changes to a file even after it's renamed.
+  [bug introduced in coreutils-7.5]
+
 ** New features
 
   chroot accepts the new --skip-chdir option to not change the working directory
diff --git a/src/tail.c b/src/tail.c
index c5380cb..f75d7a9 100644
--- a/src/tail.c
+++ b/src/tail.c
@@ -159,13 +159,6 @@ struct File_spec
   uintmax_t n_unchanged_stats;
 };
 
-#if HAVE_INOTIFY
-/* The events mask used with inotify on files.  This mask is not used on
-   directories.  */
-static const uint32_t inotify_wd_mask = (IN_MODIFY | IN_ATTRIB
- | IN_DELETE_SELF | IN_MOVE_SELF);
-#endif
-
 /* Keep trying to open a file even if it is inaccessible when tail starts
or if it becomes inaccessible later -- useful only with -f.  */
 static bool reopen_inaccessible_files;
@@ -1390,6 +1383,13 @@ tail_forever_inotify (int wd, struct File_spec *f, size_t n_files,
   if (! wd_to_name)
 xalloc_die ();
 
+  /* The events mask used with inotify on files (not directories).  */
+  uint32_t inotify_wd_mask = IN_MODIFY;
+  /* TODO: Perhaps monitor these events in Follow_descriptor mode also,
+ to tag reported file names with deleted, moved etc.  */
+  if (follow_mode == Follow_name)
+inotify_wd_mask |= (IN_ATTRIB | IN_DELETE_SELF | IN_MOVE_SELF);
+
   /* Add an inotify watch for each watched file.  If -F is specified then watch
  its parent directory too, in this way when they re-appear we can add them
  again to the watch list.  */
@@ -1641,20 +1641,17 @@ tail_forever_inotify (int wd, struct File_spec *f, size_t n_files,
 
   if (ev->mask & (IN_ATTRIB | IN_DELETE_SELF | IN_MOVE_SELF))
 {
-  /* For IN_DELETE_SELF, we always want to remove the watch.
- However, for IN_MOVE_SELF (the file we're watching has
- been clobbered via a rename), when tailing by NAME, we
- must continue to watch the file.  It's only when following
- by file descriptor that we must remove the watch.  */
-  if ((ev->mask & IN_DELETE_SELF)
-  || ((ev->mask & IN_MOVE_SELF)
-  && follow_mode == Follow_descriptor))
+  /* Note for IN_MOVE_SELF (the file we're watching has
+ been clobbered via a rename) we leave the watch
+ in place since it may still be part of the set
+ of watched names.  */
+  if (ev->mask & IN_DELETE_SELF)
 {
   inotify_rm_watch (wd, fspec->wd

bug#20210: tests/df/skip-duplicates fails on Debian-kFreeBSD due to calling 'strstr(NULL, )'

2015-03-27 Thread Pádraig Brady
On 27/03/15 00:28, Assaf Gordon wrote:
 Hello,
 
 A somewhat exotic test failure:
 
 On Debian/kFreeBSD 'tests/df/skip-duplicates' fails with 'df' segfaulting 
 like so:
 
   ...
   ./tests/df/skip-duplicates.sh: line 113:  7741 Segmentation fault  
 LD_PRELOAD=./k.so df
   ...

 I'm not sure what is the correct, clean fix; attached are two options (one 
 fixes the test, one avoids the call in lib/mountlist.c).

Nice one. I'll apply the test fix to coreutils.

thanks!
Pádraig.






bug#20199: Enhancement request for date's -d option: different epochs

2015-03-25 Thread Pádraig Brady
On 25/03/15 12:26, Ulrich Windl wrote:
 Hi!
 
 I'm not subscribed to this list, and I hope this is the right place to report 
 an enhancement request as there seems to be no bugzilla for that.
 Anyway: When downloading the current leap seconds list for our NTP server I 
 realized that the dates there seem to be specified in seconds from 
 1900-01-01_00:00:00 on one hand, and on the other I realized that date's 
 option -d only allows UNIX epochs using the @ notation.
 
 Therefore I suggest allowing different starting epochs, possibly using a 
 syntax like date -d '1900-01-01+2287785599' to print the date and time of 
 2287785599 seconds past January 1st 1900. ("Like" means I suggest the 
 semantics, but am not really proposing a concrete syntax; possibly there are 
 smarter guys around than me)
 
 Also being able to decode hexadecimal NTP timestamps would be a nice feature: 
 NTP timestamps look like this:
 d8bd24ef.a8e2bb68 meaning Wed, Mar 25 2015 13:13:35.659..., so it's 32 bit 
 for the seconds and another 32 bit for the fractional seconds (see page 9 of 
 the PostScript or PDF version of RFC 1305: NTP timestamps are represented as 
 a 64-bit
 unsigned fixed-point number, in seconds relative to 0h on 1 January 1900. The 
 integer part is in the
 first 32 bits and the fraction part in the last 32 bits.)
 
 Maybe a tagged syntax like -d NTP:d8bd24ef.a8e2bb6 could be used...
 
 (For consistency other tags like UNIX: for the UNIX epoch and MS-WIN: for 
 Microsoft Windows could be used. Again smart guys probably know more 
 important epochs than I do)

Note offsets can be negative, so you could set the offset like:

  $ sec=1234567890
  $ date -d @$(($sec + $(date -d 1/1/1900 +%s)))
  Tue Feb 14 23:56:51 GMT 1939

cheers,
Pádraig.





bug#20199: Enhancement request for date's -d option: different epochs

2015-03-25 Thread Pádraig Brady
On 25/03/15 13:10, Pádraig Brady wrote:
 On 25/03/15 12:26, Ulrich Windl wrote:
 Hi!

 I'm not subscribed to this list, and I hope this is the right place to 
 report an enhancement request as there seems to be no bugzilla for that.
 Anyway: When downloading the current leap seconds list for our NTP server I 
 realized that the dates there seem to be specified in seconds from 
 1900-01-01_00:00:00 on one hand, and on the other I realized that date's 
 option -d only allows UNIX epochs using the @ notation.

 Therefore I suggest allowing different starting epochs, possibly using a 
 syntax like date -d '1900-01-01+2287785599' to print the date and time of 
 2287785599 seconds past January 1st 1900. ("Like" means I suggest the 
 semantics, but am not really proposing a concrete syntax; possibly there 
 are smarter guys around than me)

 Also being able to decode hexadecimal NTP timestamps would be a nice 
 feature: NTP timestamps look like this:
 d8bd24ef.a8e2bb68 meaning Wed, Mar 25 2015 13:13:35.659..., so it's 32 bit 
 for the seconds and another 32 bit for the fractional seconds (see page 9 of 
 the PostScript or PDF version of RFC 1305: NTP timestamps are represented 
 as a 64-bit
 unsigned fixed-point number, in seconds relative to 0h on 1 January 1900. 
 The integer part is in the
 first 32 bits and the fraction part in the last 32 bits.)

 Maybe a tagged syntax like -d NTP:d8bd24ef.a8e2bb6 could be used...

 (For consistency other tags like UNIX: for the UNIX epoch and MS-WIN: 
 for Microsoft Windows could be used. Again smart guys probably know more 
 important epochs than I do)
 
 Note offsets can be negative, so you could set the offset like:
 
   $ sec=1234567890
   $ date -d @$(($sec + $(date -d 1/1/1900 +%s)))
   Tue Feb 14 23:56:51 GMT 1939

Actually we can use date's relative date support for this:

  $ date -d '1900-01-01 +1234567890 seconds'
  Tue Feb 14 23:56:51 GMT 1939
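
The same kind of arithmetic also covers the hexadecimal NTP timestamps from the
original request. A minimal sketch, assuming a shell whose arithmetic accepts
0x constants and 64-bit values, and using the 2208988800 second offset between
1900-01-01 and 1970-01-01 UTC (70 years including 17 leap days). The variable
names are only illustrative, and the final date call assumes GNU date:

```shell
# Integer (seconds) part of the NTP timestamp d8bd24ef.a8e2bb68
ntp_hex=d8bd24ef
# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01), UTC
ntp_to_unix=2208988800
# Convert the hex seconds to decimal and rebase onto the Unix epoch
unix_secs=$((0x$ntp_hex - ntp_to_unix))
echo "$unix_secs"
# GNU date can then print it; -u avoids any local time zone offset
date -u -d "@$unix_secs" 2>/dev/null || true
```

For this value the result is 1427285615, which is 12:13:35 UTC on 2015-03-25,
consistent with the 13:13:35 quoted in the request if that was local time in a
UTC+1 zone. The fractional part after the dot is ignored here.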

cheers,
Pádraig.





bug#20203: add example to date man page

2015-03-25 Thread Pádraig Brady
On 25/03/15 23:58, 積丹尼 Dan Jacobson wrote:
 Please change
 
-I[TIMESPEC], --iso-8601[=TIMESPEC]
   output date/time in ISO 8601 format.  TIMESPEC='date'  for  date
   only  (the  default), 'hours', 'minutes', 'seconds', or 'ns' for
   date and time to the indicated precision.
 
 to
 -I[TIMESPEC], --iso-8601[=TIMESPEC]
   output date/time in ISO 8601 format.  TIMESPEC='date'  for  date
   only  (2015-03-26, the  default), 'hours', 'minutes', 
 'seconds', or 'ns' for
   date and time to the indicated precision.

Probably not worth doing on its own, but I noticed other
improvements that could be made to these option descriptions,
which the attached should fix up.

cheers,
Pádraig.

From a5dc4b8d697e7a50bec123113950ab3f5700705a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= p...@draigbrady.com
Date: Thu, 26 Mar 2015 00:47:28 +
Subject: [PATCH] doc: clarify the date standard output formats

* src/date.c (usage): Use FMT rather than TIMESPEC as the parameter,
since it's simpler to understand and can be better aligned.
Give an example for the --iso-8601 output format.
Adjust the example used for the 3 standard formats to
be unambiguous to day/mon ordering and use of leading zeros in the time.
Reorder the options descriptions slightly, so that the
3 standards options are together.
Fixes http://bugs.gnu.org/20203
---
 src/date.c | 38 ++
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/src/date.c b/src/date.c
index 65fd0fc..3df7522 100644
--- a/src/date.c
+++ b/src/date.c
@@ -132,26 +132,32 @@ Display the current time in the given FORMAT, or set the system date.\n\
   emit_mandatory_arg_note ();
 
   fputs (_(\
-  -d, --date=STRING display time described by STRING, not 'now'\n\
-  -f, --file=DATEFILE   like --date once for each line of DATEFILE\n\
-  -I[TIMESPEC], --iso-8601[=TIMESPEC]  output date/time in ISO 8601 format.\n\
-TIMESPEC='date' for date only (the default),\n\
-'hours', 'minutes', 'seconds', or 'ns' for date\n\
-and time to the indicated precision.\n\
+  -d, --date=STRING  display time described by STRING, not 'now'\n\
+  -f, --file=DATEFILE  like --date; once for each line of DATEFILE\n\
 ), stdout);
   fputs (_(\
-  -r, --reference=FILE  display the last modification time of FILE\n\
-  -R, --rfc-2822  output date and time in RFC 2822 format.\n\
-Example: Mon, 07 Aug 2006 12:34:56 -0600\n\
+  -I[FMT], --iso-8601[=FMT]  output date/time in ISO 8601 format.\n\
+ FMT='date' for date only (the default),\n\
+ 'hours', 'minutes', 'seconds', or 'ns' for date\n\
+ and time to the indicated precision.\n\
+ Example: 2006-08-14T02:34:56-0600\n\
 ), stdout);
   fputs (_(\
-  --rfc-3339=TIMESPEC   output date and time in RFC 3339 format.\n\
-TIMESPEC='date', 'seconds', or 'ns' for\n\
-date and time to the indicated precision.\n\
-Date and time components are separated by\n\
-a single space: 2006-08-07 12:34:56-06:00\n\
-  -s, --set=STRING  set time described by STRING\n\
-  -u, --utc, --universal  print or set Coordinated Universal Time (UTC)\n\
+  -R, --rfc-2822 output date and time in RFC 2822 format.\n\
+ Example: Mon, 14 Aug 2006 02:34:56 -0600\n\
+), stdout);
+  fputs (_(\
+  --rfc-3339=FMT output date/time in RFC 3339 format.\n\
+ FMT='date', 'seconds', or 'ns' for\n\
+ date and time to the indicated precision.\n\
+ Example: 2006-08-14 02:34:56-06:00\n\
+), stdout);
+  fputs (_(\
+  -r, --reference=FILE   display the last modification time of FILE\n\
+), stdout);
+  fputs (_(\
+  -s, --set=STRING   set time described by STRING\n\
+  -u, --utc, --universal print or set Coordinated Universal Time (UTC)\n\
 ), stdout);
   fputs (HELP_OPTION_DESCRIPTION, stdout);
   fputs (VERSION_OPTION_DESCRIPTION, stdout);
-- 
2.1.0

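A quick way to compare the three standard formats touched by this patch is to
print one fixed instant with each option. This is only a sketch: it assumes GNU
date, uses -u so the result does not depend on the local time zone, and reuses
the 2006-08-14 02:34:56 instant from the usage examples (taken as UTC here
rather than -0600). The variable names are illustrative.

```shell
# One fixed instant, printed in UTC with each standard-format option
when='2006-08-14 02:34:56'
iso_date=$(LC_ALL=C date -u -d "$when" -I)                  # ISO 8601, date only
rfc_2822=$(LC_ALL=C date -u -d "$when" -R)                  # RFC 2822
rfc_3339=$(LC_ALL=C date -u -d "$when" --rfc-3339=seconds)  # RFC 3339
printf '%s\n' "$iso_date" "$rfc_2822" "$rfc_3339"
```

The day and month in this instant cannot be confused with each other, and the
hour keeps its leading zero, which is the property the commit message asks of
the examples.
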


bug#20172: make each ls sort option say what is first on the man page

2015-03-23 Thread Pádraig Brady
On 23/03/15 00:37, 積丹尼 Dan Jacobson wrote:
 Please make
 $ man ls|grep -e -[St].*sort
-S sort by file size
-t sort by modification time, newest first
 say
 $ man ls|grep -e -[St].*sort
-S sort by file size, largest first
-t sort by modification time, newest first
 even though yes, documented elsewhere.
 There may be others too...

Makes sense, and is similar to:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=v8.12-31-g50ca38e

I've pushed this in your name:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=01fb984

thanks!
Pádraig.





bug#20166: LC_ALL=C

2015-03-22 Thread Pádraig Brady
tag 20166 notabug
close 20166
stop

On 22/03/15 15:52, Ralf Richter wrote:
 LC_ALL=C
 
 
 Dear guys,
 
 I use Debian 7.8.0. ls -g does not work. I have to write g out. ls 
 --group-directories-first works fine.
 
 Please be so kind and fix that.

-g is not the same as --g

If there is some other issue then we can reopen this.

thanks,
Pádraig.






bug#20120: wc output padding differs when - is in the file list

2015-03-19 Thread Pádraig Brady
On 18/03/15 17:54, Bernhard Voelker wrote:
 On 03/16/2015 06:42 AM, Eric Mrak wrote:
 It seems that whenever STDIN is involved the results padding
 reverts to the BSD-style 7/8 padding. When files are given
 as input (excluding STDIN) the padding reflects the width of
 the largest count.
 When files are given as input and one of these is -, the
 padding reverts again to the BSD 7/8 padding.
 
 Thanks for the report.
 This effect is there at least since the last bigger change in
 this area, the introduction of the function compute_number_width(),
 back in 2003.
 
 Furthermore, strange formatting also happens in other cases,
 e.g. for other non-regular files ...
 
$ wc /etc/hosts /dev/null
      41     124    1355 /etc/hosts
       0       0       0 /dev/null
      41     124    1355 total
 
 ... or where stat() returns a wrong value like for /proc files ...
 
$ wc /proc/cpuinfo x
52 256 1276 /proc/cpuinfo
1 0 1 x
53 256 1277 total
 
 ... or with the --files0-from=FILE option:
 
$ printf '%s\0' x /etc/hosts | wc --files0=-
1 0 1 x
41 124 1355 /etc/hosts
42 124 1356 total
 
 The number width is determined before reading the actual files.
 I'm asking myself if it would hurt to save the values for all files
 until all of them are read, and then do the calculation of the
 number width and the printing of all values.
 OTOH this would delay output until all files are read (besides
 the memory footprint).
 Any opinions on whether a proper output format warrants these disadvantages?

Changing to unbounded memory (albeit slowly increasing) is not worth it I think.

 Regarding the number width fallback of %7d: this is mentioned in
 the POSIX specification (in 'Rationale'), but I'm unsure if it's
 mandated/ recommended/deprecated behavior.
 http://pubs.opengroup.org/onlinepubs/9699919799/utilities/wc.html

The existing padding is only a nicety.
If I was going to change anything, I'd add field selection and left/right
alignment support to column(1).

cheers,
Pádraig.





bug#20130: GNU test behaviour

2015-03-18 Thread Pádraig Brady
tag 20130 notabug
close 20130
stop

On 17/03/15 22:27, Paul Eggert wrote:
 On 03/17/2015 02:23 PM, Robson Júnior wrote:
 `test -e` with no filename being passed to. It returns 0, although it 
 should be 1. 
 
 No, 'test -e' should exit with status 0, because '-e' is a nonempty 
 string.  In general, 'test X' exits with status 0 if and only if X is 
 nonempty.  POSIX requires this behavior; see:
 
 http://pubs.opengroup.org/onlinepubs/9699919799/utilities/test.html
 
 and search for "1 argument".

If the filename is in a shell variable,
this is another reason for quoting.
I.E. this will work reliably:

  file=blah
  test -e "$file" || echo missing
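
To see the difference, here is a small sketch with a hypothetical empty
variable. Unquoted, the variable expands to nothing, so test sees the single
argument -e and exits 0; the quoted form passes an empty file name and fails
as intended.

```shell
file=    # hypothetical: the variable ended up empty
# Unquoted: expands to `test -e`, a one-argument test on a nonempty string
if test -e $file; then unquoted=exists; else unquoted=missing; fi
# Quoted: expands to `test -e ""`, which checks an empty (nonexistent) path
if test -e "$file"; then quoted=exists; else quoted=missing; fi
echo "unquoted: $unquoted, quoted: $quoted"
```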

thanks,
Pádraig.






bug#20127: bug in stty

2015-03-18 Thread Pádraig Brady
tag 20127 notabug
close 20127
stop

On 17/03/15 19:33, Morris Keesan wrote:
 stty --version:
 stty (GNU coreutils) 8.13
 
 uname -a:
 Linux ice4 3.2.0-75-generic #110-Ubuntu SMP Tue Dec 16 19:11:55 UTC 
 2014 x86_64 x86_64 x86_64 GNU/Linux
 ==
 bug:
stty erase ''
 sets erase to undef.  This may be intentional, but seems like a bug, since
 this behavior is undocumented, and the stty man page says,
 special values ^- or undef used to disable special characters.
 
 I haven't bothered testing, but I assume that an empty argument gets treated
 as undef for all special characters.

Yes, that's the same for all special characters.
Also Solaris and FreeBSD have the same behavior.
However I'm not sure it should be documented,
as the NUL character is being taken literally in this case,
and it happens to match the _POSIX_VDISABLE character.
This may change in future, so I would stick to the
documented interfaces.

thanks,
Pádraig.





bug#20114: tr does not support multibyte characters in the first argument

2015-03-16 Thread Pádraig Brady
On 16/03/15 02:30, Bruno Haible wrote:
 POSIX [1] specifies that the recognition of characters in 'tr' depends on
 the environment variables LANG, etc.
 
 But trying to replace a multibyte character by another character does not
 work:
 
 $ echo $LANG
 de_DE.UTF-8
 $ enspace=`printf '\u2002'`
 $ echo -n X${enspace}Y | tr ${enspace} ' ' | od -t x1
 0000000 58 20 20 20 59
 0000005
 
 Expected output would be:
 $ echo -n X${enspace}Y | tr ${enspace} ' ' | od -t x1
 0000000 58 20 59
 0000003
 
 With 'sed' it works:
 
 $ echo -n X${enspace}Y | sed -e "s/${enspace}/ /g" | od -t x1
 0000000 58 20 59
 0000003
 
 Bruno
 
 [1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/tr.html

Yes you're right Bruno.
Multi-byte support in coreutils in general has languished,
but we hope to start improving support in the next major release (9?)
after the current imminent 8.24 stable release.

To that end I've put together a plan:
http://www.pixelbeat.org/docs/coreutils_i18n/
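
The three 0x20 bytes in the report follow from tr working on bytes: U+2002 is
encoded in UTF-8 as the three bytes e2 80 82, and each one is translated
separately. A sketch that makes this explicit with octal escapes, assuming GNU
tr (which pads the second set by repeating its last character); the variable
names are illustrative:

```shell
# X, EN SPACE (UTF-8 bytes e2 80 82), Y: count the input bytes
bytes_in=$(printf 'X\342\200\202Y' | wc -c | tr -d ' ')
# Byte-wise translation maps each of the three bytes to a space;
# the unquoted inner substitution only normalizes od's spacing
hex_out=$(echo $(printf 'X\342\200\202Y' | LC_ALL=C tr '\342\200\202' ' ' | od -An -t x1))
echo "input bytes: $bytes_in; output bytes: $hex_out"
```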

cheers,
Pádraig.





bug#20094: cp --dry-run

2015-03-12 Thread Pádraig Brady
On 12/03/15 02:27, 積丹尼 Dan Jacobson wrote:
 Proposal:
 add a new cp --dry-run
 Reason: we want to know what the verbose output of e.g.,
 $ cp -i -av /mnt/usb/thumb/ tmp/androidBackup/SDCard
 will be without actually running the command first.
 We are about to commit to a megabyte copy and we want to be sure where
 all those files are exactly going even after understanding all the
 documentation, and without needing to do partial wet runs etc. etc. etc. 
 etc.
 
 Also add a mv --dry-run.
 
 Please also document both would normally be used with -v.
 
 In fact cpio (mainly -p) needs a --dry-run too.
 
 Sorry if I have proposed this before, but it's an idea you just can't beat!

I never needed such an option myself.
It couldn't check the space or permissions of the dest,
so wouldn't be that useful. Perhaps cp --attributes-only
would help for your use case?
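
One way to try that suggestion is to replay the copy into a scratch directory,
which prints the verbose source and destination names while writing no file
data. This is a rough sketch, assuming a GNU cp that supports
--attributes-only; the paths are hypothetical stand-ins for the ones in the
report, and it is not a true dry run since empty files are still created.

```shell
# Hypothetical source tree standing in for /mnt/usb/thumb
work=$(mktemp -d)
mkdir -p "$work/thumb/photos"
echo data > "$work/thumb/photos/a.jpg"
# Preview: -v lists each source and destination pair; with
# --attributes-only the destination files are created but left empty
cp -av --attributes-only "$work/thumb" "$work/preview"
preview_size=$(wc -c < "$work/preview/photos/a.jpg" | tr -d ' ')
echo "preview copy holds $preview_size bytes"
rm -rf "$work"
```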

thanks,
Pádraig






bug#20091: mv command

2015-03-11 Thread Pádraig Brady
On 11/03/15 21:02, Rogers, Charles (MAN-Corporate-CON) wrote:
 Is it ever possible for the mv command (without using the -u option) to
 leave the file(s) in the source directory, while also copying to the
 destination directory?
 
 We were experiencing this under zsh and GNU/Linux
 2.6.32-358.18.1.el6.x86_64.
 
 Any comments appreciated!

This is mv --version 8.4, right?
How reproducible is this?
Could you send an strace of the smallest reproducer?

thanks,
Pádraig.






bug#20029: 'yes' surprisingly slow

2015-03-10 Thread Pádraig Brady
On 10/03/15 21:18, Ole Tange wrote:
 On Tue, Mar 10, 2015 at 1:31 AM, Giuseppe Scrivano gscriv...@gnu.org wrote:
 
 $ time src/yes `echo {1..2000}` | head -c 2000M | md5sum
 55c293324aa6ecce14f0bf30da5a4686  -

 real    0m7.994s
 user    0m11.093s
 sys     0m2.953s

 versus (with the patch):

 $ time src/yes `echo {1..2000}` | head -c 2000M | md5sum
 55c293324aa6ecce14f0bf30da5a4686  -

 real    0m3.534s
 user    0m4.164s
 sys     0m1.803s
 
 Are you sure you are not limited by md5sum?
 
 $ time yes "`echo {1..2000}`" | head -c 2000M >/dev/null
 
 real    0m0.660s
 user    0m0.180s
 sys     0m1.115s
 
 The solution should perform no worse than that.

Two separate cases. Yours above is a single large argument.
Giuseppe's example above, and your original problematic example
were without quotes, and so with many arguments passed to yes.
I've applied Giuseppe's patch with an augmented test.

cheers,
Pádraig.






bug#20029: 'yes' surprisingly slow

2015-03-09 Thread Pádraig Brady
On 07/03/15 12:10, Pádraig Brady wrote:
 On 07/03/15 11:49, Ole Tange wrote:
 These two commands give the same output:

 $ yes "`echo {1..1000}`" | head -c 2300M | md5sum
 a0241f2247e9a37db60e7def3e4f7038  -

 $ yes `echo {1..1000}` | head -c 2300M | md5sum
 a0241f2247e9a37db60e7def3e4f7038  -

 But the time to run is quite different:

 $ time yes "`echo {1..1000}`" | head -c 2300M >/dev/null

 real    0m0.897s
 user    0m0.384s
 sys     0m1.343s

 $ time yes `echo {1..1000}` | head -c 2300M >/dev/null

 real    0m11.352s
 user    0m10.571s
 sys     0m2.590s

 WTF?!

 I imagine 'yes' spends a lot of time collecting the 1000 args. But why
 does it do that more than once?
 
 The stdio interactions dominate here.
 The slow case has 1000 times more fputs_unlocked() calls.
 Yes we could build the line up once and output that.
 If doing that we could also build up a BUFSIZ of complete lines
 to output at a time, in which case you'd probably avoid stdio altogether.

The attached should make things more efficient here.

thanks,
Pádraig.

From 7959bbf19307705e98f08cfa32a9dcf67672590c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= p...@draigbrady.com
Date: Mon, 9 Mar 2015 19:27:32 +
Subject: [PATCH] yes: output data more efficiently

yes(1) may be used to generate repeating patterns of text
for test inputs etc., so adjust to be more efficient.

Profiling the case where yes(1) is outputting small items
through stdio (which was the default case), shows the overhead
of continuously processing small items in main() and in stdio:

$ yes >/dev/null & perf top -p $!
31.02%  yes   [.] main
27.36%  libc-2.20.so  [.] _IO_file_xsputn@@GLIBC_2.2.5
14.51%  libc-2.20.so  [.] fputs_unlocked
13.50%  libc-2.20.so  [.] strlen
10.66%  libc-2.20.so  [.] __GI___mempcpy
 1.98%  yes   [.] fputs_unlocked@plta

Sending more data per stdio call improves the situation,
but still, there is significant stdio overhead due to memory copies,
and the repeated string length checking:

$ yes "`echo {1..1000}`" >/dev/null & perf top -p $!
42.26%  libc-2.20.so  [.] __GI___mempcpy
17.38%  libc-2.20.so  [.] strlen
 5.21%  [kernel]  [k] __srcu_read_lock
 4.58%  [kernel]  [k] __srcu_read_unlock
 4.27%  libc-2.20.so  [.] _IO_file_xsputn@@GLIBC_2.2.5
 2.50%  libc-2.20.so  [.] __GI___libc_write
 2.45%  [kernel]  [k] system_call
 2.40%  [kernel]  [k] system_call_after_swapgs
 2.27%  [kernel]  [k] vfs_write
 2.09%  libc-2.20.so  [.] _IO_do_write@@GLIBC_2.2.5
 2.01%  [kernel]  [k] fsnotify
 1.95%  libc-2.20.so  [.] _IO_file_write@@GLIBC_2.2.5
 1.44%  yes   [.] main

We can avoid all stdio overhead by building up the buffer
_once_ and outputting that, and the profile below shows
the bottleneck moved to the kernel:

$ src/yes >/dev/null & perf top -p $!
15.42%  [kernel]  [k] __srcu_read_lock
12.98%  [kernel]  [k] __srcu_read_unlock
 9.41%  libc-2.20.so  [.] __GI___libc_write
 9.11%  [kernel]  [k] vfs_write
 8.35%  [kernel]  [k] fsnotify
 8.02%  [kernel]  [k] system_call
 5.84%  [kernel]  [k] system_call_after_swapgs
 4.54%  [kernel]  [k] __fget_light
 3.98%  [kernel]  [k] sys_write
 3.65%  [kernel]  [k] selinux_file_permission
 3.44%  [kernel]  [k] rw_verify_area
 2.94%  [kernel]  [k] __fsnotify_parent
 2.76%  [kernel]  [k] security_file_permission
 2.39%  yes   [.] main
 2.17%  [kernel]  [k] __fdget_pos
 2.13%  [kernel]  [k] sysret_check
 0.81%  [kernel]  [k] write_null
 0.36%  yes   [.] write@plt

Note this change also ensures that yes will only write complete lines
for lines softer than BUFSIZ.

* src/yes.c (main): Build up a BUFSIZ buffer of lines,
and output that, rather than having stdio process each item.
* tests/misc/yes.sh: Add a new test for various buffer sizes.
* tests/local.mk: Reference the new test.
Fixes http://bugs.gnu.org/20029
---
 src/yes.c | 43 +--
 tests/local.mk|  1 +
 tests/misc/yes.sh | 28 
 3 files changed, 70 insertions(+), 2 deletions(-)
 create mode 100755 tests/misc/yes.sh

diff --git a/src/yes.c b/src/yes.c
index b35b13f..91dea11 100644
--- a/src/yes.c
+++ b/src/yes.c
@@ -58,6 +58,10 @@ Repeatedly output a line with all specified STRING(s), or 'y'.\n\
 int
 main (int argc, char **argv)
 {
+  char buf[BUFSIZ];
+  char *pbuf = buf;
+  int i;
+
   initialize_main (argc, argv);
   set_program_name (argv[0]);
   setlocale (LC_ALL, "");
@@ -77,9 +81,44 @@ main (int argc, char **argv)
   argv[argc++] = bad_cast ("y");
 }
 
-  while (true)
+  /* Buffer data locally once, rather than having the
+ large overhead of stdio buffering each item.   */
+  for (i = optind; i < argc; i++)
+{
+  size_t len = strlen (argv[i]);
+  if (BUFSIZ < len || BUFSIZ - len <= pbuf

bug#20029: 'yes' surprisingly slow

2015-03-09 Thread Pádraig Brady
On 09/03/15 20:02, Eric Blake wrote:
 On 03/09/2015 01:47 PM, Pádraig Brady wrote:
 

 Note this change also ensures that yes will only write complete lines
 for lines softer than BUFSIZ.
 
 s/softer/smaller/
 

 * src/yes.c (main): Build up a BUFSIZ buffer of lines,
 and output that, rather than having stdio process each item.
 * tests/misc/yes.sh: Add a new test for various buffer sizes.
 * tests/local.mk: Reference the new test.
 Fixes http://bugs.gnu.org/20029
 ---
  src/yes.c | 43 +--
  tests/local.mk|  1 +
  tests/misc/yes.sh | 28 
  3 files changed, 70 insertions(+), 2 deletions(-)
  create mode 100755 tests/misc/yes.sh

 diff --git a/src/yes.c b/src/yes.c
 index b35b13f..91dea11 100644
 --- a/src/yes.c
 +++ b/src/yes.c
 @@ -58,6 +58,10 @@ Repeatedly output a line with all specified STRING(s), or 
 'y'.\n\
  int
  main (int argc, char **argv)
  {
 +  char buf[BUFSIZ];
 
 Do you really want this stack-allocated?  BUFSIZ can be larger than a
 page, which can then interfere with stack overflow detection.

Well we do such stack buffers elsewhere:

$ git grep char.*\\[BUFSIZ\\]
src/head.c:  char buf[BUFSIZ];
src/head.c:char buffer[BUFSIZ];
src/head.c:  char buffer[BUFSIZ];
src/head.c:  char buffer[BUFSIZ];
src/head.c:  char buffer[BUFSIZ];
src/ls.c:  char smallbuf[BUFSIZ];
src/od.c:  char buf[BUFSIZ];
src/tail.c:  char buffer[BUFSIZ];
src/tail.c:  char buffer[BUFSIZ];
src/tail.c:char buffer[BUFSIZ];
src/tail.c:char buffer[BUFSIZ];
src/tail.c:  char buffer[BUFSIZ];
src/tail.c:  char buffer[BUFSIZ];
src/tee.c:  char buffer[BUFSIZ];
src/tr.c:static char io_buf[BUFSIZ];
src/uptime.c:  char buf[BUFSIZ];
src/yes.c:  char buf[BUFSIZ];

We would probably change them all if this was thought to be a problem.

 +. ${srcdir=.}/tests/init.sh; path_prepend_ ./src
 +print_ver_ yes
 +
 +for size in 1 4095 4096 8191 8192 16383 16384; do
 
 Should you also test 4097 8193 and 16385 (that is, a likely
 1-more-than-BUFSIZ in the mix)?

The 1 more is implicit with the \n added by yes(1).

thanks for the review!
Pádraig






bug#20029: 'yes' surprisingly slow

2015-03-09 Thread Pádraig Brady
On 10/03/15 00:31, Giuseppe Scrivano wrote:
 Pádraig Brady p...@draigbrady.com writes:
 
 The attached should make things more efficient here.

 thanks,
 Pádraig.


 From 7959bbf19307705e98f08cfa32a9dcf67672590c Mon Sep 17 00:00:00 2001
 From: =?UTF-8?q?P=C3=A1draig=20Brady?= p...@draigbrady.com
 Date: Mon, 9 Mar 2015 19:27:32 +
 Subject: [PATCH] yes: output data more efficiently

 yes(1) may be used to generate repeating patterns of text
 for test inputs etc., so adjust to be more efficient.

 Profiling the case where yes(1) is outputting small items
 through stdio (which was the default case), shows the overhead
 of continuously processing small items in main() and in stdio:

 $ yes >/dev/null & perf top -p $!
 31.02%  yes   [.] main
 27.36%  libc-2.20.so  [.] _IO_file_xsputn@@GLIBC_2.2.5
 14.51%  libc-2.20.so  [.] fputs_unlocked
 13.50%  libc-2.20.so  [.] strlen
 10.66%  libc-2.20.so  [.] __GI___mempcpy
  1.98%  yes   [.] fputs_unlocked@plta

 Sending more data per stdio call improves the situation,
 but still, there is significant stdio overhead due to memory copies,
 and the repeated string length checking:

 $ yes "`echo {1..1000}`" >/dev/null & perf top -p $!
 42.26%  libc-2.20.so  [.] __GI___mempcpy
 17.38%  libc-2.20.so  [.] strlen
  5.21%  [kernel]  [k] __srcu_read_lock
  4.58%  [kernel]  [k] __srcu_read_unlock
  4.27%  libc-2.20.so  [.] _IO_file_xsputn@@GLIBC_2.2.5
  2.50%  libc-2.20.so  [.] __GI___libc_write
  2.45%  [kernel]  [k] system_call
  2.40%  [kernel]  [k] system_call_after_swapgs
  2.27%  [kernel]  [k] vfs_write
  2.09%  libc-2.20.so  [.] _IO_do_write@@GLIBC_2.2.5
  2.01%  [kernel]  [k] fsnotify
  1.95%  libc-2.20.so  [.] _IO_file_write@@GLIBC_2.2.5
  1.44%  yes   [.] main

 We can avoid all stdio overhead by building up the buffer
 _once_ and outputting that, and the profile below shows
 the bottleneck moved to the kernel:

 $ src/yes >/dev/null & perf top -p $!
 15.42%  [kernel]  [k] __srcu_read_lock
 12.98%  [kernel]  [k] __srcu_read_unlock
  9.41%  libc-2.20.so  [.] __GI___libc_write
  9.11%  [kernel]  [k] vfs_write
  8.35%  [kernel]  [k] fsnotify
  8.02%  [kernel]  [k] system_call
  5.84%  [kernel]  [k] system_call_after_swapgs
  4.54%  [kernel]  [k] __fget_light
  3.98%  [kernel]  [k] sys_write
  3.65%  [kernel]  [k] selinux_file_permission
  3.44%  [kernel]  [k] rw_verify_area
  2.94%  [kernel]  [k] __fsnotify_parent
  2.76%  [kernel]  [k] security_file_permission
  2.39%  yes   [.] main
  2.17%  [kernel]  [k] __fdget_pos
  2.13%  [kernel]  [k] sysret_check
  0.81%  [kernel]  [k] write_null
  0.36%  yes   [.] write@plt

 Note this change also ensures that yes will only write complete lines
 for lines softer than BUFSIZ.

 * src/yes.c (main): Build up a BUFSIZ buffer of lines,
 and output that, rather than having stdio process each item.
 * tests/misc/yes.sh: Add a new test for various buffer sizes.
 * tests/local.mk: Reference the new test.
 Fixes http://bugs.gnu.org/20029
 ---
  src/yes.c | 43 +--
  tests/local.mk|  1 +
  tests/misc/yes.sh | 28 
  3 files changed, 70 insertions(+), 2 deletions(-)
  create mode 100755 tests/misc/yes.sh

 diff --git a/src/yes.c b/src/yes.c
 index b35b13f..91dea11 100644
 --- a/src/yes.c
 +++ b/src/yes.c
 @@ -58,6 +58,10 @@ Repeatedly output a line with all specified STRING(s), or 
 'y'.\n\
  int
  main (int argc, char **argv)
  {
 +  char buf[BUFSIZ];
 +  char *pbuf = buf;
 +  int i;
 +
initialize_main (argc, argv);
set_program_name (argv[0]);
setlocale (LC_ALL, "");
 @@ -77,9 +81,44 @@ main (int argc, char **argv)
argv[argc++] = bad_cast ("y");
  }
  
 -  while (true)
 +  /* Buffer data locally once, rather than having the
 + large overhead of stdio buffering each item.   */
 +  for (i = optind; i < argc; i++)
 +{
 +  size_t len = strlen (argv[i]);
 +  if (BUFSIZ < len || BUFSIZ - len <= pbuf - buf)
 +break;
 +  memcpy (pbuf, argv[i], len);
 +  pbuf += len;
 +  *pbuf++ = i == argc - 1 ? '\n' : ' ';
 +}
 +  if (i < argc)
 +pbuf = NULL;
 
 since the buffer is partly filled, wouldn't be better to reuse it?
 
 Something like this (barely tested):
 
 diff --git a/src/yes.c b/src/yes.c
 index 91dea11..ac690ce 100644
 --- a/src/yes.c
 +++ b/src/yes.c
 @@ -92,9 +92,7 @@ main (int argc, char **argv)
pbuf += len;
*pbuf++ = i == argc - 1 ? '\n' : ' ';
  }
 -  if (i < argc)
 -pbuf = NULL;
 -  else
 +  if (i == argc)
  {
size_t line_len = pbuf - buf;
size_t lines = BUFSIZ / line_len;
 @@ -106,7 +104,7 @@ main (int argc, char **argv)
  }
  
/* The normal case is to continuously output the local

bug#20060: dummy-man broken

2015-03-08 Thread Pádraig Brady
On 08/03/15 10:38, Marcus Brinkmann wrote:
 Hi,
 
 just a short notice that dummy-man does not work anymore (found this
 while trying to bootstrap a new architecture):
 
 ./man/dummy-man: too many non-option arguments

Was this with v8.23 or with git?
There is this related recent fix already in git:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=v8.23-37-gd2bcb04






bug#20035: GNU coreutils 8.4 October 2014 TR(1)

2015-03-08 Thread Pádraig Brady
On 08/03/15 03:11, James Geller wrote:
 Hi,
 
 This is not a bug report.
 
 I just wanted to say, it would be SO nice if you could include 3 or 45 
 examples in the man pages.
 
 It would have made my life easier.
 It took me a while and a Google search to get it to work.
 
 Please write documentation for people who need documentation.
 Don't write documentation for people who don't need documentation.

The man pages are a trade-off between brevity and completeness.
The full docs have examples: http://www.gnu.org/s/coreutils/tr
The next version of the man pages will link directly to the above.
Hence I'm closing this.

thanks,
Pádraig.






bug#20029: 'yes' surprisingly slow

2015-03-07 Thread Pádraig Brady
On 07/03/15 11:49, Ole Tange wrote:
 These two commands give the same output:
 
 $ yes "`echo {1..1000}`" | head -c 2300M | md5sum
 a0241f2247e9a37db60e7def3e4f7038  -
 
 $ yes `echo {1..1000}` | head -c 2300M | md5sum
 a0241f2247e9a37db60e7def3e4f7038  -
 
 But the time to run is quite different:
 
 $ time yes `echo {1..1000}` | head -c 2300M /dev/null
 
 real0m0.897s
 user0m0.384s
 sys 0m1.343s
 
 $ time yes `echo {1..1000}` | head -c 2300M /dev/null
 
 real0m11.352s
 user0m10.571s
 sys 0m2.590s
 
 WTF?!
 
 I imagine 'yes' spends a lot of time collecting the 1000 args. But why
 does it do that more than once?

The stdio interactions dominate here.
The slow case has 1000 times more fputs_unlocked() calls.
Yes we could build the line up once and output that.
If doing that we could also build up a BUFSIZ of complete lines
to output at a time, in which case you'd probably avoid stdio altogether.
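
The buffering idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name fill_with_lines is hypothetical and this is not the code in src/yes.c. It builds nothing itself; it takes an already assembled line and replicates it so that a buffer holds only complete lines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the code in src/yes.c.
   Copy LINE (LINE_LEN bytes, including its trailing newline) into BUF as
   many times as fit completely within BUF_SIZE bytes.  Return the number
   of bytes used, which is 0 if not even one whole copy fits.  */
size_t
fill_with_lines (char *buf, size_t buf_size, const char *line, size_t line_len)
{
  assert (buf != NULL && line != NULL);

  if (line_len == 0 || line_len > buf_size)
    return 0;

  size_t copies = buf_size / line_len;
  for (size_t i = 0; i < copies; i++)
    memcpy (buf + i * line_len, line, line_len);

  return copies * line_len;
}
```

A caller would presumably fill a BUFSIZ-sized buffer once and then write the returned number of bytes in a loop, falling back to per-line output when the function returns 0.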

BTW I noticed tee uses stdio calls, which is currently redundant overhead.
It wouldn't be if we added a --buffered option to tee so that it might
honor stdbuf(1), though I'm not sure it's worth that flexibility in tee.

I'll look at improving these.

thanks,
Pádraig.





bug#19992: Small mistake in comment in source code of ls

2015-03-04 Thread Pádraig Brady
On 03/03/15 21:40, Jarosław Gruca wrote:
 In the source code of ls (file src/ls.c):
 
 { 0, NULL },   /* ec: End color (replaces lc+no+rc) */
 
 there is a small mistake in the comment.
 
 In the place of 'lc+no+rc' should be 'lc+rs+rc' ('rs' instead of 'no').
 Each file is written as 'lc+colorcode+rc+filename+ec', but if
 the 'ec' code is undefined, the sequence 'lc+rs+rc' (with 'rs'),
 and not 'lc+no+rc' (with 'no'), is used:
 
 static void
 prep_non_filename_text (void)
 {
   if (color_indicator[C_END].string != NULL)
     put_indicator (&color_indicator[C_END]);
   else
     {
       put_indicator (&color_indicator[C_LEFT]);
       put_indicator (&color_indicator[C_RESET]);   <--- here
       put_indicator (&color_indicator[C_RIGHT]);
     }
 }
 
 To make sure, I ran several tests, setting 'no' and 'rs'
 to different values and observing the printed escape sequences:
 
 LS_COLORS='no=x:rs=y:...'   # x,y = different SGR codes
 ls -l --color=always  foo

Pushed at http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=69410690

thanks!
Pádraig.





bug#19969: problem: wc -c doesn't read actual # of bytes in file

2015-03-02 Thread Pádraig Brady
On 02/03/15 21:29, Linda Walsh wrote:
 
 
 Jim Meyering wrote:
 As root:
 # cd /proc
 # find -H [^0-9]* -name self -prune -o -name thread-self -prune -o -type f !
 -name kmsg ! -name kcore ! -name kpagecount ! -name kpageflags -print0|wc -c
 --files0-from=- |sort -n

 Thanks for the report.
 However, with wc from coreutils-8.23 and a 3.10 kernel, this is no
 longer an issue.
 ---
 
 with coreutils 8.23 from suse 13.2 and uname:
 
 Linux Ishtar 3.18.5-Isht-Van #1 SMP PREEMPT Wed Feb 4 14:50:44 PST 2015 
 x86_64 x86_64 x86_64 GNU/Linux
 
 it is an issue.  All the /proc/sys entries are still 0.

The issue will be addressed in 8.24:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=v8.23-47-g2662702

You can build this locally and test just with the src/wc binary
without worrying about overwriting any packaged binaries.

Pádraig.






bug#19951: tail: unrecognized file system type 0x013111a8 for `xxx.yyy'. Reverting to polling.

2015-03-01 Thread Pádraig Brady
On 27/02/15 18:02, Bernhard Voelker wrote:
 On 02/26/2015 07:33 AM, Seymour, Shane M wrote:
  case S_MAGIC_IBRIX: /* 0x013111A8 remote */
  return ibrix;
 
 Thanks for the report.
 As you already found the proper place to insert the code, and as
 it's a trivial change not requiring the copyright assignment
 procedure, I'll push the attached in your name soon.
 
 For others: Shane provided the additional information I put into
 the commit message in an off-list conversation.
 I also added a proper NEWS entry.

+1

thanks to both of you.
Pádraig.






bug#19956: Solved (Re: df fails to show all mounted file systems)

2015-03-01 Thread Pádraig Brady
On 26/02/15 15:41, Vesa-Matti J Kari wrote:
 
 Hello,
 
 On Thu, 26 Feb 2015, Vesa-Matti J Kari wrote:
 
 [...] starting from coreutils-8.21 (i.e. the bug exists in 8.22 and 8.23
 too), lots of NFS mounts are missing:

 vmkari@cedi:/var/tmp/vmk$ coreutils-8.21/src/df
 Filesystem1K-blocks   Used Available Use% Mounted on
 /dev/mapper/rvg-root51343364334584799752  85% /
 devtmpfs811  0   811   0% /dev
 tmpfs   8128372  0   8128372   0% /dev/shm
 tmpfs   8128372  18860   8109512   1% /run
 tmpfs   8128372  0   8128372   0% /sys/fs/cgroup
 /dev/sda1520876 190144330732  37% /boot
 machine-1:/ua/group  1459291104 1113208544 271954816  81% /h/group
 machine-1:/b/scratch  101721248   75860064  20618816  79% /b/scratch
 machine-1:/h/a   1001028384  746104672 204068672  79% /h/a
 machine-1:/usr/local/yht  113402880   78201856  29433856  73% /usr/local/yht
 machine-1:/q/q113402528   78201952  29433536  73% /q/q
 machine-2:/var/spool/mail230608384   83144160 135929888  38% 
 /var/spool/mail

 The contents /etc/mtab are correct and mount command also works
 correctly. Could you look into this issue please?
 
 Wow. I just spent about three hours investigating and came up with this
 little patch for coreutils-8.20:
 
 --- sample starts ---
 
 --- coreutils-8.21/src/df.c   2013-02-05 01:40:31.0 +0200
 +++ coreutils-8.21-patched/src/df.c   2015-02-26 17:02:41.849872767 +0200
 @@ -624,8 +624,8 @@ filter_mount_list (void)
  }
else
  {
 -  /* If the device name is a real path name ...  */
 -  if (strchr (me->me_devname, '/'))
 +  /* If the device name is a real path name and not an NFS-mount ...  */
 +  if (strchr (me->me_devname, '/') && !strchr (me->me_devname, ':'))
  {
    /* ... try to find its device number in the devlist.  */
    for (devlist = devlist_head; devlist; devlist = devlist->next)
 
 --- sample ends --
 
 And all in vain because the 'df' utility in the latest coreutils git-repo
 already works correctly. :-)
 
 So sorry for the noise.
 
 Do you have plans to release coreutils-8.24 soon? The
 coreutils-8.22.11.el7 on Red Hat Enterprise Linux 7 / CentOS 7 is
 currently broken.

EL7 would get that backport independently of a new release.
That's now tracked at https://bugzilla.redhat.com/1197463

thanks,
Pádraig.






bug#13183: tail -f ignores SIGPIPE

2015-02-15 Thread Pádraig Brady
On 14/12/12 14:33, Pádraig Brady wrote:
 tag 13183 + notabug
 close 13183
 stop
 
 On 12/14/2012 02:04 PM, Ruediger Meier wrote:
 Hi,

 I want to use tail and grep to follow a file until a particular pattern
 appears. But tail does not exit when grep is finished.

 $ echo xxx > /tmp/blabla
 $ tail -f /tmp/blabla |grep -m1 --line-buffered xxx
 xxx

 Now tail still tries to read and exits only if I write again
 into /tmp/blabla.

 Is this how it's supposed to be?
 
 tail does exit on SIGPIPE, however it will
 only get the signal on write(), and so you
 need to get more data in the file before tail will exit.

It's a fair point though that tail, since it
can hang around forever, should take special
steps to be responsive to the other end of the pipe going away.
I.e. it might use select() or poll(POLLHUP) to detect
immediately/periodically the other end of the pipe going away.
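
A rough sketch of such a check, assuming a hypothetical helper named pipe_reader_gone (this is not tail's actual code). It relies on poll() reporting POLLERR or POLLHUP even when no events are requested; which of the two is set for a write end whose reader has gone varies between systems, so both are tested.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper, not tail's actual code.
   Return true if the reader of the pipe open for writing on FD has gone
   away.  No events are requested, but poll() always reports POLLERR and
   POLLHUP; a timeout of 0 makes this a non-blocking check.  */
bool
pipe_reader_gone (int fd)
{
  assert (fd >= 0);

  struct pollfd pfd = { .fd = fd, .events = 0, .revents = 0 };

  if (poll (&pfd, 1, 0) < 0)
    return false;               /* Treat a poll failure as "unknown".  */

  return (pfd.revents & (POLLERR | POLLHUP)) != 0;
}
```

A long-running follower could call this on STDOUT_FILENO between waits for new data and exit once it returns true, rather than waiting for the next write() to raise SIGPIPE.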

thanks,
Pádraig.

p.s. bug marked as wishlist





bug#19857: BUG with head -n-0

2015-02-13 Thread Pádraig Brady
unarchive 16329
forcemerge 19857 16329
stop

On 13/02/15 16:21, matshyeq wrote:
 Hi All,
 
 I think I've found an issue when head is called with the -n-0 parameter.
 It should return the whole file, but it only seems to work for files with a
 newline character at the end.
 For those that don't have one, it doesn't return anything.
 
 BTW, I know -n-0 might seem nonsensical, but in my case I simply pass the
 number of rows to skip in a variable.

Fixed in version 8.23
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=476ce370

thanks,
Pádraig.





bug#19765: tail -F stops watching when read permissions are removed

2015-02-05 Thread Pádraig Brady
On 04/02/15 10:22, Stephane Chazelas wrote:
 When watching a file by name with tail -F, if read permissions
 are removed, tail stops watching even though it has a file
 descriptor open on the file.
 
 With inotify:
 
 $ : > file
 $ tail -F file &
 [1] 20796
 $ exec 3>> file
 $ echo 1 >&3
 1
 $ chmod 0 file
 tail: cannot watch ‘file’: Permission denied
 tail: ‘file’ has become inaccessible: Permission denied
 $ echo 2 >&3  ## not detected at this point
 $ chmod +r file
 tail: ‘file’ has become accessible
 # new content not displayed yet.
 $ echo 3 >&3
 1   ## all lines displayed
 2
 3
 
 Without inotify:
 
 $ : > file
 $ tail -F ---disable-inotify file &
 [1] 20903
 $ exec 3>> file
 $ echo 1 >&3
 1
 $ chmod 0 file
 $ echo 2 >&3
 2  # not detected yet
 $ tail: ‘file’ has become inaccessible: Permission denied
 
 $ echo 3 >&3
 $ chmod +r file
 $ tail: ‘file’ has become accessible
 1
 2
 3
 
 (same except there's a delay before tail detects the file is no longer
 readable).
 
 Note that the file in that case is still accessible, one can
 still do a stat() on it to check that the file is still the same
 one. That's different from when one of the directory components
 becomes unreadable/unsearchable, in which case tail can't tell
 if it's still reading the right file as in:
 
    tail -F foo/bar &
chmod 0 foo
 
 There, tail still has an open file descriptor to foo/bar, but
 can't tell if it still points to the foo/bar file, so it's
 acceptable for it to stop watching in that case.
 
 With inotify though, it doesn't unless the file attributes are
 changed (chmod...) or the file is renamed. I think I'll raise a
 separate bug report for that and directory components being
 renamed.
 
 [tested with git head]
 

Handling of files that only change perms is awkward.
In the inotify case we don't close the associated watch descriptor
so continue to process events, though ignore them as we've closed the file.

Another problem in this situation with inotify is that write events between
the chmod a-r and a+r are lost, thus not outputting new data until
the next write event.

Another larger problem in this situation with and without inotify
is that the whole file is output, when tail outputs next.
That's documented as a FIXME-maybe in the code.

thanks,
Pádraig.





bug#19784: build fails on make-prime-list when asan is enabled

2015-02-05 Thread Pádraig Brady
On 05/02/15 15:21, Yury Usishchev wrote:
 Hello!
 
 We tried to build coreutils with address sanitizer enabled and 
 encountered an error:
 
GEN  src/primes.h
 ==12657== ERROR: AddressSanitizer: heap-buffer-overflow
 
 This can be reproduced on git master using gcc-4.8 or gcc-4.9 by
 git clone
 export CFLAGS=-fsanitize=address
 ./bootstrap
 ./configure
 make
 
 and is caused by line
 src/make-prime-list.c:214:  while (i < size && sieve[++i] == 0)
 
 When 'i' reaches 'size-1' it gets incremented and then 
 (unallocated) memory is accessed.
 
 I attached patch that can fix this issue.

Oh nice one. That was not rerun when I ran my checks.
The released tools (still) pass with -fsanitize=address.

How about this fix instead?  I'll push in your name if
you're ok with it.

diff --git a/src/make-prime-list.c b/src/make-prime-list.c
index 68c972a..69b91e8 100644
--- a/src/make-prime-list.c
+++ b/src/make-prime-list.c
@@ -211,7 +211,7 @@ main (int argc, char **argv)
   for (j = (p*p - 3)/2; j < size; j+= p)
 sieve[j] = 0;

-  while (i < size && sieve[++i] == 0)
+  while (++i < size && sieve[i] == 0)
 ;
 }
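
To make the effect of the reordering concrete, here is a small standalone sketch of the fixed scan. The function name next_unsieved and the unsigned char element type are assumptions made for illustration; they are not taken from src/make-prime-list.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper modelled on the fixed loop in the patch above.
   Return the index of the next nonzero entry of SIEVE after index I,
   or SIZE if there is none.  I must be less than SIZE.  The bound is
   tested before the element is read, so sieve[size] is never accessed.  */
size_t
next_unsieved (const unsigned char *sieve, size_t size, size_t i)
{
  assert (sieve != NULL && i < size);

  while (++i < size && sieve[i] == 0)
    ;
  return i;
}
```

With the original ordering, a call starting at i == size - 1 would pass the bounds test and only then increment, reading one element past the end of the array.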






bug#19760: [bug] tail -f with inotify fails to follow a file after a rename()

2015-02-03 Thread Pádraig Brady
On 03/02/15 22:04, Stephane Chazelas wrote:
 Hello,
 
 On Linux, when inotify is used,
 
tail -f file
 
 follows a file only until it's renamed. After it is renamed, the
 inotify watch is removed, which means tail sits there doing
 nothing and any further modifications to the file are ignored.
 
 To reproduce:
 
 echo 1 > file
 tail -f file &
 exec 3>> file
 echo 2 >&3
 sleep 1
 mv file file2
 sleep 1
 echo 3 >&3
 sleep 1
 : > file2
 
 3 is not displayed. No message about the file being truncated
 either.
 
 Workarounds:
 
    tail ---disable-inotify -f file
    tail -f < file # effectively disables inotify
 
 or rename the file with a link() followed by an unlink()
    ln file newfile && rm -f file
 
 Note that the IN_DELETED_SELF event is not reached in
 follow-descriptor mode because tail has the file open preventing
 it from being deleted even after it's unlinked from the last
 directory.
 
 Patch attached (on the current git head).

Ouch. The patch makes sense on first glance,
and all existing tests pass with it.
I'll check some more and add a test.

thanks!
Pádraig.





bug#19747: Testsuite summary for GNU coreutils 8.23 - FAIL: 1

2015-02-02 Thread Pádraig Brady
On 02/02/15 20:53, Miras wrote:
 Hi,
 When testing the package coreutils-8.23.tar.xz, one error appeared. The 
 package itself built correctly. I was building a Linux system using LFS 
 (Linux From Scratch).

 FAIL: tests/df/skip-duplicates
 ==

 + df --local
 Filesystem 1K-blocksUsed Available Use% Mounted on
 /dev/sdc6  130216524 2060392 121518412   2% /home
 udev 4028936   4   4028932   1% /dev
 tmpfs4053408   8   4053400   1% /run

 + LD_PRELOAD=./k.so
 + df -T
 ++ wc -l
 ++ expr 1 + 2
 + test 2 -eq 3
 + fail=1
 -- 
 *Mirosław Walczak
 Linux Ubuntu 14.04*

I think this should already be addressed with:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=ed1a495b

thanks,
Pádraig.


bug#19725: Correction for md5sum manual page

2015-01-29 Thread Pádraig Brady
On 29/01/15 16:11, Terry Hoye wrote:
 
 Regarding GNU coreutils 8.12.197-032bb September 2011,
 man md5sum states in part:
 [quote]
 The sums are computed as described in RFC  1321.   When  checking,  the
 input  should  be a former output of this program.  The default mode is
 to print a line with checksum, a character  indicating  type  (`*'  for
 binary, ` ' for text), and name for each FILE.[/quote]
 
 I think I am correct in the following observation:
 Taken literally, the second sentence is incorrect. The default mode has
 two characters, often both spaces, between the two stringsets.

True.
The attached should make this more accurate.

 I don't think the algorithm works correctly with only one space between the 
 stringsets.

Well --check still works in this case as it's switching
to bsd reversed mode. I.E. supporting the output from `md5 -r` etc.
I've made this clearer also, in the info docs.
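
As an illustration of the formats discussed here, the following simplified sketch classifies a checksum line by looking at the character after the separator. The names (classify_md5_line and the enum values) are hypothetical and this is not the parser in src/md5sum.c; the --tag format and backslash-escaped file names are deliberately left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { MD5_HEX_LEN = 32 };

enum line_format
{
  FORMAT_INVALID,
  FORMAT_DEFAULT_TEXT,      /* "<digest>  <name>": space, then ' ' flag */
  FORMAT_DEFAULT_BINARY,    /* "<digest> *<name>": space, then '*' flag */
  FORMAT_BSD_REVERSED       /* "<digest> <name>":  space, no flag */
};

/* Hypothetical, simplified classifier.  On success store a pointer to
   the start of the file name within LINE in *NAME.  */
enum line_format
classify_md5_line (const char *line, const char **name)
{
  assert (line != NULL && name != NULL);

  /* Need the digest, the separator and at least one name character.  */
  size_t len = strlen (line);
  if (len < MD5_HEX_LEN + 2)
    return FORMAT_INVALID;
  for (size_t i = 0; i < MD5_HEX_LEN; i++)
    if (!isxdigit ((unsigned char) line[i]))
      return FORMAT_INVALID;
  if (line[MD5_HEX_LEN] != ' ')
    return FORMAT_INVALID;

  const char *rest = line + MD5_HEX_LEN + 1;
  if ((*rest == ' ' || *rest == '*') && rest[1] != '\0')
    {
      *name = rest + 1;
      return *rest == '*' ? FORMAT_DEFAULT_BINARY : FORMAT_DEFAULT_TEXT;
    }

  /* Only one character between digest and name: BSD reversed mode.  */
  *name = rest;
  return FORMAT_BSD_REVERSED;
}
```

Note the inherent ambiguity this sketch shares with any such heuristic: a single-space line whose file name begins with '*' or a space is read as the default format.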

thanks,
Pádraig.

From ecc8f8104893f1eaae8a8e7e3e55c8c42f4c7206 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= p...@draigbrady.com
Date: Thu, 29 Jan 2015 18:44:41 +
Subject: [PATCH] doc: clarify the output format for the *sum utilities

* src/md5sum.c (usage): Detail the reasons for the default
double space between checksum and file name.
* doc/coreutils.texi (md5sum invocation): Likewise.
Explicitly mention the 3 formats that --check supports.

Fixes http://bugs.gnu.org/19725
---
 doc/coreutils.texi | 15 ++-
 src/md5sum.c   |  6 +++---
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 99c7df3..0a82b65 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -4057,8 +4057,11 @@ consistent.  Synopsis:
 md5sum [@var{option}]@dots{} [@var{file}]@dots{}
 @end example
 
-For each @var{file}, @samp{md5sum} outputs the MD5 checksum, a flag
-indicating binary or text input mode, and the file name.
+For each @var{file}, @samp{md5sum} outputs by default, the MD5 checksum,
+a space, a flag indicating binary or text input mode, and the file name.
+Binary mode is indicated with @samp{*}, text mode with @samp{ } (space).
+Binary mode is the default on systems where it's significant,
+otherwise text mode is the default.
 If @var{file} contains a backslash or newline, the
 line is started with a backslash, and each problematic character in
 the file name is escaped with a backslash, making the output
@@ -4089,9 +4092,11 @@ Read file names and checksum information (not data) from each
 whether the checksums match the contents of the named files.
 The input to this mode of @command{md5sum} is usually the output of
 a prior, checksum-generating run of @samp{md5sum}.
-Each valid line of input consists of an MD5 checksum, a binary/text
-flag, and then a file name.
-Binary mode is indicated with @samp{*}, text with @samp{ } (space).
+Three input formats are supported.  Either the default output
+format described above, the @option{--tag} output format,
+or the BSD reversed mode format which is similar to the default mode,
+but doesn't use a character to distinguish binary and text modes.
+@sp 1
 For each such line, @command{md5sum} reads the named file and computes its
 MD5 checksum.  Then, if the computed message digest does not match the
 one on the line with the file name, the file is noted as having
diff --git a/src/md5sum.c b/src/md5sum.c
index a60e2ff..8c5f876 100644
--- a/src/md5sum.c
+++ b/src/md5sum.c
@@ -206,9 +206,9 @@ The following four options are useful only when verifying checksums:\n\
   printf (_("\
 \n\
 The sums are computed as described in %s.  When checking, the input\n\
-should be a former output of this program.  The default mode is to print\n\
-a line with checksum, a character indicating input mode ('*' for binary,\n\
-space for text), and name for each FILE.\n"),
+should be a former output of this program.  The default mode is to print a\n\
+line with checksum, a space, a character indicating input mode ('*' for binary,\
+\n' ' for text or where binary is insignificant), and name for each FILE.\n"),
   DIGEST_REFERENCE);
   emit_ancillary_info (PROGRAM_NAME);
 }
-- 
2.1.0



bug#19681: [PATCH] sync: use syncfs(2) if any argument is specified

2015-01-28 Thread Pádraig Brady
On 28/01/15 08:17, Bernhard Voelker wrote:
 On 01/27/2015 03:58 PM, Pádraig Brady wrote:
 From 12c6f0fd7f44133a2af8950c69b2bfa46ea5d3a4 Mon Sep 17 00:00:00 2001
 From: Giuseppe Scrivano gscriv...@gnu.org
 Date: Sun, 25 Jan 2015 01:33:45 +0100
 Subject: [PATCH] sync: support syncing specified arguments
 
 --- a/doc/coreutils.texi
 +++ b/doc/coreutils.texi
 @@ -12043,18 +12043,40 @@ with @env{TZ}, libc, The GNU C Library Reference 
 Manual}.
  @command{sync} writes any data buffered in memory out to disk.  This can
  include (but is not limited to) modified superblocks, modified inodes,
  and delayed reads and writes.  This must be implemented by the kernel;
 -The @command{sync} program does nothing but exercise the @code{sync} system
 -call.
 +The @command{sync} program does nothing but exercise the @code{sync},
 +@code{syncfs}, @code{fsync}, and @code{fdatasync} system calls.
 
 I think sync's info page now deserves a Synopsis line ... as the command
 now takes more than just --help/--version.

Done.

 Maybe the first line of 'man sync'
 
   sync - flush file system buffers
 
 and 'info sync'
 
   synchronize memory and disk (in the parent table), and
   sync data on disk with memory (sync invocation)
 
 should be harmonized, too?

Good point. I went with this summary everywhere:

  Synchronize cached writes to persistent storage

 diff --git a/src/sync.c b/src/sync.c
 index e9f4d7e..80d1403 100644
 --- a/src/sync.c
 +++ b/src/sync.c
 
 @@ -37,11 +61,20 @@ usage (int status)
  emit_try_help ();
else
  {
 -  printf (_("Usage: %s [OPTION]\n"), program_name);
 +  printf (_("Usage: %s [OPTION] [FILE]...\n"), program_name);
   fputs (_("\
  Force changed blocks to disk, update the super block.\n\
  \n\
 +If one or more file paths are specified, sync only them,\n\
 +use --data and --file-system to change the default behavior\n\
 +\n\
  "), stdout);
 +
 +  fputs (_("\
 +  --file-system  sync the file systems that contain the files\n\
 +  --data only sync data for files, no unneeded metadata\n\
 +"), stdout);
 +
 
 '--d' should go before '--f'.
 
 And shouldn't we also be more translator-friendly, and split the
 2 options into 2 fputs calls?

Good point. Also the spacing was off.
Also I added short options to align with -f, --file-system in stat(1).
If this ever was to be standardised, or used in other systems
short options would be used, so we might as well give some precedence
here to ease compat.

 The rest of the patch including the test almost LGTM:
 when running against a non-accessible directory, then the correct error
 diagnostic (permission denied) is eclipsed by a non-descriptive
 diagnostic:
 
   $ src/sync --file /tmp /root
   src/sync: error opening ‘/root’: Is a directory
 
 strace output of the above:
 
   open("/root", O_RDONLY|O_NONBLOCK)  = -1 EACCES (Permission denied)
   open("/root", O_WRONLY|O_NONBLOCK)  = -1 EISDIR (Is a directory)

Good catch!
I've gone with always reporting the first errno.

thanks again for the excellent review.

Latest is attached.

Pádraig.
From 2caa949db1fc16d1bdcf547d65001a0e430a2b27 Mon Sep 17 00:00:00 2001
From: Giuseppe Scrivano gscriv...@gnu.org
Date: Sun, 25 Jan 2015 01:33:45 +0100
Subject: [PATCH] sync: support syncing specified arguments

* m4/jm-macros.m4 (coreutils_MACROS): Check for syncfs().
* man/sync.x: Add references to syncfs, fsync and fdatasync.
* doc/coreutils.texi (sync invocation): Document the new feature.
* src/sync.c: Include quote.h.
(AUTHORS): Include myself.
(MODE_FILE, MODE_DATA, MODE_FILE_SYSTEM, MODE_SYNC): New enum values.
(long_options): Define.
(sync_arg): New function.
(usage): Describe that arguments are now accepted.
(main): Add arguments parsing and add support for fsync(2),
fdatasync(2) and syncfs(2).
* tests/misc/sync.sh: New (and only) test for sync.
* tests/local.mk: Reference the new test.
* AUTHORS: Add myself to sync's authors.
* NEWS: Mention the new feature.
---
 AUTHORS|   2 +-
 NEWS   |   3 +
 doc/coreutils.texi |  52 +---
 m4/jm-macros.m4|   1 +
 man/sync.x |   4 +-
 src/sync.c | 177 +
 tests/local.mk |   1 +
 tests/misc/sync.sh |  43 +
 8 files changed, 258 insertions(+), 25 deletions(-)
 create mode 100755 tests/misc/sync.sh

diff --git a/AUTHORS b/AUTHORS
index 0296830..64c11d7 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -83,7 +83,7 @@ stat: Michael Meskes
 stdbuf: Pádraig Brady
 stty: David MacKenzie
 sum: Kayvan Aghaiepour, David MacKenzie
-sync: Jim Meyering
+sync: Jim Meyering, Giuseppe Scrivano
 tac: Jay Lepreau, David MacKenzie
 tail: Paul Rubin, David MacKenzie, Ian Lance Taylor, Jim Meyering
 tee: Mike Parker, Richard M. Stallman, David MacKenzie
diff --git a/NEWS b/NEWS
index 73314d7..b3641ca 100644
--- a/NEWS
+++ b/NEWS
@@ -51,6 +51,9 @@ GNU coreutils NEWS-*- outline -*-
   stty allows setting

bug#19681: [PATCH] sync: use syncfs(2) if any argument is specified

2015-01-27 Thread Pádraig Brady
On 27/01/15 10:48, Giuseppe Scrivano wrote:
 Pádraig Brady p...@draigbrady.com writes:
 
 thanks!
 Pádraig.
 
 Thanks for the review, I've amended the changes you suggested:

There were a few problems:

 - Compile failure without HAVE_SYNCFS (defined)
 - Various error() calls used errno where it was undefined
 - A file that failed to sync was not identified
 - File descriptors were leaked on failure to sync
 - The command would block if presented with a fifo
 - Write only files were not supported
 - Various references to paths rather than files
 - The info docs for --file-system were a bit terse
 - No tests

I've fixed up those issues in the attached,
which I'll push soon.

thanks,
Pádraig.

From 12c6f0fd7f44133a2af8950c69b2bfa46ea5d3a4 Mon Sep 17 00:00:00 2001
From: Giuseppe Scrivano gscriv...@gnu.org
Date: Sun, 25 Jan 2015 01:33:45 +0100
Subject: [PATCH] sync: support syncing specified arguments

* AUTHORS: Add myself to sync's authors.
* NEWS: Mention the new feature.
* m4/jm-macros.m4 (coreutils_MACROS): Check for syncfs().
* man/sync.x: Add references to syncfs, fsync and fdatasync.
* doc/coreutils.texi (sync invocation): Document the new feature.
* src/sync.c: Include quote.h.
(AUTHORS): Include myself.
(MODE_FILE, MODE_DATA, MODE_FILE_SYSTEM, MODE_SYNC): New enum values.
(long_options): Define.
(sync_arg): New function.
(usage): Describe that arguments are now accepted.
(main): Add arguments parsing and add support for fsync(2),
fdatasync(2) and syncfs(2).
---
 AUTHORS|   2 +-
 NEWS   |   3 +
 doc/coreutils.texi |  34 +--
 m4/jm-macros.m4|   1 +
 man/sync.x |   2 +-
 src/sync.c | 169 +
 tests/local.mk |   1 +
 tests/misc/sync.sh |  43 ++
 8 files changed, 236 insertions(+), 19 deletions(-)
 create mode 100755 tests/misc/sync.sh

diff --git a/AUTHORS b/AUTHORS
index 0296830..64c11d7 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -83,7 +83,7 @@ stat: Michael Meskes
 stdbuf: Pádraig Brady
 stty: David MacKenzie
 sum: Kayvan Aghaiepour, David MacKenzie
-sync: Jim Meyering
+sync: Jim Meyering, Giuseppe Scrivano
 tac: Jay Lepreau, David MacKenzie
 tail: Paul Rubin, David MacKenzie, Ian Lance Taylor, Jim Meyering
 tee: Mike Parker, Richard M. Stallman, David MacKenzie
diff --git a/NEWS b/NEWS
index 73314d7..b3641ca 100644
--- a/NEWS
+++ b/NEWS
@@ -51,6 +51,9 @@ GNU coreutils NEWS-*- outline -*-
   stty allows setting the extproc option where supported, which is
   a useful setting with high latency links.
 
+  sync no longer ignores arguments, and syncs each specified file, or with the
+  --file-system option, the file systems associated with each specified file.
+
 ** Changes in behavior
 
   df no longer suppresses separate exports of the same remote device, as
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 4a15939..7e62bde 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -12043,18 +12043,40 @@ with @env{TZ}, libc, The GNU C Library Reference Manual}.
 @command{sync} writes any data buffered in memory out to disk.  This can
 include (but is not limited to) modified superblocks, modified inodes,
 and delayed reads and writes.  This must be implemented by the kernel;
-The @command{sync} program does nothing but exercise the @code{sync} system
-call.
+The @command{sync} program does nothing but exercise the @code{sync},
+@code{syncfs}, @code{fsync}, and @code{fdatasync} system calls.
 
 @cindex crashes and corruption
 The kernel keeps data in memory to avoid doing (relatively slow) disk
 reads and writes.  This improves performance, but if the computer
 crashes, data may be lost or the file system corrupted as a
-result.  The @command{sync} command ensures everything in memory
-is written to disk.
+result.  The @command{sync} command instructs the kernel to write
+data in memory to persistent storage.
 
-Any arguments are ignored, except for a lone @option{--help} or
-@option{--version} (@pxref{Common options}).
+If any argument is specified then only those files will be
+synchronized using the fsync(2) syscall by default.
+
+If at least one file is specified, it is possible to change the
+synchronization method with the following options.  Also see
+@ref{Common options}.
+
+@table @samp
+@item --data
+@opindex --data
+Use fdatasync(2) to sync only the data for the file,
+and any metadata required to maintain file system consistency.
+
+@item --file-system
+@opindex --file-system
+Synchronize all the I/O waiting for the file systems that contain the file,
+using the syscall syncfs(2).  Note you would usually @emph{not} specify
+this option if passing a device node like @samp{/dev/sda} for example,
+as that would sync the containing file system rather than the referenced one.
+Note also that depending on the system, passing individual device nodes or files
+may have different sync characteristics than using no arguments.
+I.E. arguments passed to fsync

bug#19705: unrecognized file system type

2015-01-27 Thread Pádraig Brady
On 27/01/15 16:50, Nicolas St-Pierre wrote:
 tail: unrecognized file system type 0xf2f52010 for ‘.log’. please report 
 this to bug-coreutils@gnu.org. reverting to 
 polling

Already fixed with:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=4c49dc82

thanks,
Pádraig.





bug#19681: [PATCH] sync: use syncfs(2) if any argument is specified

2015-01-26 Thread Pádraig Brady
On 26/01/15 08:36, Giuseppe Scrivano wrote:
 Pádraig Brady p...@draigbrady.com writes:
 
 On 25/01/15 18:05, Bernhard Voelker wrote:
 On 01/25/2015 06:41 PM, Pádraig Brady wrote:
 So we have: fdatasync < fsync < syncfs < sync
 referring to: file data, file data + metadata, file system, all file 
 systems

 [...]

 I'd be inclined to go with the _what_ interface above.

 Either way, I think it's important to document sync is falling back
 to the bigger hammer if the smaller failed.
 ... or shouldn't sync do this?

 It should fall back where possible.

 Now there is a difference between the file and file system(s) interfaces
 in that the former can return EIO error for example, while the latter
 are specified to always return success. You wouldn't fall back to
 a syncfs() if an fsync() gave an EIO for example.  Also gnulib
 guarantees that fsync() and fdatasync() are available, so I wouldn't
 fallback from file -> file system interfaces, nor between file interfaces.
 
 one risk here is when multiple arguments are specified and the fsync
 will return EIO more than once, we will fallback to syncfs multiple
 times.  Couldn't in this case a single sync be a better choice?

I was saying we shouldn't fall back from fsync() to syncfs().
Just process each argument. Diagnose any errors and EXIT_FAILURE
if there was any error?
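
The behaviour proposed here (process each argument, diagnose errors, no fallback to a bigger hammer) could be sketched as below. The function name sync_each_file and the message wording are assumptions for illustration; this is not the code that was eventually committed to src/sync.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch, not the code in src/sync.c.
   Sync each of the N_FILES names in FILES with fsync(), or with
   fdatasync() if DATA_ONLY.  Diagnose every failure and carry on with
   the remaining arguments.  Return true only if everything succeeded.  */
bool
sync_each_file (char *const *files, int n_files, bool data_only)
{
  assert (n_files == 0 || files != NULL);

  bool ok = true;
  for (int i = 0; i < n_files; i++)
    {
      int fd = open (files[i], O_RDONLY | O_NONBLOCK);
      if (fd < 0)
        {
          fprintf (stderr, "error opening '%s': %s\n",
                   files[i], strerror (errno));
          ok = false;
          continue;
        }

      /* No fallback to syncfs() or sync(): a failure is only reported.  */
      if ((data_only ? fdatasync (fd) : fsync (fd)) != 0)
        {
          fprintf (stderr, "error syncing '%s': %s\n",
                   files[i], strerror (errno));
          ok = false;
        }

      if (close (fd) != 0)
        {
          fprintf (stderr, "failed to close '%s': %s\n",
                   files[i], strerror (errno));
          ok = false;
        }
    }
  return ok;
}
```

A main() built on this would exit with EXIT_FAILURE whenever the function returns false, so one bad argument does not prevent the others from being synced but is still reflected in the exit status.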

Pádraig.





bug#19681: [PATCH] sync: use syncfs(2) if any argument is specified

2015-01-26 Thread Pádraig Brady
On 26/01/15 21:27, Giuseppe Scrivano wrote:
 Pádraig Brady p...@draigbrady.com writes:
 
 On 26/01/15 08:36, Giuseppe Scrivano wrote:
 Pádraig Brady p...@draigbrady.com writes:

 On 25/01/15 18:05, Bernhard Voelker wrote:
 On 01/25/2015 06:41 PM, Pádraig Brady wrote:
 So we have: fdatasync < fsync < syncfs < sync
 referring to: file data, file data + metadata, file system, all file 
 systems

 [...]

 I'd be inclined to go with the _what_ interface above.

 Either way, I think it's important to document sync is falling back
 to the bigger hammer if the smaller failed.
 ... or shouldn't sync do this?

 It should fall back where possible.

 Now there is a difference between the file and file system(s) interfaces
 in that the former can return EIO error for example, while the latter
 are specified to always return success. You wouldn't fall back to
 a syncfs() if an fsync() gave an EIO for example.  Also gnulib
 guarantees that fsync() and fdatasync() are available, so I wouldn't
 fallback from file -> file system interfaces, nor between file interfaces.

 one risk here is when multiple arguments are specified and the fsync
 will return EIO more than once, we will fallback to syncfs multiple
 times.  Couldn't in this case a single sync be a better choice?

 I was saying we shouldn't fall back from fsync() to syncfs().
 Just process each argument. Diagnose any errors and EXIT_FAILURE
 if there was any error?
 
 sorry for misunderstanding that.
 
 I've worked out a new version that includes these suggestions, also
 since now the user can explicitly ask for the sync mechanism to use, I
 agree with you and we should raise an error if something goes wrong.
 
 Since sync is completely different now, I took the freedom to add myself
 to the AUTHORS, feel free to drop this part if you disagree.
 
 Regards,
 Giuseppe
 
 
 
From 0dbc5ce9c78bc97ec5a678803270767ad9980618 Mon Sep 17 00:00:00 2001
 From: Giuseppe Scrivano gscri...@redhat.com
 Date: Sun, 25 Jan 2015 01:33:45 +0100
 Subject: [PATCH] sync: add support for fsync(2), fdatasync(2) and syncfs(2)
 
 * AUTHORS: Add myself to sync's authors.
 * NEWS: Mention the new feature.
 * m4/jm-macros.m4 (coreutils_MACROS): Check for syncfs.
 * doc/coreutils.texi (sync invocation): Document the new feature.
 * src/sync.c: Include quote.h.
 (AUTHORS): Include myself.
 (MODE_FILE, MODE_FILE_DATA, MODE_FILE_SYSTEM): New enum values.
 (long_options): Define.
 (usage): Describe that arguments are now accepted.
 (main): Add arguments parsing and add support for fsync(2),
 fdatasync(2) and syncfs(2).
 ---
  AUTHORS|   2 +-
  NEWS   |   3 ++
  doc/coreutils.texi |  20 -
  m4/jm-macros.m4|   1 +
  src/sync.c | 116 
 +
  5 files changed, 131 insertions(+), 11 deletions(-)
 
 diff --git a/AUTHORS b/AUTHORS
 index 0296830..64c11d7 100644
 --- a/AUTHORS
 +++ b/AUTHORS
 @@ -83,7 +83,7 @@ stat: Michael Meskes
  stdbuf: Pádraig Brady
  stty: David MacKenzie
  sum: Kayvan Aghaiepour, David MacKenzie
 -sync: Jim Meyering
 +sync: Jim Meyering, Giuseppe Scrivano
  tac: Jay Lepreau, David MacKenzie
  tail: Paul Rubin, David MacKenzie, Ian Lance Taylor, Jim Meyering
  tee: Mike Parker, Richard M. Stallman, David MacKenzie
 diff --git a/NEWS b/NEWS
 index e0a2893..3d4190b 100644
 --- a/NEWS
 +++ b/NEWS
 @@ -48,6 +48,9 @@ GNU coreutils NEWS -*- outline -*-
    split accepts a new --separator option to select a record separator character
    other than the default newline character.
  
 +  sync no longer ignores arguments and it uses fsync(2), fdatasync(2)
 +  and syncfs(2) synchronization in addition to sync(2).

sync no longer ignores arguments, and syncs each specified file, or with the
--file-system option, the file systems associated with each specified file.

  ** Changes in behavior
  
df no longer suppresses separate exports of the same remote device, as
 diff --git a/doc/coreutils.texi b/doc/coreutils.texi
 index 5a3c31a..c99b8ed 100644
 --- a/doc/coreutils.texi
 +++ b/doc/coreutils.texi
 @@ -12053,8 +12053,24 @@ crashes, data may be lost or the file system corrupted as a
  result.  The @command{sync} command ensures everything in memory
  is written to disk.
  
 -Any arguments are ignored, except for a lone @option{--help} or
 -@option{--version} (@pxref{Common options}).
 +If any argument is specified then only the specified paths will be

s/paths/files/.  paths is a bit ambiguous, while files implies dirs too.

 +synchronized.  It uses internally the syscall fsync(2) on each of them.
 +
 +If at least one path is specified, it is possible to change the
 +synchronization policy with the following options.  Also see
 +@ref{Common options}.
 +
 +@table @samp
 +@item --data
 +@opindex --data
 +Do not synchronize the file metadata unless it is required to maintain
 +data integrity.  It uses the syscall fdatasync(2).
 +
 +@item --file-system
 +@opindex --file

bug#7420: [Feature request]: add option to dd to fsync|fdatasync after each block written

2015-01-25 Thread Pádraig Brady
unarchive 7420
tag 7420 notabug
close 7420
stop

On 17/11/10 10:19, Марк Коренберг wrote:
 [Feature request]: add option to dd to fsync|fdatasync after each block 
 written
 
 Suppose I want to show progress with:
 
 pv image.img | dd bs=16M of=/dev/sdc
 
 it will not work, as dd will write to sdc momentarily. dd will hang on
 close(1) waiting for actual write to complete (tested on USB stick
 Linux 2.6.32)
 
 I decide to use  oflag=direct. It help, OK. But: 
 http://kerneltrap.org/node/7563
 
 It will be nice if, dd will be able to fsync/fdatasync after each block.
 
 I think, it is useful for other usages.

This is useful.
However, supporting functionality was added in the next kernel version, 2.6.33.
Or rather, O_SYNC and O_DSYNC were properly distinguished:
http://lwn.net/Articles/350225/
Both options were available through dd long before that.
In summary, dd oflag=dsync should now do exactly as you expect.
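
For illustration, a minimal sketch of that invocation. It writes to a scratch file instead of a device, the file name and block sizes are arbitrary assumptions, and oflag=dsync assumes GNU dd:

```shell
# Scratch output file (hypothetical; the report wrote to /dev/sdc).
out=$(mktemp)

# oflag=dsync opens the output with O_DSYNC, so each 64 KiB write
# returns only after its data has been handed to the storage device.
dd if=/dev/zero of="$out" bs=64k count=4 oflag=dsync 2>/dev/null

# Record the resulting size in bytes, then clean up.
size=$(wc -c < "$out")
echo "$size"
rm -f "$out"
```

In a pipeline like the one in the report, this keeps pv's progress in step with what has actually been written, at the cost of throughput.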

thanks,
Pádraig.





bug#19681: [PATCH] sync: use syncfs(2) if any argument is specified

2015-01-25 Thread Pádraig Brady
On 25/01/15 18:05, Bernhard Voelker wrote:
 On 01/25/2015 06:41 PM, Pádraig Brady wrote:
 So we have: fdatasync < fsync < syncfs < sync
 referring to: file data, file data + metadata, file system, all file systems
 
 [...]
 
 I'd be inclined to go with the _what_ interface above.
 
 Either way, I think it's important to document sync is falling back
 to the bigger hammer if the smaller failed.
 ... or shouldn't sync do this?

It should fall back where possible.

Now there is a difference between the file and file system(s) interfaces
in that the former can return EIO error for example, while the latter
are specified to always return success. You wouldn't fall back to
a syncfs() if an fsync() gave an EIO for example.  Also gnulib
guarantees that fsync() and fdatasync() are available, so I wouldn't
fall back from file -> file system interfaces, nor between file interfaces.
There is an edge case with fsync(fifo) for example where EINVAL is given,
though again since there is no data in the file system associated with that,
I wouldn't bother falling back to a file system interface.

As for documenting the fall back, that's one of the reasons I
suggested keeping the warning if syncfs() is not supported,
which should be enough direct feedback about the extra syncing happening.

cheers,
Pádraig.





bug#19681: [PATCH] sync: use syncfs(2) if any argument is specified

2015-01-25 Thread Pádraig Brady
On 25/01/15 15:38, Paul Eggert wrote:
 If we're adding this sort of option, shouldn't we also give users the ability
 to invoke fsync and fdatasync on a single file, as opposed to syncfs on an
 entire file system?

Yes good point on also integrating per file syncing.
Per file syncing is already supported by dd,
but would be a natural fit to sync(1).

BTW I grepped my internal notes on sync and noticed 
http://lwn.net/Articles/433384/
which details syncfs() and also suggests `sync file...` as an interface.

So we have: fdatasync < fsync < syncfs < sync
referring to: file data, file data + metadata, file system, all file systems

You'd need an option to distinguish between file and file system.
The option could refer to _what_ to sync, or what _method_ to use to sync.

An advantage of specifying the method is they're well
understood and described elsewhere:
  --mode={fdatasync, fsync, syncfs, sync}
The first two would give an error without arguments,
the last would give an error with arguments.

The alternative _what_ interface might look like:
  --what={data, file-system}
or split out to separate options like:
  -d,--data, -f,--file-system
An advantage of specifying what to sync is that
it's a bit more general, avoiding specifying implementation methods.
Also the options would not imply needing or not needing to specify files.

I'd be inclined to go with the _what_ interface above.
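
As a sketch of how the _what_ style reads on the command line, assuming the -d,--data and -f,--file-system options as they later shipped in sync (coreutils 8.24 and newer, as far as I know). The scratch file and the variable names are hypothetical:

```shell
f=$(mktemp)
echo data > "$f"

ok=yes
sync "$f"               || ok=no   # file data + metadata, i.e. fsync()
sync --data "$f"        || ok=no   # file data only, i.e. fdatasync()
sync --file-system "$f" || ok=no   # containing file system, i.e. syncfs()
# A bare "sync" with no arguments still flushes all file systems.

# A file that cannot be opened is diagnosed with a failing exit status
# rather than silently falling back to a bigger sync.
if sync "$f.missing" 2>/dev/null; then
  missing_status=0
else
  missing_status=1
fi
rm -f "$f"
```

Note neither option says anything about implementation methods, which is the generality argued for above.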

thanks,
Pádraig





bug#7421: [Feature request]: add option to dd to issue ioctl(BLKFLSBUF) on output descriptior after each write or at the end

2015-01-25 Thread Pádraig Brady
unarchive 7421
tag 7421 wontfix
close 7421
stop

On 17/11/10 10:41, Марк Коренберг wrote:
 [Feature request]: add option to dd to issue ioctl(BLKFLSBUF) on
 output descriptior after each write or at the end
 
 I have already sent a message about fsync/fdatasync after each write.
 It seems that ioctl(BLKFLSBUF) needs to be implemented with the same semantics.
 
 in oflags and conv
 i.e. ioctl after each write oflags=blkflsbuf
 and ioctl at the end, if specified in conv=blkflsbuf

This is already supported with the `blockdev --flushbufs` command.
Given this is a low level linux specific interface, it's more suited
to the blockdev command than a more general tool like dd.
Note ioctl(BLKFLSBUF) only writes out dirty pages to the block device,
it doesn't guarantee to send a flush request to the device.
Also http://lwn.net/Articles/433384/ mentions that BLKFLSBUF also
invalidates the bdev mapping, which isn't generally desirable,
and doesn't work for non-block file systems.

We're considering adding syncfs() support to the sync command,
which should cater for much of the use case you describe here.
That's discussed at http://bugs.gnu.org/19681

thanks,
Pádraig.





bug#19681: [PATCH] sync: use syncfs(2) if any argument is specified

2015-01-24 Thread Pádraig Brady
thanks I like it!
Tweaks below...

On 25/01/15 00:48, Giuseppe Scrivano wrote:
 * configure.ac: Check if syncfs(2) is available.
 * NEWS: Mention the new feature.
 * doc/coreutils.texi (sync invocation): Document the new feature.
 * src/sync.c (usage): Describe that arguments are now accepted.
 (main): Use syncfs(2) to flush buffers for the file system which
 contain the specified arguments.  Silently fallback to sync(2) on
 errors.
 ---
  NEWS   |  3 +++
  configure.ac   |  2 ++
  doc/coreutils.texi |  7 ++-
  src/sync.c | 31 +--
  4 files changed, 40 insertions(+), 3 deletions(-)
 
 diff --git a/NEWS b/NEWS
 index e0a2893..42bd02f 100644
 --- a/NEWS
 +++ b/NEWS
 @@ -48,6 +48,9 @@ GNU coreutils NEWS -*- outline -*-
    split accepts a new --separator option to select a record separator character
    other than the default newline character.
  
 +  sync accepts arguments, and if any is specified use syncfs(2) to
 +  flush the buffers for the file systems which cointain these paths.

sync no longer ignores arguments, and now uses syncfs(2) to sync
the file systems associated with each specified path.

 +
  ** Changes in behavior
  
df no longer suppresses separate exports of the same remote device, as
 diff --git a/configure.ac b/configure.ac
 index 3918f43..8fcfec9 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -328,6 +328,8 @@ if test $ac_cv_func_syslog = no; then
done
  fi
  
 +AC_CHECK_FUNCS([syncfs])
 +
  AC_CACHE_CHECK([for 3-argument setpriority function],
[utils_cv_func_setpriority],
[AC_LINK_IFELSE(
 diff --git a/doc/coreutils.texi b/doc/coreutils.texi
 index 5a3c31a..6cc7414 100644
 --- a/doc/coreutils.texi
 +++ b/doc/coreutils.texi
 @@ -12053,6 +12053,10 @@ crashes, data may be lost or the file system corrupted as a
  result.  The @command{sync} command ensures everything in memory
  is written to disk.
  
 +If any argument is specified and the system supports the synfcs(2)
 +syscall, then only the file systems containing these paths will be
 +synchronized.  If multiple paths point to the same file system, the
 +syncfs(2) syscall will be invoked for each one of them.
  Any arguments are ignored, except for a lone @option{--help} or
  @option{--version} (@pxref{Common options}).
  
 @@ -12081,7 +12085,8 @@ If a @var{file} is larger than the specified size, the extra data is lost.
  If a @var{file} is shorter, it is extended and the extended part (or hole)
  reads as zero bytes.
  
 -The program accepts the following options.  Also see @ref{Common options}.
 +The only options are a lone @option{--help} or @option{--version}.
 +@xref{Common options}.
  
  @table @samp
  
 diff --git a/src/sync.c b/src/sync.c
 index e9f4d7e..940836e 100644
 --- a/src/sync.c
 +++ b/src/sync.c
 @@ -37,10 +37,13 @@ usage (int status)
  emit_try_help ();
else
  {
 -  printf (_("Usage: %s [OPTION]\n"), program_name);
 +  printf (_("Usage: %s [OPTION] [PATH]...\n"), program_name);
fputs (_("\
 Force changed blocks to disk, update the super block.\n\
 \n\
 +If one or more file paths are specified, update only the\n\
 +file-systems which contain those files.\n\
 +\n\
 "), stdout);
fputs (HELP_OPTION_DESCRIPTION, stdout);
fputs (VERSION_OPTION_DESCRIPTION, stdout);
 @@ -65,9 +68,33 @@ main (int argc, char **argv)
if (getopt_long (argc, argv, "", NULL, NULL) != -1)
  usage (EXIT_FAILURE);
  
 +#if HAVE_SYNCFS
 +  /* If arguments are specified, use syncfs on any of them.
 + On any error, silently fallback to sync.  */
if (optind < argc)
 -error (0, 0, _("ignoring all arguments"));

The warning above should be moved down to the sync: case
rather than removing it.

 +{
 +  while (optind < argc)
 +{
 +  int fd = open (argv[optind], O_RDONLY);
 +  if (fd < 0)
 +goto sync;
 +
 +  if (syncfs (fd) < 0)
 +{
 +  close (fd);
 +  goto sync;
 +}
 +
 +  if (close (fd) < 0)
 +goto sync;
 +
 +  optind++;
 +}
 +  return EXIT_SUCCESS;
 +}
 +#endif
  
 +sync:
sync ();
return EXIT_SUCCESS;
  }
 

thanks!
Pádraig





bug#19654: tr command odd behavior on Linux Platform

2015-01-22 Thread Pádraig Brady
tag 19654 notabug
close 19654
stop

On 22/01/15 09:52, Kousik Mandal wrote:
 Hi Team,
 
 I am observing an unexpected behavior with tr command. In a directory if 
 there exist one file named a and executing following tr command.

This is becoming a FAQ:
http://lists.gnu.org/archive/html/coreutils/2014-12/msg00041.html

thanks,
Pádraig.






bug#19609: coreutils tests on illumos

2015-01-19 Thread Pádraig Brady
tag 19609 notabug
close 19609
stop

On 15/01/15 18:05, Alexander Pyhalov wrote:
 Hello.
 I was looking at coreutils test suite on illumos.
 There are currently about 5 failing tests.
 
 First are 2 df tests. They assume that struct mntent *getmntent (FILE 
 *fp) is used by df. But it's not true on SystemV. Instead, int getmntent 
 (FILE *fp, struct mnttab *mp) is used. Patches to correct the tests on 
 illumos are here:
 https://github.com/OpenIndiana/oi-userland/blob/oi/hipster/components/coreutils/patches/tests_df_no-mtab-status.sh.patch
 https://github.com/OpenIndiana/oi-userland/blob/oi/hipster/components/coreutils/patches/tests_df_skip-duplicates.sh.patch

The above are already addressed in v8.23 with:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=39e2a4cf

 3 remaining failing tests are from gnulib-tests directory.
 
 First one is test-locale.c. It assumes that LC_GLOBAL_LOCALE is a 
 constant. On illumos it's a function, so compilation fails. Fix is here: 
 https://github.com/OpenIndiana/oi-userland/blob/oi/hipster/components/coreutils/patches/gnulib-tests_test-locale.c.patch
 
 Second is test-getlogin. It seems to be a bit strange: it compares $USER 
 and getlogin(). Under su they don't match, so it can occasionally fail.

There have been a few changes to that test in gnulib since
coreutils 8.21 was released, so I'm hopeful that's addressed too.

 Third (test-getlogin) is a bit tricky.

I presume you mean test-localename.c

 It uses gl_locale_name_thread_unsafe, which simply doesn't know about 
 illumos/Solaris and returns NULL, so the test fails. I don't have a fix 
 for this one.

I'll follow up on bug-gnu...@gnu.org

thanks,
Pádraig





bug#19605: cp -v vs LC_ALL vs. quote marks

2015-01-15 Thread Pádraig Brady
On 15/01/15 16:01, 積丹尼 Dan Jacobson wrote:
 I'm saying please don't force me to need LC_ALL=C to make the quotes
 U+0027 APOSTROPHE always.
 
 Long ago there were no quotes.
 
 Then somebody thought quotes looked pretty, so they added U+0027
 APOSTROPHE always.
 
 Then somebody else thought `' looks cooler than '' and made it
 that way.

That _looked_ better on some old fonts/systems.

 Then somebody thought that might make more work when copy and pasting
 when sending that to the shell, and needing to fix it if three clicks
 got the quotes too, so made it back to U+0027 APOSTROPHE. Good.

Ah you mean double clicking to select the word?
Single quotes are generally excluded from that auto selection,
while ‘locale specific’ quotes can be included which _is_ awkward.

Now that's terminal dependent. I notice xterm is more restrictive
in what it auto selects and will exclude the locale quotes (and . too),
while gnome terminal will include the locale quotes.
That's just a bug in gnome terminal though, as it should
add common quoting chars to its delimiter list.

 Except they forget to fix it back for other locales.

As mentioned before, to have it independent of locales we could
use the shell-always quoting style for files.  Note that would have
the small caveat that the quotes would not be included in a double click.

A larger caveat is that 'shell-always' quoting provides
no protection for the terminal from control chars in a file name.
You can test that out by creating variously named files and using:

  ls -1 --quoting='shell-always' --show-control-chars

Hmm, I wonder could we augment the shell quoting to
add the $'\001' and $'\n' escape formats, which would
both provide the protection and be generally cut and pasteable.
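
To make the caveat concrete, here is a small sketch using a scratch directory and a file name containing a newline. It assumes a coreutils new enough to have the shell-escape quoting style (8.25 or later, to my knowledge), which implements roughly the $'...' idea floated above. The directory and variable names are hypothetical:

```shell
dir=$(mktemp -d)
# Create a file whose name is "a", a newline, then "b".
touch "$dir/$(printf 'a\nb')"

# shell-always passes the raw newline through, so one name spans two lines.
always_lines=$(ls -1 --quoting-style=shell-always --show-control-chars "$dir" | wc -l)

# shell-escape renders the newline as $'\n', keeping the name on one line.
escaped=$(ls -1 --quoting-style=shell-escape "$dir")
escape_lines=$(printf '%s\n' "$escaped" | wc -l)

printf 'shell-always: %s lines, shell-escape: %s line: %s\n' \
  "$always_lines" "$escape_lines" "$escaped"
rm -rf "$dir"
```

The escaped form can be pasted back into a shell that understands $'...' quoting, such as bash.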

Pádraig.





bug#19605: cp -v vs LC_ALL vs. quote marks

2015-01-15 Thread Pádraig Brady
On 15/01/15 17:28, 積丹尼 Dan Jacobson wrote:
 All I know is in xterm I click three times and all of '...' including
 the quotes gets copied, which is fine with me. Just keep it all 0x27.

Ah right that's an xterm specific feature. See XTerm*on3Clicks here:
http://lukas.zapletalovi.com/2013/07/hidden-gems-of-xterm.html





bug#19604: echo --help does not work

2015-01-15 Thread Pádraig Brady
tag 19604 notabug
close 19604
stop

On 15/01/15 06:20, prateek goyal wrote:
 Hi,
 
 
 when I try to use --help option with echo command, it does not print help 
 contents, but prints --help.
 
 
 prateek@prateek-pc:~/Documents/awk$ echo --help
 --help

You're actually using your shell's echo there.
To use the coreutils one:

  env echo --help

To invoke help for the shell builtin in bash you use:

  help echo

Note newer versions of bash will support $builtin --help,
though I've not tested the echo case which would
introduce a change in behavior.
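
A quick way to see the difference, as a sketch that assumes the echo found in PATH is the GNU coreutils one (the variable names are only for illustration):

```shell
# The shell builtin treats --help as ordinary text to print.
builtin_out=$(echo --help)

# Going through env bypasses the builtin and runs the echo found in PATH,
# which for GNU coreutils prints usage information.
external_first_line=$(env echo --help | head -n 1)

printf '%s\n' "builtin:  $builtin_out" "external: $external_first_line"
```

Running `type echo` in the shell also reports whether a builtin is shadowing the external command.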

thanks,
Pádraig.






bug#19605: cp -v vs LC_ALL vs. quote marks

2015-01-15 Thread Pádraig Brady
On 15/01/15 11:38, 積丹尼 Dan Jacobson wrote:
 I am glad that these days plain ' is being used instead of goofy `'
 $ LC_ALL=C cp -v /dev/null /tmp/$RANDOM 2>&1
 '/dev/null' -> '/tmp/29920'
 
 That way one can not worry about copy and pasting them with the mouse.
 
 The problem is, if I don't use LC_ALL=C then I get the goofy ones, even
 high bit too. Please just use ASCII ', thanks.
 
 # find /mnt/usb/thumb/backups/ -mtime -2 -type f -exec cp -av {} /jidanni_backups/ \;
 ‘/mnt/usb/thumb/backups/root_bkp2015-01-14-10-11-29.bz2’ -> ‘/jidanni_backups/root_bkp2015-01-14-10-11-29.bz2’
 
 cp (GNU coreutils) 8.23
 P.S., I bet other coreutils programs do this too.

What's the exact problem with copy/paste?
Are you suggesting that all quoted files
should use shell quoting so that they can be directly
copy/pasted back to a shell?  There is some merit in that alright.

Note above you could pass LC_ALL=C with sh -c ...,
or with a separate xargs process, or just directly to find like:
  LC_ALL=C find ... -exec cp ...

Note also that LC_ALL=C isn't ideal for non English users
as you then lose the localized messages, and it isn't enough
to just set LC_CTYPE=C as that just represents the translated
quote in unibyte.

cheers,
Pádraig

p.s. the above command starts a cp process per file.
It would be much more efficient to do:

  find ... -exec cp -av --target=/jidanni_backups/ {} +
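
A small sketch of the difference, using hypothetical scratch directories; echo stands in for cp in the first two commands so the number of invocations can be counted:

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/a" "$src/b" "$src/c"

# With \; the command runs once per file; with + the names are batched
# onto as few command lines as possible.
per_file=$(find "$src" -type f -exec echo {} \; | wc -l)
batched=$(find "$src" -type f -exec echo {} + | wc -l)
printf 'invocations: %s per file, %s batched\n' "$per_file" "$batched"

# {} + must end the command, so the destination is given up front
# with --target-directory (abbreviated to --target above).
find "$src" -type f -exec cp -a --target-directory="$dst" {} +
copied=$(ls "$dst" | wc -l)
rm -rf "$src" "$dst"
```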





bug#19578: Memory leaks in coreutils/lib/locale_charset.c

2015-01-13 Thread Pádraig Brady
tag 19578 notabug
close 19578
stop

On 13/01/15 09:35, Daiki Ueno wrote:
 Zhaopeng Li z...@ustc.edu.cn writes:
 
 At line 534 of coreutils/lib/locale_charset.c, var ‘aliases' points
 to a buffer which is allocated using malloc() .
 
 This buffer is not freed when codeset is still an empty string after
 the loop (Line 534~542).

 So it will be leaked under such situation.
 
 Line 533/* Resolve alias. */
 Line 534   for (aliases = get_charset_aliases ();
 
 I got the same error from clang-analyzer, but I think the leak is
 intentional and harmless.  The return value of get_charset_aliases is
 saved in a global variable charset_aliases and won't be allocated twice.

Thanks Daiki.
Closing for now.






bug#19580: Memory Leak in coreutils/lib/localcharset.c

2015-01-13 Thread Pádraig Brady
forcemerge 19580 19578
stop

On 13/01/15 10:31, Zhaopeng Li wrote:
 At line 221, the assignment (old_res_ptr = res_ptr) will lead to memory leak 
 when iteration of corresponding loop is greater than 3.

Same non issue really.
We don't want to free() here.
I'm not sure how to avoid the warning though?

thanks,
Pádraig.





bug#19570: bug: df and bind mounts

2015-01-12 Thread Pádraig Brady
On 12/01/15 21:27, Bernhard Voelker wrote:
 On 01/12/2015 02:31 AM, Pádraig Brady wrote:
 On 11/01/15 23:36, Vladimir A. Pavlov wrote:
 run /run tmpfs rw,noatime,nodiratime,nodev,noexec,mode=0755,size=1m 0 0
 /run/cgs/httpd /usr/cgs/httpd/run none rw,bind 0 0
 
 Thanks for the analysis and patch,
 Current tests pass at least with it.
 I'll analyse a little more, add tests and probably push.
 
 hmm, tmpfs is problematic anyway, as one can specify anything
 as the dummy backing source device:
 
   $ mount -t tmpfs hello:/world /mnt
   $ mount -t tmpfs something /mnt
   $ mount -t tmpfs / /mnt
   $ findmnt /mnt
   TARGET SOURCE       FSTYPE OPTIONS
   /mnt   hello:/world tmpfs  rw,relatime
   /mnt   something    tmpfs  rw,relatime
   /mnt   /            tmpfs  rw,relatime
 
   $ df -a --out=source,target | grep /mnt
   hello:/world   /mnt
   something      /mnt
   /              /mnt
 
 I'd almost tend to recommend to classify tmpfs as dummy
 file system like procfs etc.

I see what you mean.
However we take dummy to mean,
no associated storage in the memory hierarchy,
which tmpfs clearly has.

Pádraig






bug#19570: bug: df and bind mounts

2015-01-11 Thread Pádraig Brady
On 11/01/15 23:36, Vladimir A. Pavlov wrote:
 Hello,
 
 I have an issue with df (both in version 8.23 and in master branch).
 
 I have tmpfs mounted as /run . There is /run/cgs/httpd subdirectory in
 /run (just a subdirectory, not a tmpfs or another mount). This
 /run/cgs/httpd is bind-mounted to /usr/cgs/httpd/run.
 
 The current algorithm in df.c:filter_mount_list() chooses the bind
 mountpoint since it has the leading slash in the device name
 (/run/cgs/httpd vs run) which is wrong in my setup.
 
 The similar (but not the same) issue is fixed by commit:
 http://git.savannah.gnu.org/cgit/coreutils.git/commit/src/df.c?id=ed1a495b3ccb2665a13229ca866f2115bd768d17
 
 I guess the "let real devices with / in the name win" replacement branch
 should only be applied if mountpoints are the same as well.
 
 Below is the data to reproduce the bug.
 
 === /etc/mtab (partial) ===
 run /run tmpfs rw,noatime,nodiratime,nodev,noexec,mode=0755,size=1m 0 0
 /run/cgs/httpd /usr/cgs/httpd/run none rw,bind 0 0
 ==
 
 === Real output (git) ===
 Filesystem  Size  Used Avail Use% Mounted on
 /run/cgs/httpd  1.0M  8.0K 1016K   1% /usr/cgs/httpd/run
 ==
 
 === Expected output (with the attached patch applied) ===
 Filesystem  Size  Used Avail Use% Mounted on
 run 1.0M  8.0K 1016K   1% /run
 ==
 

Thanks for the analysis and patch,
Current tests pass at least with it.
I'll analyse a little more, add tests and probably push.

thanks!
Pádraig.





bug#19544: RFE: please fix limited dd output control (want xfer stats, but not blocks).

2015-01-09 Thread Pádraig Brady
On 09/01/15 11:18, Linda Walsh wrote:
 
 The blocks are a bit uninteresting:
 
 7+0 records in
 7+0 records out
 6+0 records in
 
 
 11+0 records out
 8+0 records in
 8+0 records out
 2+0 records in
 ...
 2+0 records out
 15+0 records in
 15+0 records out
 ---
 
 Tells me nothing -- not size of recs, nor time.. nothing interesting.
 
 What I'd rather see:
 
 983040 bytes (983 KB) copied, 0.0135631 s, 72.5 MB/s
 327680 bytes (328 KB) copied, 0.00869602 s, 37.7 MB/s
 393216 bytes (393 KB) copied, 0.00978036 s, 40.2 MB/s
 458752 bytes (459 KB) copied, 0.00906681 s, 50.6 MB/s
 ...
 65536 bytes (66 KB) copied, 0.00843794 s, 7.8 MB/s
 65536 bytes (66 KB) copied, 0.00845365 s, 7.8 MB/s
 983040 bytes (983 KB) copied, 0.0128341 s, 76.6 MB/s
 262144 bytes (262 KB) copied, 0.01019 s, 25.7 MB/s
 262144 bytes (262 KB) copied, 0.00933135 s, 28.1 MB/s
 589824 bytes (590 KB) copied, 0.0124597 s, 47.3 MB/s
 1048576 bytes (1.0 MB) copied, 0.0138104 s, 75.9 MB/s
 ---

There is a new status=progress option that will
output the above format every second, but on
a single updated line.
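
For example, a sketch assuming a GNU dd recent enough to have status=progress (coreutils 8.24 or later, as far as I know). The second command is only a possible workaround for the request above, not an option dd provides:

```shell
# status=progress periodically rewrites a single statistics line on stderr
# while the copy runs, then prints the usual summary.
dd if=/dev/zero of=/dev/null bs=64k count=16 status=progress

# To keep only the transfer statistics and drop the record counts,
# take the last line of dd's stderr report.
stats=$(dd if=/dev/zero of=/dev/null bs=64k count=16 2>&1 | tail -n 1)
echo "$stats"
```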

 (Which, BTW, uses program intelligence to use the same output units as
 the user used for input units, rather than giving them units in an
 unfamiliar dialect).

This has been discussed previously.

thanks,
Pádraig






bug#19530: Solaris 10 df and zfs

2015-01-07 Thread Pádraig Brady
On 07/01/15 22:13, Ted Carr wrote:
 Pádraig,
 
 Here is what I get:
 
 # ./stat -f /  
   File: /
 ID: 4010002  Namelen: 255 Type: zfs
 Block size: 131072 Fundamental block size: 512
 Blocks: Total: 106248371  Free: 83645114   Available: 83645114
 Inodes: Total: 83922472   Free: 83645114

Cool, we're getting 51G from the system.
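
The arithmetic behind that, as a sketch. It assumes the block counts are in units of the 512 byte fundamental block size, and that df -h rounds up to the next whole unit; the helper name is hypothetical:

```shell
frsize=512          # "Fundamental block size" from stat -f
total=106248371     # Blocks: Total
free=83645114       # Blocks: Free
gib=1073741824      # bytes per GiB (2^30)

# Ceiling division of a block count into whole GiB.
to_gib() { echo $(( ($1 * frsize + gib - 1) / gib )); }

size_g=$(to_gib "$total")
avail_g=$(to_gib "$free")
used_g=$(to_gib $((total - free)))
echo "Size ${size_g}G  Used ${used_g}G  Avail ${avail_g}G"
```

This gives 51G, 11G and 40G, matching the GNU df line for / rather than the 67G that Solaris df reports.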

 
 That said ... Check this out:
 
 Before the patch it was reporting this for root:
 
 Filesystem                                                      Size  Used Avail Use% Mounted on
 /platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1    51G   11G   40G  22% /platform/sun4u-us3/lib/sparcv9/libc_psr.so.1
 
 Which is what I see in the output from SUN df in these two lines (size 
 matches):
 
 /platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap1.so.1
                         51G    11G    40G    22%    /platform/sun4u-us3/lib/libc_psr.so.1
 /platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1
                         51G    11G    40G    22%    /platform/sun4u-us3/lib/sparcv9/libc_psr.so.1

40 + 11 = 51

 Running a SUN df on one of the above FS gives this:
 
 # df -h /platform/sun4u-us3/lib/libc_psr.so.1
 Filesystem             size   used  avail capacity  Mounted on
 rpool/ROOT/q414         67G    11G    40G    22%    /
 
 # zpool list rpool
 NAME   SIZE  ALLOC   FREE  CAP  HEALTH  ALTROOT
 rpool   68G  26.8G  41.2G  39%  ONLINE  -
 
 Not sure if that helps or not...
 
 I did find this: 
 http://www.c0t0d0s0.org/archives/6168-df-considered-problematic.html.  About 
 half way down the page you will see Digging in the source which may help, 
 or not. ;-)

So Solaris df seems to do further munging of the sizes to
handle deduplication and what not, resulting in a virtual total.
I.E. total != used + avail.
I'm not sure why such details need to be exposed to the user TBH.

We'll keep special handling of file systems like these
under consideration for future releases.

thanks,
Pádraig






bug#19530: Solaris 10 df and zfs

2015-01-07 Thread Pádraig Brady
On 07/01/15 17:00, Ted Carr wrote:
 Hello All,
 
  
 
 I have a requirement for the latest version of coreutils on Solaris 10 SPARC 
 and everything is working as expected with the exception of ‘df’ on ZFS based 
 filesystems… 
 
 It is having an issue correctly displaying the file system mounted on root.
 
 SUN df:
 # /usr/bin/df -hl
 Filesystem             size   used  avail capacity  Mounted on
 rpool/ROOT/q414         67G    11G    40G    22%    /
 /platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap1.so.1
                         51G    11G    40G    22%    /platform/sun4u-us3/lib/libc_psr.so.1
 /platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1
                         51G    11G    40G    22%    /platform/sun4u-us3/lib/sparcv9/libc_psr.so.1
 
 GNU df:
 # /var/tmp/coreutils-8.23/src/df -hl
 
 Filesystem                           Size  Used Avail Use% Mounted on
 /platform/.../libc_psr_hwcap1.so.1    51G   11G   40G  22% /platform/.../sparcv9/libc_psr.so.1

Could you show the output from df -a -hl?

 If I force it to just display / I see:
 
 # /var/tmp/coreutils-8.23/src/df -h /
 Filesystem   Size  Used Avail Use% Mounted on
 rpool/ROOT/q414   51G   11G   40G  22% /

There have been a few changes to df since 8.23.
Could you patch df.c with this (or replace with this version) and recompile?
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=ed1a495b

thanks,
Pádraig





bug#19520: shuf: extra operand handling with -i

2015-01-05 Thread Pádraig Brady
On 06/01/15 03:42, Paul Eggert wrote:
 Thanks for reporting that.  I installed the attached patch, which has a test
 for the bug.

It's probably worth mentioning the crash in NEWS
and augmenting the test to detect crashes in other situations,
which I've done in the attached.

thanks,
Pádraig.
From 2c884ee394c0a6419c9b517e79d42df3425ef2be Mon Sep 17 00:00:00 2001
From: Daiki Ueno u...@gnu.org
Date: Tue, 6 Jan 2015 03:36:57 +
Subject: [PATCH] maint: adjustments related to previous shuf crash fix

* tests/misc/shuf.sh: Improve the test so it detects
crashes in more cases.
* NEWS: Mention the previous fix.
---
 NEWS   |  3 +++
 tests/misc/shuf.sh | 17 +
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/NEWS b/NEWS
index b81154d..f59bfc1 100644
--- a/NEWS
+++ b/NEWS
@@ -32,6 +32,9 @@ GNU coreutils NEWS-*- outline -*-
   rm indicates the correct number of arguments in its confirmation prompt,
   on all platforms.  [bug introduced in coreutils-8.22]
 
+  shuf -i with a single redundant operand, would crash instead of issuing
+  a diagnostic.  [bug introduced in coreutils-8.22]
+
 ** New features
 
   chroot accepts the new --skip-chdir option to not change the working directory
diff --git a/tests/misc/shuf.sh b/tests/misc/shuf.sh
index 5e85d9a..34f4225 100755
--- a/tests/misc/shuf.sh
+++ b/tests/misc/shuf.sh
@@ -47,7 +47,8 @@ test "$t" = 'a b c d e' || { fail=1; echo "not a permutation" 1>&2; }
 shuf -er
 test $? -eq 1 || fail=1
 
-# coreutils-8.23 dumps core.
+# coreutils-8.22 and 8.23 dump core
+# with a single redundant operand with --input-range
 shuf -i0-0 1
 test $? -eq 1 || fail=1
 
@@ -70,7 +71,7 @@ touch unreadable || framework_failure_
 chmod 0 unreadable || framework_failure_
 if ! test -r unreadable; then
   shuf -n0 unreadable || fail=1
-  shuf -n1 unreadable && fail=1
+  { shuf -n1 unreadable || test $? -ne 1; } && fail=1
 fi
 
 # Multiple -n is accepted, should use the smallest value
@@ -81,25 +82,25 @@ test "$c" -eq 3 || { fail=1; echo "Multiple -n failed">&2 ; }
 # Test error conditions
 
 # -i and -e must not be used together
-: | shuf -i0-9 -e A B &&
+: | { shuf -i0-9 -e A B || test $? -ne 1; } &&
   { fail=1; echo "shuf did not detect erroneous -e and -i usage.">&2 ; }
 # Test invalid value for -n
-: | shuf -nA &&
+: | { shuf -nA || test $? -ne 1; } &&
   { fail=1; echo "shuf did not detect erroneous -n usage.">&2 ; }
 # Test multiple -i
-shuf -i0-9 -n10 -i8-90 &&
+{ shuf -i0-9 -n10 -i8-90 || test $? -ne 1; } &&
   { fail=1; echo "shuf did not detect multiple -i usage.">&2 ; }
 # Test invalid range
 for ARG in '1' 'A' '1-' '1-A'; do
-  shuf -i$ARG &&
+{ shuf -i$ARG || test $? -ne 1; } &&
 { fail=1; echo "shuf did not detect erroneous -i$ARG usage.">&2 ; }
 done
 
 # multiple -o are forbidden
-shuf -i0-9 -o A -o B &&
+{ shuf -i0-9 -o A -o B || test $? -ne 1; } &&
   { fail=1; echo "shuf did not detect erroneous multiple -o usage.">&2 ; }
 # multiple random-sources are forbidden
-shuf -i0-9 --random-source A --random-source B &&
+{ shuf -i0-9 --random-source A --random-source B || test $? -ne 1; } &&
   { fail=1; echo "shuf did not detect multiple --random-source usage.">&2 ; }
 
 # Test --repeat option
-- 
2.1.0



bug#19503: most translations of proper names aren't being used

2015-01-04 Thread Pádraig Brady
On 04/01/15 16:50, Jim Meyering wrote:
 On Sun, Jan 4, 2015 at 7:53 AM, Pádraig Brady p...@draigbrady.com wrote:
 ...
 Also there is the more general point about how correct
 it is to attribute a program to author(s) in any case,
 as that is tracked at a much more accurate level of detail
 by git blame etc.  Should we be removing output of
 author names at runtime completely?
 
 We cannot do that blindly, since we lack version control history
 from before 1992, which would make it appear that David J. MacKenzie
 (who wrote many of these tools from scratch) contributed nothing.

Well we'd still leave the  /* Written by ... */ comments at the start.
I'm just not convinced of the need for attribution at runtime,
given that it's inaccurate and awkward to represent.

BTW, it might have been nice to have the initial git commits
for these tools attributed to the original author. Hindsight and all that :)
Also I was wondering recently about the origins of some of this code,
and thought it might be useful to have a repo with commits per
release, which could be obtained from various old tar balls.
I did notice a few pre 1992 tarballs. I wonder what the best
source of these would be.

cheers,
Pádraig.





bug#19228: Challenging output from help2man split

2015-01-01 Thread Pádraig Brady
On 30/11/14 17:33, Pádraig Brady wrote:
 On 30/11/14 17:06, Kevin O'Gorman wrote:
 This is sent to the bug addresses for both help2man and split because I'm 
 not sure where the fault lies.  Both are GNU projects, so I hope you can 
 cooperate and figure it out.

 In any event, the output of help2man split loses some line breaks and 
 makes an itemized list somewhat hard to understand.  This is the portion 
 that begins CHUNKS may be:

 I'm an Xubuntu user, and this output is the man page we get.  I expect the 
 same is true of some or all debian-based distros, and perhaps others as well.
 
 Yes we should improve this. with the following simple patch
 we get better output in the man page,

I've now pushed that and marking this as done.

 though there are extraneous newlines then

I'm not sure if tighter formatting is possible
or appropriate for man pages.

thanks,
Pádraig.






bug#19476: Poor output from help2man split

2014-12-31 Thread Pádraig Brady
forcemerge 19228 19476
stop






bug#19228: bug#19476: Poor output from help2man split

2014-12-31 Thread Pádraig Brady
On 31/12/14 15:32, Kevin O'Gorman wrote:
 
 
 On Wed, Dec 31, 2014 at 4:06 AM, Pádraig Brady p...@draigbrady.com wrote:
 
 forcemerge 19228 19476
 stop
 
 
 I have no clue what that means.

Just merging with the same issue you raised a month ago.
I'll have it improved for the upcoming release.

thanks,
Pádraig






bug#19456: GNU coreutils - touch / add -v, --verbose option

2014-12-28 Thread Pádraig Brady
On 28/12/14 08:33, Jari Aalto wrote:
 
 It would be nice to see progress of touched files. Please
 add option[1]:
 
   -v, --verbose
 
 Jari
 
 [1] Not included in touch(1), GNU coreutils 8.23

Maybe. What's your use case exactly?
In other tools that have --verbose output,
there is the opportunity to distinguish operations,
or identify when possibly long-running operations are finished.
Neither is the case though for touch.

If you just wanted to see each block of files
that touch is processing, perhaps using tee would suffice, like:

  find ... | tee /dev/tty | xargs touch ...
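
A minimal sketch of that pipeline, assuming a scratch directory and made-up
file names; a log file stands in for /dev/tty here so the list of names can
be inspected afterwards:

```shell
tmp=$(mktemp -d)

# tee records each name before xargs hands it on to touch.
printf '%s\n' "$tmp/a" "$tmp/b" "$tmp/c" | tee "$tmp/names.log" | xargs touch

logged=$(wc -l < "$tmp/names.log")            # names seen by tee
created=$(ls "$tmp" | grep -c -v names.log)   # files created by touch
rm -rf "$tmp"
```

Both counts should agree, since tee passes its input through unchanged.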

cheers,
Pádraig.






bug#19447: chmod - problem

2014-12-26 Thread Pádraig Brady
tag 19447 notabug
close 19447
stop

On 26/12/14 18:28, Tom wrote:
 Hi
 
 chmod does not work recursively. The command
 
 chmod --recursive --verbose a-x ./*.txt
 
 only has effects in the actual working directory, but not in the 
 subdirectories.

You're passing only .txt files to chmod here
so the --recursive option is ineffective.
You probably want something like:

  find . -name '*.txt' -print0 | xargs -r0 chmod --verbose a-x
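
To illustrate the difference, here is a small sketch assuming GNU chmod,
find and stat; the directory layout and file names are invented for the
example:

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
touch "$tmp/top.txt" "$tmp/sub/nested.txt"
chmod 755 "$tmp/top.txt" "$tmp/sub/nested.txt"

# The shell expands ./*.txt before chmod runs, so only top.txt is passed
# and --recursive has no directory to descend into.
(cd "$tmp" && chmod --recursive a-x ./*.txt)
glob_nested=$(stat -c %a "$tmp/sub/nested.txt")

# find walks the tree itself and passes every match to chmod.
(cd "$tmp" && find . -name '*.txt' -print0 | xargs -r0 chmod a-x)
find_nested=$(stat -c %a "$tmp/sub/nested.txt")
rm -rf "$tmp"
```

After the first command the nested file keeps its execute bits; only the
find pipeline reaches it.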

cheers,
Pádraig





bug#19375: closed (Re: bug#19377: bug#19378: [PATCH 3/4] cat, chcon, chgrp, chmod, chown, cp, du, head: support wildcards on OS/2)

2014-12-18 Thread Pádraig Brady
reopen 19377
stop

On 19/12/14 01:13, KO Myung-Hun wrote:
 
 GNU bug Tracking System wrote:
 Your bug report

 #19377: [PATCH 1/4] doc: add $(EXEEXT) suffix to the executables

 which was filed against the coreutils package, has been closed.

 The explanation is attached below, along with your original report.
 If you require more details, please reply to 19...@debbugs.gnu.org.


 
 #19375 was closed without being applied. Is there a problem?

Jim wasn't aware I'd merged all these bugs to 19377
(which was the top level summary mail), as having
separate bugs per patch was confusing and overkill.

Anyway don't worry, we'll apply all that's appropriate.
I've just now applied this particular patch as it's fine.

thanks!
Pádraig.





bug#19377: bug#19378: [PATCH 3/4] cat, chcon, chgrp, chmod, chown, cp, du, head: support wildcards on OS/2

2014-12-15 Thread Pádraig Brady
On 15/12/14 01:15, KO Myung-Hun wrote:
 
 
 Pádraig Brady wrote:
 forcemerge 19378 19377
 stop

 On 14/12/14 03:47, KO Myung-Hun wrote:
 And ln,ls,mv,rm,tail.

 * src/cat.c (main): Expand wildcards on OS/2.
 * src/chcon.c (main): Likewise.
 * src/chgrp.c (main): Likewise.
 * src/chmod.c (main): Likewise.
 * src/chown.c (main): Likewise.
 * src/cp.c (main): Likewise.
 * src/du.c (main): Likewise.
 * src/head.c (main): Likewise.
 * src/ln.c (main): Likewise.
 * src/ls.c (main): Likewise.
 * src/mv.c (main): Likewise.
 * src/rm.c (main): Likewise.
 * src/tail.c (main): Likewise.

 Patches from coreutils 8.8 by Paul Smedley.

 diff --git a/src/cat.c b/src/cat.c
 index c7bb7e1..0138114 100644
 --- a/src/cat.c
 +++ b/src/cat.c
 @@ -544,6 +544,10 @@ main (int argc, char **argv)
bool show_tabs = false;
int file_open_mode = O_RDONLY;
  
 +#ifdef __OS2__
 +  _wildcard (argc, argv);
 +#endif
 +

 Interesting, the OS/2 shell doesn't do the globbing.
 
 Ported unixy shells(sh) support it, but OS/2 default shell(CMD) does not.
 
 I'm wondering about the scalability of this.
 Are there any facilities for dealing with arbitrary numbers
 of files, like with xargs for example?
 
 No. It always processes all files.
 
 What are the practical limits of the number of files?
 
 It's limited by free memory.
 
 Does _wildcard() exit with an error in this case?

 
 It calls exit(255) after printing an error message.
 

While the adjustment is small, it would be better to avoid the ifdef in all 
programs.
I think there is a -Zwildcard option to auto enable for all programs?
Also is there an option to disable this expansion at runtime
(which should be documented if available)?
For example to allow deleting a file called '*', which seems like a more likely
occurrence on this platform.
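
For comparison, on a Unix shell the caller controls expansion through
quoting, which is what makes a file named '*' deletable there. A small
sketch with made-up file names:

```shell
tmp=$(mktemp -d)
touch "$tmp/*" "$tmp/keep.txt"

# Quoting stops the shell from globbing, so rm receives one literal
# argument and removes only the file named '*'.
(cd "$tmp" && rm -- '*')

remaining=$(ls "$tmp")
rm -rf "$tmp"
```

If the program itself expands wildcards, as with _wildcard() in the patch,
shell quoting alone would presumably not give this protection.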

thanks,
Pádraig.





bug#19375: [PATCH 1/4] doc: add $(EXEEXT) suffix to the executables

2014-12-14 Thread Pádraig Brady
forcemerge 19375 19377
stop

On 14/12/14 03:47, KO Myung-Hun wrote:
 * man/local.mk: Add $(EXEEXT) suffix to the executables.

LGTM.

thanks,
Pádraig.






bug#19374: [PATCH 2/4] build: configure.ac: support a response file on OS/2

2014-12-14 Thread Pádraig Brady
forcemerge 19374 19377
stop

On 14/12/14 03:47, KO Myung-Hun wrote:
 * configure.ac (LDFLAGS): Add -Zargs-resp on os2*.

This imparts no information.  Please comment as to the why
rather than the what.

 ---
  configure.ac | 3 +++
  1 file changed, 3 insertions(+)
 
 diff --git a/configure.ac b/configure.ac
 index 0744964..7cb1085 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -589,6 +589,9 @@ AM_GNU_GETTEXT_VERSION([0.18.1])
  # For a test of uniq: it uses the $LOCALE_FR envvar.
  gt_LOCALE_FR
  
 +# In order to support a response file on OS/2
 +AS_CASE([$host_os], [os2*], [LDFLAGS="$LDFLAGS -Zargs-resp"])

This relies on an implicit AC_CANONICAL_HOST which seems brittle.

thanks,
Pádraig





bug#19377: bug#19378: [PATCH 3/4] cat, chcon, chgrp, chmod, chown, cp, du, head: support wildcards on OS/2

2014-12-14 Thread Pádraig Brady
forcemerge 19378 19377
stop

On 14/12/14 03:47, KO Myung-Hun wrote:
 And ln,ls,mv,rm,tail.
 
 * src/cat.c (main): Expand wildcards on OS/2.
 * src/chcon.c (main): Likewise.
 * src/chgrp.c (main): Likewise.
 * src/chmod.c (main): Likewise.
 * src/chown.c (main): Likewise.
 * src/cp.c (main): Likewise.
 * src/du.c (main): Likewise.
 * src/head.c (main): Likewise.
 * src/ln.c (main): Likewise.
 * src/ls.c (main): Likewise.
 * src/mv.c (main): Likewise.
 * src/rm.c (main): Likewise.
 * src/tail.c (main): Likewise.
 
 Patches from coreutils 8.8 by Paul Smedley.

 diff --git a/src/cat.c b/src/cat.c
 index c7bb7e1..0138114 100644
 --- a/src/cat.c
 +++ b/src/cat.c
 @@ -544,6 +544,10 @@ main (int argc, char **argv)
bool show_tabs = false;
int file_open_mode = O_RDONLY;
  
 +#ifdef __OS2__
 +  _wildcard (argc, argv);
 +#endif
 +

Interesting, the OS/2 shell doesn't do the globbing.
I'm wondering about the scalability of this.
Are there any facilities for dealing with arbitrary numbers
of files, like with xargs for example?
What are the practical limits of the number of files?
Does _wildcard() exit with an error in this case?

thanks,
Pádraig






bug#19377: bug#19376: [PATCH 4/4] build: use -pi.bak instead of -pi

2014-12-14 Thread Pádraig Brady
forcemerge 19376 19377
stop

On 14/12/14 03:47, KO Myung-Hun wrote:
 This fixes the following error.
 
 -
 Can't do inplace edit without backup.
 -
 
 * Makefile.am (dist-hook): Use -pi.bak instead of -pi.
 * bootstrap.conf (bootstrap_epilogue): Likewise.
 ---
  Makefile.am| 2 +-
  bootstrap.conf | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/Makefile.am b/Makefile.am
 index fb4af27..371eb59 100644
 --- a/Makefile.am
 +++ b/Makefile.am
 @@ -105,7 +105,7 @@ BUILT_SOURCES = .version
  # See the rm_subst comment for details.
  dist-hook: gen-ChangeLog
   $(AM_V_GEN)echo $(VERSION) > $(distdir)/.tarball-version
 - $(AM_V_at)perl -pi -e '$(rm_subst)' $(distdir)/Makefile.in
 + $(AM_V_at)perl -pi.bak -e '$(rm_subst)' $(distdir)/Makefile.in
  
  gen_start_date = 2008-02-08
  .PHONY: gen-ChangeLog
 diff --git a/bootstrap.conf b/bootstrap.conf
 index c0b5f02..0baf455 100644
 --- a/bootstrap.conf
 +++ b/bootstrap.conf
 @@ -366,7 +366,7 @@ bootstrap_epilogue()
# Why?  That pipeline searches all files in $(top_srcdir), and if you
# happen to have large files (or apparently large sparse files), the
# first grep may well run out of memory.
 -  perl -pi -e 's/if LC_ALL=C grep .GNU .PACKAGE.*; then/if true; then/' \
 +  perl -pi.bak -e 's/if LC_ALL=C grep .GNU .PACKAGE.*; then/if true; then/' \
  po/Makefile.in.in
  
# Install our git hooks, as long as cp accepts the --backup option,

This will leave .bak files in place on all platforms, which isn't ideal.
Pity `perl -i` doesn't handle the platform differences transparently.
Does sed -i behave better? That's less portable, though it could be tried
first, falling back to perl -i.

thanks,
Pádraig.






bug#19305: Integrate dd wrapper progress-dd with coreutils

2014-12-08 Thread Pádraig Brady
On 08/12/14 14:29, Sebastian Pipping wrote:
 Hi there,
 
 
 I would like to see the dd wrapper progress-dd integrated into coreutils.
 All it does is to keep sending USR1 to a child dd process to increase
 usability of that dd feature.
 It's up here:
 
 http://git.goodpoint.de/?p=progress-dd.git;a=blob;f=progress-dd
 
 I'm happy to fix any potential issues with it if you're interested in
 general.

In general reusing existing tools with wrappers is a useful technique,
and often preferred over adding options to the tools themselves.
Though in this case we thought it more appropriate to add the status=progress
option to dd itself, as that has the added advantage of updating a single line:

  http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=af2a4ed2

Note also the following related commit which makes the handling of SIGUSR1
free from races. See the example in coreutils.texi in the following
to avoid the startup races:

  http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=27d2c738

The above two patches will be in the next coreutils release.
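
A quick way to try the new option once it is available; this sketch assumes
a dd built with the status=progress commit above, and the block size and
count are arbitrary:

```shell
log=$(mktemp)

# Progress and the final statistics go to stderr, captured here in a file.
dd if=/dev/zero of=/dev/null bs=1024 count=2048 status=progress 2> "$log"
dd_status=$?

summary=$(tail -n 1 "$log")   # last line of dd's stderr output
rm -f "$log"
```

For a copy this short, the periodic updates may never appear and only the
final statistics are printed.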

thanks,
Pádraig.





bug#19238: Fully fix du circular warning on bind mounts

2014-12-01 Thread Pádraig Brady
On 01/12/14 08:54, Boris Ranto wrote:
 The du circular warning can still be hit even though a file system is in
 good condition. All we need to do to get the message is to begin
 traversing the file system between the bind mount source and bind mount
 target directories, i.e. this short script reproduces the problem:
 
 # mkdir -p a/b/c
 # mount -o bind a a/b/c
 # du a/b
 
 The problem is that in this case, the first directory that is detected
 by fts as a duplicate directory is directory a/b/c/b which is not a
 mount point.
 
 The solution is to traverse the structure all the way to a/b (excluding
 a/b) which is detected as the base of the cycle and look up all these
 directories in the mount table.
 
 I'm attaching the patch that fixed this problem for me.

Very nice. Thanks for the test!
I'll add a NEWS entry and push later.

thanks!
Pádraig






bug#19184: coreutils-8.23 Compile Error on Solaris 8 - Sun Studio 11

2014-12-01 Thread Pádraig Brady
On 01/12/14 13:58, Ted Carr wrote:
 Hi All,
 
 Sorry for the delay...
 
 Requested output:
 
 bash-2.03$ grep gl_cv_func_printf_sizes_c99 config.log
 gl_cv_func_printf_sizes_c99=no

Thanks for the confirmation.
Your issue should be addressed in the upcoming 8.24 release
that contains these changes:

http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=a78d8538
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=7d1fe886
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=458c5cbc

cheers,
Pádraig.





bug#19240: cut 8.22 adds newline

2014-12-01 Thread Pádraig Brady
On 01/12/14 21:18, Eric Blake wrote:
 [re-adding the bug, with permission]
 
 On 12/01/2014 01:10 PM, John Kendall wrote:
 Thanks, Eric.

 My only, admittedly weak, rebuttal is that the behavior of sort might not 
 be the best behavior to imitate.  It's understandable why POSIX defines 
 how sort behaves, since it's intended for multi-line input.

 It seems sed, which is frequently used for single lines of input, might be 
 a better analogy.  Gnu sed 4.2.2 and solaris sed act the same way as 
 solaris cut (no newline added):

 $ printf ooo | sed 's/o/p/g'
 ppp$
 
 As a counter-argument, I recall hearing of other implementations of sed
 that silently omit a trailing line that lacks a newline.  And perhaps
 GNU sed should be changed to always emit a trailing newline, but that's
 something to bring up on the sed mailing list :)

I don't think so.
I agree that a newline should only be added where needed,
especially with a low level tool like sed.

sort can reorder the last item elsewhere in the output
and so needs to output the extra '\n'.

BTW the argument that it's not a text file is a bit beside the point
as POSIX also says text files can't contain NUL chars, but we process
this just fine:

  $ printf 'a\000b' | cut -c3
  b

 If my weak rebuttal is unconvincing, then I wonder if a note could be 
 added to the cut man page so that the next porter can find an answer 
 a little easier.   As an interesting counterpoint, the Solaris version of
 sort announces loudly when it does what POSIX requires:

 $ printf ooo | sort
 sort: missing NEWLINE added at end of input file STDIN
 ooo
 $
 
 Ouch - that's a bug in Solaris.  POSIX does not allow for noise on
 stderr when giving a default 0 success exit status.
 



 Thanks for taking the time to clarify this.  I've been using SunOS and 
 Solaris exclusively since 1992, so I've had a stable environment and 
 was oblivious to the unspecified behavior that my scripts depended on.  

 Cheers,
 John

 
 I'll leave it to other contributors to weigh in on whether omitting the
 final newline on output when it was missing on input is worth the
 complexity of a change.

Our current behaviour wrt newlines is documented at:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=tests/misc/cut.pl;h=04188621b#l132
though those tests were only added in v8.21

Note I see that solaris is inconsistent with -c and -f in this regard:

solaris> printf '1\n2' | cut -c1
1
2solaris>

solaris> printf '1\n2' | cut -f1
1
2
solaris>

I kid you not that FreeBSD does the opposite and outputs
the extra '\n' in the -c case but not with -f.

Also comparing other tools like uniq we have:

solaris> printf '1' | uniq
solaris> (nothing output!)

freebsd> printf '1' | uniq
1freebsd>

coreutl> printf '1' | uniq
1
coreutl>


If we were just implementing now, I'd not output the extra '\n',
but changing at this stage needs to be carefully considered,
and with all the textutils, not just cut(1).
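
The differences above can be checked without reading prompts by counting
output bytes. A small sketch, assuming the GNU versions of cut, sed and
uniq discussed in this thread:

```shell
# Input has no trailing newline in each case; wc -c counts what comes out.
cut_bytes=$(printf 'ooo' | cut -c1- | wc -c)
sed_bytes=$(printf 'ooo' | sed 's/o/p/g' | wc -c)
uniq_bytes=$(printf '1' | uniq | wc -c)

echo "cut: $cut_bytes  sed: $sed_bytes  uniq: $uniq_bytes"
```

A count one larger than the input length means the tool supplied the
missing '\n'; other implementations may report different numbers, as the
Solaris and FreeBSD transcripts show.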

thanks,
Pádraig.





bug#19240: cut 8.22 adds newline

2014-12-01 Thread Pádraig Brady
On 01/12/14 23:06, Eric Blake wrote:
 On 12/01/2014 03:06 PM, Pádraig Brady wrote:
 
 BTW the argument that it's not a text file is a bit beside the point
 as POSIX also says text files can't contain NUL chars, but we process
 this just fine:

   $ printf 'a\000b' | cut -c3
   b
 
 The fact that GNU offers an extension where we gracefully handle NUL
 bytes is a bonus of GNU, and does not change the fact that POSIX already
 says we are in unspecified territory and can do whatever we deem most
 useful.  I suspect that in multibyte locales with non-character encoding
 errors, the behavior becomes harder to pinpoint on what makes the most
 sense - but again, that is another aspect that makes a file binary
 rather than text and therefore falls under unspecified behavior.
 
 
 Also comparing other tools like uniq we have:

 solaris> printf '1' | uniq
 solaris> (nothing output!)

 freebsd> printf '1' | uniq
 1freebsd>

 coreutl> printf '1' | uniq
 1
 coreutl>
 
 What about:
 printf '1\n1' | uniq

Both solaris and FreeBSD behave like GNU with that input.

 GNU treats the two lines as identical (and thus supplied a missing \n on
 the second line); but I don't have ready access to test the other two as
 I type this.
 
 If we were just implementing now, I'd not output the extra '\n',
 but changing at this stage needs to be carefully considered,
 and with all the textutils, not just cut(1).
 
 I tend to go the opposite - producing text output, even on non-text
 input, is more likely to be useful when piping files to other utilities
 that don't handle non-text files as gracefully as the coreutils.  But I
 definitely agree that it is not something we change lightly.
 

cheers,
Pádraig.





bug#19228: Challenging output from help2man split

2014-11-30 Thread Pádraig Brady
On 30/11/14 17:06, Kevin O'Gorman wrote:
 This is sent to the bug addresses for both help2man and split because I'm not 
 sure where the fault lies.  Both are GNU projects, so I hope you can 
 cooperate and figure it out.
 
 In any event, the output of help2man split loses some line breaks and makes 
 an itemized list somewhat hard to understand.  This is the portion that 
 begins CHUNKS may be:
 
 I'm an Xubuntu user, and this output is the man page we get.  I expect the 
 same is true of some or all debian-based distros, and perhaps others as well.

Yes we should improve this. With the following simple patch
we get better output in the man page, though there are
extraneous newlines then. I'll look at improving further.

thanks,
Pádraig.

diff --git a/src/split.c b/src/split.c
index 0eec3ec..0057267 100644
--- a/src/split.c
+++ b/src/split.c
@@ -234,12 +234,12 @@ is -, read standard input.\n\
   emit_size_note ();
   fputs (_("\n\
 CHUNKS may be:\n\
-N   split into N files based on size of input\n\
-K/N output Kth of N to stdout\n\
-l/N split into N files without splitting lines\n\
-l/K/N   output Kth of N to stdout without splitting lines\n\
-r/N like 'l' but use round robin distribution\n\
-r/K/N   likewise but only output Kth of N to stdout\n\
+ N   split into N files based on size of input\n\
+ K/N output Kth of N to stdout\n\
+ l/N split into N files without splitting lines\n\
+ l/K/N   output Kth of N to stdout without splitting lines\n\
+ r/N like 'l' but use round robin distribution\n\
+ r/K/N   likewise but only output Kth of N to stdout\n\
 "), stdout);
   emit_ancillary_info (PROGRAM_NAME);
 }
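
As an illustration of one of the CHUNKS forms listed in that help text,
here is a sketch of r/K/N, assuming GNU split and seq; the input file is
made up for the example:

```shell
tmp=$(mktemp -d)
seq 10 > "$tmp/in"

# r/2/3: distribute lines round robin over 3 chunks and print only the 2nd.
kth=$(split -n r/2/3 "$tmp/in" | tr '\n' ' ')
rm -rf "$tmp"

echo "$kth"
```

With round robin distribution, the second of three chunks receives every
third line starting from line 2.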






bug#18499: Possible mv race for hardlinks (rhbz #1141368 )

2014-11-29 Thread Pádraig Brady
On 21/11/14 03:30, Pádraig Brady wrote:
 We want to leave the logic in place for cp and install though,
 and I've adjusted your patch accordingly. I've also adjusted
 the tests to pass and augmented the tests to cover one of
 the cases missed in the previous patch.  I'll push this tomorrow.

There was a small window where attachments were being stripped
from gnu mailing list emails, that the above hit unfortunately.
So for the record, the patch referenced above is:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=222d7ac0

There was a problem with that though, identified by the darwin job
added to the hydra continuous integration system today:
http://hydra.nixos.org/job/gnu/coreutils-master/build.x86_64-darwin

I pushed the attached patch to address that.

thanks,
Pádraig.
From 6f16c63963b0624cbcbf285fb936b79276c047de Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= p...@draigbrady.com
Date: Sat, 29 Nov 2014 22:42:04 +0000
Subject: [PATCH] tests: avoid hardlink to symlink tests where not supported

These checks weren't correctly avoided in commit v8.23-66-g222d7ac

* tests/cp/same-file.sh: Avoid all hardlink to symlink tests
on platforms where that's not supported.
Identified by http://hydra.nixos.org/build/17636446
---
 tests/cp/same-file.sh | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/tests/cp/same-file.sh b/tests/cp/same-file.sh
index 54d23a5..242c54b 100755
--- a/tests/cp/same-file.sh
+++ b/tests/cp/same-file.sh
@@ -36,7 +36,7 @@ ln dangling-slink hard-link > /dev/null 2>&1 \
 rm -f no-such dangling-slink hard-link
 
 test $hard_link_to_symlink_does_the_deref = yes \
-    && remove_these_sed='/^0 -[bf]*l .*sl1 ->/d' \
+    && remove_these_sed='/^0 -[bf]*l .*sl1 ->/d; /hlsl/d' \
     || remove_these_sed='/^ELIDE NO TEST OUTPUT/d'
 
 exec 3>&1 1> actual
@@ -71,11 +71,13 @@ for args in 'foo symlink' 'symlink foo' 'foo foo' 'sl1 sl2' \
 # cont'd  Instead, skip them only on systems for which link does
 # dereference a symlink.  Detect and skip such tests here.
 case $hard_link_to_symlink_does_the_deref:$args:$options in
-  yes:*sl2:-fl)
+  'yes:sl1 sl2:-fl')
 continue ;;
-  yes:*sl2:-bl)
+  'yes:sl1 sl2:-bl')
 continue ;;
-  yes:*sl2:-bfl)
+  'yes:sl1 sl2:-bfl')
+continue ;;
+  yes:hlsl*)
 continue ;;
 esac
 
-- 
2.1.0



bug#19218: Inconsistent spacing of output of ls --full-time [file argument]

2014-11-29 Thread Pádraig Brady
tag 19218 notabug
close 19218
stop

On 29/11/14 21:14, Paul Eggert wrote:
 I don't see a bug in the cases you mention.  First, 'ls' dynamically adjusts 
 column widths to fit the data, and this is considered to be a feature.

Right. This is a limitation of cut, rather than anything wrong with ls.
See the awk usage at http://www.gnu.org/software/coreutils/cut

These examples might help:

  ls --full-time | tr -s ' ' | cut -d' ' -f6-
  ls --full-time | awk '{ print substr($0, index($0,$6)) }'
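
To see what the two pipelines do, here is a sketch run on one fixed sample
line rather than live ls output; the user, group and padding are invented:

```shell
line='-rw-r--r--  1 user group     0 2014-11-29 13:07:55.182466680 -0800 now'

# Squeeze runs of spaces so cut sees exactly one delimiter between fields.
via_cut=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f6-)

# Keep the original spacing and print from the first occurrence of field 6.
via_awk=$(printf '%s\n' "$line" | awk '{ print substr($0, index($0,$6)) }')

echo "$via_cut"
echo "$via_awk"
```

Note that the awk form locates field 6 by its text, so it could start too
early if the same string happened to appear in an earlier field.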

 Second, different platforms have different time stamp resolutions.  The idea 
 that
 all dates should use the same width is doomed anyway, since file time stamps 
 can 
 exceed the year 9999:
 
 $ touch -d'10000-01-01 00:00:00' far-in-future
 $ touch now
 $ ls -l --full-time
 -rw-r--r-- 1 eggert eggert 0 10000-01-01 00:00:00.000000000 -0800 
 far-in-future
 -rw-r--r-- 1 eggert eggert 0 2014-11-29 13:07:55.182466680 -0800 now
 
 Arguably this last example *is* a bug in 'ls', as dates should line up even 
 when 
 they're outlandish.  But it's not likely to be a bug one runs into with real 
 files, at least, not for another 7985 years or so.

:)

thanks,
Pádraig.






bug#19184: coreutils-8.23 Compile Error on Solaris 8 - Sun Studio 11

2014-11-27 Thread Pádraig Brady
On 27/11/14 02:35, Paul Eggert wrote:
 Pádraig Brady wrote:
 I did that in the attached.
 
 I also see uses of %z in dd.c, od.c, and split.c

No other uses with fprintf() though.
The od one goes through printf, but that's only debugging.
The error() ones are fine as they go through any printf() replacement.
Hmm, you're right, that dependency is not explicit,
so it's a bit brittle to leave those.
I wonder should the gnulib error module depend on the [v]fprintf-posix modules?

Anyway it's better to be consistent and remove the remaining
few uses of %zu for now, so I've just pushed a patch to do that.
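
The kind of check that can guard against %zu creeping back in is a grep
over the lines following each fprintf call. A rough sketch using plain grep
on a made-up sample file (the -A option is a common grep extension):

```shell
tmp=$(mktemp -d)
cat > "$tmp/demo.c" <<'EOF'
  fprintf (stderr,
           "%s: remove %zu arguments? ",
           program_name, n_files);
EOF

# Look at each fprintf call plus the 4 lines after it for a %zu directive.
if grep -A4 ' fprintf (' "$tmp/demo.c" | grep -q '%zu'; then
  lint_result='Use PRI*MAX instead of %z'
else
  lint_result=ok
fi
rm -rf "$tmp"

echo "$lint_result"
```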

thanks,
Pádraig.





bug#19184: coreutils-8.23 Compile Error on Solaris 8 - Sun Studio 11

2014-11-26 Thread Pádraig Brady
On 26/11/14 19:21, Paul Eggert wrote:
 On 11/26/2014 06:19 AM, Ted Carr wrote:
 I get some errors when running 'make check'.  You have any time to help with 
 those or should I just send in another email?
 Can't hurt to email them to bug-coreutils@gnu.org.  Can't promise anything: 
 Solaris 8 isn't supported any more by Oracle, so it's mostly off our radar.  
 Anyway, I'm marking this bug as done.

Ted emailed me privately by mistake.

There are 2 issues. The first was a weird one with the shell's printf
which the attached patch will hopefully avoid.

The other issue is incorrect prompts output by rm due to its use of %zu
Ted could you give the output of:

  grep gl_cv_func_printf_sizes_c99 config.log

Now coreutils doesn't make use of the fprintf-posix module
to act on the above test.
Jim attempted it for a day in 2007 but reverted due to too many test failures.
We could easily avoid the use of %zu I suppose in coreutils,
as there are only a few uses really.

thanks,
Pádraig.
From a78d85386bf4a55d7ccbd7c03c0075615b3f61d2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= p...@draigbrady.com
Date: Wed, 26 Nov 2014 20:15:15 +0000
Subject: [PATCH] tests: fix portability issue in dd/ascii test

Solaris 8 was seen to issue this error:
printf: `': illegal format character

* test/dd/ascii.sh: Use the coreutils printf in this test
rather than the system one, to avoid portability issues.
---
 tests/dd/ascii.sh | 71 ---
 1 file changed, 36 insertions(+), 35 deletions(-)

diff --git a/tests/dd/ascii.sh b/tests/dd/ascii.sh
index 7dc39cc..98a57a1 100755
--- a/tests/dd/ascii.sh
+++ b/tests/dd/ascii.sh
@@ -21,45 +21,46 @@ print_ver_ dd
 
 {
   # Two lines, EBCDIC " A A" and " A  ", followed by all the bytes in order.
-  printf '\100\301\100\301\100\301\100\100' &&
-  printf $(printf '\\%03o' $(seq 0 255));
+  env printf '\100\301\100\301\100\301\100\100' &&
+  env printf $(env printf '\\%03o' $(seq 0 255));
 } >in || framework_failure_
 
 {
   # The converted lines, with trailing spaces removed.
-  printf ' A A\n A\n' &&
-  printf '\000\001\002\003\n\234\011\206\177\n' &&
-  printf '\227\215\216\013\n\014\015\016\017\n' &&
-  printf '\020\021\022\023\n\235\205\010\207\n' &&
-  printf '\030\031\222\217\n\034\035\036\037\n' &&
-  printf '\200\201\202\203\n\204\012\027\033\n' &&
-  printf '\210\211\212\213\n\214\005\006\007\n' &&
-  printf '\220\221\026\223\n\224\225\226\004\n' &&
-  printf '\230\231\232\233\n\024\025\236\032\n' &&
-  printf '\040\240\241\242\n\243\244\245\246\n' &&
-  printf '\247\250\325\056\n\074\050\053\174\n' &&
-  printf '\046\251\252\253\n\254\255\256\257\n' &&
-  printf '\260\261\041\044\n\052\051\073\176\n' &&
-  printf '\055\057\262\263\n\264\265\266\267\n' &&
-  printf '\270\271\313\054\n\045\137\076\077\n' &&
-  printf '\272\273\274\275\n\276\277\300\301\n' &&
-  printf '\302\140\072\043\n\100\047\075\042\n' &&
-  printf '\303\141\142\143\n\144\145\146\147\n' &&
-  printf '\150\151\304\305\n\306\307\310\311\n' &&
-  printf '\312\152\153\154\n\155\156\157\160\n' &&
-  printf '\161\162\136\314\n\315\316\317\320\n' &&
-  printf '\321\345\163\164\n\165\166\167\170\n' &&
-  printf '\171\172\322\323\n\324\133\326\327\n' &&
-  printf '\330\331\332\333\n\334\335\336\337\n' &&
-  printf '\340\341\342\343\n\344\135\346\347\n' &&
-  printf '\173\101\102\103\n\104\105\106\107\n' &&
-  printf '\110\111\350\351\n\352\353\354\355\n' &&
-  printf '\175\112\113\114\n\115\116\117\120\n' &&
-  printf '\121\122\356\357\n\360\361\362\363\n' &&
-  printf '\134\237\123\124\n\125\126\127\130\n' &&
-  printf '\131\132\364\365\n\366\367\370\371\n' &&
-  printf '\060\061\062\063\n\064\065\066\067\n' &&
-  printf '\070\071\372\373\n\374\375\376\377\n';
+env printf \
+' A A\n A\n'\
+'\000\001\002\003\n\234\011\206\177\n'\
+'\227\215\216\013\n\014\015\016\017\n'\
+'\020\021\022\023\n\235\205\010\207\n'\
+'\030\031\222\217\n\034\035\036\037\n'\
+'\200\201\202\203\n\204\012\027\033\n'\
+'\210\211\212\213\n\214\005\006\007\n'\
+'\220\221\026\223\n\224\225\226\004\n'\
+'\230\231\232\233\n\024\025\236\032\n'\
+'\040\240\241\242\n\243\244\245\246\n'\
+'\247\250\325\056\n\074\050\053\174\n'\
+'\046\251\252\253\n\254\255\256\257\n'\
+'\260\261\041\044\n\052\051\073\176\n'\
+'\055\057\262\263\n\264\265\266\267\n'\
+'\270\271\313\054\n\045\137\076\077\n'\
+'\272\273\274\275\n\276\277\300\301\n'\
+'\302\140\072\043\n\100\047\075\042\n'\
+'\303\141\142\143\n\144\145\146\147\n'\
+'\150\151\304\305\n\306\307\310\311\n'\
+'\312\152\153\154\n\155\156\157\160\n'\
+'\161\162\136\314\n\315\316\317\320\n'\
+'\321\345\163\164\n\165\166\167\170\n'\
+'\171\172\322\323\n\324\133\326\327\n'\
+'\330\331\332\333\n\334\335\336\337\n'\
+'\340\341\342\343\n\344\135\346\347\n'\
+'\173\101\102\103\n\104\105\106\107\n'\
+'\110\111\350\351\n\352\353\354\355\n'\
+'\175\112\113\114\n\115\116\117\120\n'\
+'\121\122\356\357\n\360\361\362\363\n'\
+'\134\237\123\124\n\125\126\127\130\n'\
+'\131\132\364\365\n\366\367\370\371\n'\
+'\060\061\062\063\n\064\065\066\067\n'\

bug#19184: coreutils-8.23 Compile Error on Solaris 8 - Sun Studio 11

2014-11-26 Thread Pádraig Brady
On 26/11/14 20:37, Pádraig Brady wrote:
 The other issue is incorrect prompts output by rm due to its use of %zu
 Ted could you give the output of:
 
   grep gl_cv_func_printf_sizes_c99 config.log
 
 Now coreutils doesn't make use of the fprintf-posix module
 to act on the above test.
 Jim attempted it for a day in 2007 but reverted due to too many test 
 failures.
 We could easily avoid the use of %zu I suppose in coreutils,
 as there are only a few uses really.

I did that in the attached.

thanks,
Pádraig
From 78202c7118378cf1886f46887412dbf49ce3a1cd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= p...@draigbrady.com
Date: Thu, 27 Nov 2014 00:51:00 +0000
Subject: [PATCH] rm: fix prompted number of arguments to remove on some
 platforms

zu was output on solaris 8 for example rather than the number,
since coreutils-8.22.

* cfg.mk: Disallow %zu with fprintf() since we make minimal
use of this function and so don't employ the gnulib replacement.
* src/rm.c (main): Use %PRIuMAX rather than %zu for portability.
* NEWS: Mention the bug fix.
Reported in http://bugs.gnu.org/19184
---
 NEWS |  3 +++
 cfg.mk   |  9 +
 src/rm.c | 10 +-
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/NEWS b/NEWS
index 27847d4..3a62656 100644
--- a/NEWS
+++ b/NEWS
@@ -25,6 +25,9 @@ GNU coreutils NEWS-*- outline -*-
   character at the 4GiB position.
   [the bug dates back to the initial implementation]
 
+  rm indicates the correct number of arguments in its confirmation prompt,
+  on all platforms.  [bug introduced in coreutils-8.22]
+
 ** New features
 
   chroot accepts the new --skip-chdir option to not change the working directory
diff --git a/cfg.mk b/cfg.mk
index 7347322..c6fc0e7 100644
--- a/cfg.mk
+++ b/cfg.mk
@@ -223,6 +223,15 @@ sc_prohibit-j-printf-format:
 	  && { echo '$(ME): Use PRI*MAX instead of %j' 1>&2; exit 1; }  \
 	  || :
 
+# coreutils doesn't use any fprintf gnulib replacement since we
+# make minimal use of fprintf, to output prompt strings mainly.
+# Here we disallow %zu with fprintf() as that's not portable to
+# Solaris 8 for example.
+sc_prohibit-z-fprintf-format:
+	@cd $(srcdir)/src && GIT_PAGER= git grep -A4 ' fprintf (' | grep %zu \
+	  && { echo '$(ME): Use PRI*MAX instead of %z' 1>&2; exit 1; }	 \
+	  || :
+
 # Ensure the alternative __attribute (keyword) form isn't used as
 # that form is not elided where required.  Also ensure that we don't
 # directly use attributes already defined by gnulib.
diff --git a/src/rm.c b/src/rm.c
index f7adf5b..4c8ee6e 100644
--- a/src/rm.c
+++ b/src/rm.c
@@ -332,18 +332,18 @@ main (int argc, char **argv)
quote ("/"));
 }
 
-  size_t n_files = argc - optind;
+  uintmax_t n_files = argc - optind;
   char **file =  argv + optind;
 
   if (prompt_once && (x.recursive || 3 < n_files))
 {
   fprintf (stderr,
(x.recursive
-? ngettext ("%s: remove %zu argument recursively? ",
-"%s: remove %zu arguments recursively? ",
+? ngettext ("%s: remove %"PRIuMAX" argument recursively? ",
+"%s: remove %"PRIuMAX" arguments recursively? ",
 select_plural (n_files))
-: ngettext ("%s: remove %zu argument? ",
-"%s: remove %zu arguments? ",
+: ngettext ("%s: remove %"PRIuMAX" argument? ",
+"%s: remove %"PRIuMAX" arguments? ",
 select_plural (n_files))),
program_name, n_files);
   if (!yesno ())
-- 
2.1.0



bug#19154: [PATCH] Extend file size support in paste.

2014-11-23 Thread Pádraig Brady
On 23/11/14 18:50, Tobias Stoeckmann wrote:
 The function paste_parallel just has to remember if there was a character
 in a line at all, not how many.  Changing size_t to a simple boolean
 statement removes a possible overflow situation with 4 GB files on 32 bit
 systems.
 ---
  src/paste.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/src/paste.c b/src/paste.c
 index 2ca75d0..630a9a6 100644
 --- a/src/paste.c
 +++ b/src/paste.c
 @@ -235,7 +235,7 @@ paste_parallel (size_t nfiles, char **fnamptr)
  {
int chr IF_LINT ( = 0);/* Input character. */
int err IF_LINT ( = 0);/* Input errno value.  */
 -  size_t line_length = 0;/* Number of chars in line. */
 +  bool foundchar = false;/* Found chars in a line. */
  
if (fileptr[i])
  {
 @@ -250,7 +250,7 @@ paste_parallel (size_t nfiles, char **fnamptr)
  
while (chr != EOF)
  {
 -  line_length++;
 +  foundchar = true;
if (chr == '\n')
  break;
xputchar (chr);
 @@ -259,7 +259,7 @@ paste_parallel (size_t nfiles, char **fnamptr)
  }
  }
  
 -  if (line_length == 0)
 +  if (!foundchar)
  {
/* EOF, read error, or closed file.
   If an EOF or error, close the file.  */
 

Nice one. This would result in truncation of the output
iff there was a \n at position 2^32 in the file on a 32 bit system.
I'll apply that in a little while (I can't think of an efficient way to test).

Did you notice this through inspection or compiler warnings wrt integer overflow?

thanks!
Pádraig.
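
For illustration, the wraparound can be simulated in the shell. This is only a sketch: it assumes 64-bit shell arithmetic (as in bash) and masks the counter to 32 bits to stand in for a 32-bit size_t. The variable names follow the patch; the mask is an assumption made for the demo.

```shell
#!/bin/bash
# Stand-in for a 32-bit size_t: keep only the low 32 bits of the counter.
mask=$(( (1 << 32) - 1 ))

line_length=$mask      # counter value after 2^32 - 1 characters were read
foundchar=true         # the flag from the patch: set once, never counts

# One more character arrives (the '\n' that ends the line).
line_length=$(( (line_length + 1) & mask ))

# The old test "line_length == 0" now fires although data was read;
# the flag still records that characters were seen.
echo "line_length=$line_length"
echo "foundchar=$foundchar"
```

The counter ends up at 0, which the old code could not tell apart from "nothing read", while the boolean is unaffected by how many characters went by.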





bug#18499: Possible mv race for hardlinks (rhbz #1141368 )

2014-11-21 Thread Pádraig Brady
On 21/11/14 08:29, Boris Ranto wrote:
 On Fri, 2014-11-21 at 03:30 +, Pádraig Brady wrote:
 We want to leave the logic in place for cp and install though,
 and I've adjusted your patch accordingly. I've also adjusted
 the tests to pass and augmented the tests to cover one of
 the cases missed in the previous patch.  I'll push this tomorrow.

 thanks,
 Pádraig.
 
 Just a note: cp already presented this behaviour before the patch, i.e. 
 
 cp a b
 
 on hard links to the same file failed with 
 
 cp: ‘a’ and ‘b’ are the same file
 
 On the other hand, install does not present it, it copies over b
 creating new inode for b.

Yep, but there were other cases with `cp -a` with hardlinks
to symlinks, and cp --remove-destination a b.

I've pushed that now.

thanks!
Pádraig.
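
A small sketch for reproducing the cp and install behaviour described above, run in a scratch directory. The file names a and b follow the example in the thread; the exact error text and the resulting inode numbers depend on the installed tools, so none are asserted here.

```shell
#!/bin/bash
# Work in a scratch directory so nothing real is touched.
tmp=$(mktemp -d)
echo data > "$tmp/a"
ln "$tmp/a" "$tmp/b"        # a and b are now hard links to the same inode

# cp is expected to refuse copying a file onto another link to itself.
if cp "$tmp/a" "$tmp/b" 2>/dev/null; then
  cp_status=0
else
  cp_status=$?
fi
echo "cp exit status: $cp_status"

# Compare inode numbers before and after install.
ls -i "$tmp/a" "$tmp/b"
install "$tmp/a" "$tmp/b" || echo "install refused the copy"
ls -i "$tmp/a" "$tmp/b"

rm -rf "$tmp"
```

If install behaves as described in the thread, the second listing shows a different inode number for b.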





bug#18827: [Feature request] no CR for yes

2014-11-21 Thread Pádraig Brady
tag 18827 wontfix
close 18827
stop

On 25/10/14 18:59, Pádraig Brady wrote:
 On 10/25/2014 03:53 PM, George Shuklin wrote:
 Yes is very nice to generate large repeating patterns, but it always adds
 \n at the end. It's OK for the string data but sometimes mess with binary.

 Any way to disable it will be really appreciated. F.e. -n key (like for
 'echo'), or any other.
 
 Does this suffice?
 
   yes whatever | tr -d '\n'

Closing this now since existing tools can do it quite efficiently.
Some more examples:

text lines clocking patterns: (pass to tr -d '\n' to remove new lines)

  101010...   yes 1$'\n'0
  11...   yes 1
  111000...   yes 1$'\n'0 | sed 'p;p'
  123456...   seq inf


binary pattern generation:

  010101...   tr '\0' 'U' < /dev/zero


cheers,
Pádraig.
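
The patterns above run forever, so a bounded check can help when trying them out. The following sketch cuts each stream with head; the variable names p1 to p3 are only for this illustration, and `$'\n'` assumes a shell such as bash that supports that quoting.

```shell
#!/bin/bash
# First six lines of the alternating pattern, newlines removed.
p1=$(yes 1$'\n'0 | head -n 6 | tr -d '\n')

# Each line tripled by sed (two explicit prints plus the automatic one).
p2=$(yes 1$'\n'0 | sed 'p;p' | head -n 6 | tr -d '\n')

# Eight bytes of 'U' (0x55, bit pattern 01010101) from /dev/zero.
p3=$(tr '\0' 'U' < /dev/zero | head -c 8)

echo "$p1"
echo "$p2"
echo "$p3"
```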





bug#18499: Possible mv race for hardlinks (rhbz #1141368 )

2014-11-21 Thread Pádraig Brady
On 21/11/14 11:53, Pádraig Brady wrote:
 On 21/11/14 08:29, Boris Ranto wrote:
 On Fri, 2014-11-21 at 03:30 +, Pádraig Brady wrote:
 We want to leave the logic in place for cp and install though,
 and I've adjusted your patch accordingly. I've also adjusted
 the tests to pass and augmented the tests to cover one of
 the cases missed in the previous patch.  I'll push this tomorrow.

 thanks,
 Pádraig.

 Just a note: cp already presented this behaviour before the patch, i.e. 

 cp a b

 on hard links to the same file failed with 

 cp: ‘a’ and ‘b’ are the same file

 On the other hand, install does not present it, it copies over b
 creating new inode for b.
 
 Yep, but there were other cases with `cp -a` with hardlinks
 to symlinks, and cp --remove-destination a b.
 
 I've pushed that now.

For reference I've made the kernel renameat() suggestion at:
http://marc.info/?l=linux-api&m=141658005205610&w=2

Pádraig.





bug#19148: ls --inode --sort=inode

2014-11-21 Thread Pádraig Brady
On 21/11/14 23:57, 積丹尼 Dan Jacobson wrote:
 $ man ls
--sort=WORD
   sort  by  WORD instead of name: none (-U), size (-S), time (-t),
   version (-v), extension (-X)
 
 Perhaps add new functionality: inode (-i)

Yes maybe, especially when combined with -R.
Do you have a specific use case to help decide on applicability?
Note `find | sort` seems more suited for low level access like this.

thanks,
Pádraig.
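
One possible shape for the `find | sort` approach, as a rough sketch. The function name by_inode is made up for this example; it relies only on `ls -di` printing the inode number first, one entry per line when the output is a pipe.

```shell
#!/bin/bash
# List everything under a directory (default: current directory),
# ordered by inode number. by_inode is a hypothetical helper name.
by_inode() {
  # ls -d keeps directories from being expanded; -i prefixes the inode.
  find "${1:-.}" -exec ls -di {} + | sort -n
}
```

Calling `by_inode /some/dir` prints one "inode path" pair per line, lowest inode first; the walk is recursive, which covers the -R case mentioned above.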





