make check (8.1) hangs forever
The last message I get from make check is: PASS: misc/help-version Then it sleeps for a long time. During this time, the inotify-race test is running. There's a timeout 10s gdb ... process that's been running for a lot longer than 10 seconds. After ^C stops the make, the timeout, gdb, and a couple of tail processes are lingering and have to be killed manually. So far I've looked at it with strace, which revealed that timeout has sent a SIGTERM to gdb, but gdb has SIGTERM blocked so nothing happens. This part of the script is doing what it was expected to do: tail --pid=$pid -f tail.out | (read; kill $pid) The kill $pid is executed, but $pid is the timeout process, which responds to the SIGTERM by passing along a SIGTERM+SIGCONT to gdb, and then waiting for gdb to die, which never happens. Therefore $pid doesn't die, the tail --pid never dies, and the script makes no further progress. It seems this entire script depends, in both the fail and pass cases, on the ability to end a gdb process by sending a SIGTERM, which doesn't actually work. My gdb is the one from Debian's 6.8-3 package. If necessary, I'll dig into why it's refusing to deal with SIGTERM. -- Hoping it won't be necessary, Alan Curry
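The failure mode can be sketched without gdb at all: any child that ignores SIGTERM defeats timeout's default signal the same way gdb's blocked SIGTERM does. A minimal reproduction (the 124/137 exit statuses are GNU timeout's documented behavior):

```shell
# A child that ignores SIGTERM outlives timeout's default signal;
# timeout sends TERM at the deadline, then waits for a child that
# won't die until its sleep finishes on its own.
status_term=0
timeout 1 sh -c 'trap "" TERM; sleep 3' || status_term=$?
echo "plain TERM: exit $status_term"    # 124: deadline hit, child lingered

# SIGKILL cannot be blocked or ignored, so the child dies on time:
status_kill=0
timeout -s KILL 1 sh -c 'trap "" TERM; sleep 3' || status_kill=$?
echo "with -s KILL: exit $status_kill"  # 137 = 128 + SIGKILL
```

Here the ignoring child is a `trap "" TERM` shell rather than gdb, but the hang mechanism is the same.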
Re: stable coreutils-8.1 today, fingers crossed
Jim Meyering writes: Gilles Espinasse wrote: ... [chroot-i486] root:/$ umask 0022 [chroot-i486] root:/$ rm -rf /usr/src/coreutils* [chroot-i486] root:/$ cd /usr/src [chroot-i486] root:/usr/src$ tar xf cache/coreutils-8.1.tar.gz [chroot-i486] root:/usr/src$ ls -ld /usr /usr/src /usr/src/coreutils-8.1 ... drwxrwxrwx 13 root root 4096 Nov 18 18:55 /usr/src/coreutils-8.1 ... don't know why. Just the side effect of using tar as root; --no-same-permissions lets umask be applied. Thanks for explaining. That's another good reason to do less as root. So was the drwxrwxrwx in the tarball put there to teach a lesson to those who trust a tarball to have sane permissions? Or is it a bug? -- Alan Curry
Re: make check (8.1) hangs forever
Jim Meyering writes: Pádraig Brady wrote: Alan Curry wrote: SIGTERM to gdb, but gdb has SIGTERM blocked so nothing happens. thanks for investigating. Perhaps we need to use `timeout -sKILL ...` Sounds good to me. I added that and re-ran make check. It worked but gdb's child process (tail -f file) is still lingering afterward until I kill it manually. Why has nobody else noticed this? Are other versions of gdb less stubborn? Maybe I did something to make it stubborn, but I don't know what that could be. In case you're keeping score: Debian 5.0r3, ppc32 All 366 tests passed (45 tests were not run) make[4]: Leaving directory `/tmp/coreutils-8.1/tests' All 177 tests passed (14 tests were not run) make[6]: Leaving directory `/tmp/coreutils-8.1/gnulib-tests' -- Alan Curry
Re: errors on date
Pádraig Brady writes: David Gonzalez Marquez wrote: Hi! I am a student of computer science at the University of Buenos Aires. I am using the date command for calculating days. I need to calculate the following day for any day. Doing that I see an error. I use for example: date --date '1920-05-02 1 days' +%F and for the following days, I see an error: date: invalid date `1920-05-01 1 days' Seems to work fine here with coreutils 7.2 and 8.1. What version of date are you using? The dates in question are timezone transitions (mostly daylight savings time, but the first one was a transition from old-fashioned local time to a modern time zone). date is filling in 00:00:00 since no specific time was specified, and trying to find the time_t corresponding to 1920-05-01 00:00:00, and it fails because that time never existed in Buenos Aires. It never gets as far as trying to add the 1 days. In Argentina, the jump forward happens at 00:00:00 (according to my reading of the tzdata file), so if you use 12:00:00 you should be safe for any day. $ TZ=America/Buenos_Aires date --date '1920-05-01 12:00:00 1 days' +%F 1920-05-02 -- Alan Curry
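The same class of failure can be reproduced with any spring-forward gap, assuming the relevant tzdata zone is installed. Using a US DST transition (02:00 jumped to 03:00 on 2024-03-10 in New York):

```shell
# 02:30 never existed that night, so midnight-relative arithmetic on a
# gap time fails just like the 1920 Buenos Aires case:
TZ=America/New_York date -d '2024-03-10 02:30' +%F 2>/dev/null \
  || echo "invalid date: the clock jumped over 02:30"

# Anchoring at noon, as suggested above, sidesteps every such gap:
TZ=America/New_York date -d '2024-03-10 12:00 1 day' +%F   # 2024-03-11
```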
Re: make check (8.1) hangs forever
Pádraig Brady writes: Note my gdb --version is 6.8.50.20090302-39.fc11 So there's probably a bug in my 6.8-debian that is fixed in that version. To follow up on that theory, I compiled gdb 7.0 and tried again. It passes. I believe the difference is this change from 2008-07-10 (after the 6.8 release, but before your 20090302 snapshot): (linux_nat_kill): Stop lwps before killing them. The ptrace man page says For requests other than PTRACE_KILL, the child process must be stopped. When I strace gdb 6.8, I see it using PTRACE_KILL yet the child process doesn't die. gdb 7.0 does the same thing but inserts a tkill(SIGSTOP) first and it works. Maybe a kernel change slipped past the man page maintainer. Possible solutions: timeout -sKILL (as seen already, leaves a lingering tail process); skip the test if gdb version < 7.0 (except known-patched distro versions?); teach timeout to kill a whole process tree instead of just a pgrp. Over the course of various experiments, I found another flaw in this test: tail_forever_inotify can be inlined, and then gdb can't break on it. -- Alan Curry
Re: git coreutils 'make check' hangs
Ralf Wildenhues writes: Hi Pádraig, * Pádraig Brady wrote on Tue, Dec 08, 2009 at 02:00:20AM CET: Ralf Wildenhues wrote: for some time now, 'make check' in the git coreutils tree hangs for me: [...] I think it may be a gdb issue: http://lists.gnu.org/archive/html/bug-coreutils/2009-12/msg00025.htm For the moment we've marked that test as very expensive so that it will not be run by default. Ah, ok, I didn't see that thread. Sorry about the noise then. If yours is caused by the same thing (gdb failing to kill its child process and exit) could you tell me your uname -a? I'd like to be sure it's not just an arch-dependent kernel bug before reporting it as a man page bug. Something still needs to get fixed, even if coreutils no longer cares about it. -- Alan Curry
Re: btwowc(EOF) hang with gcc 4.4.2
Jim Meyering writes: The code in question is calling btowc(EOF), which uses this definition from wchar.h:

extern wint_t __btowc_alias (int __c) __asm ("btowc");
extern __inline wint_t
__NTH (btowc (int __c))
{ return (__builtin_constant_p (__c) && __c >= '\0' && __c <= '\x7f'
	  ? (wint_t) __c : __btowc_alias (__c)); }

Since I don't even see any code that might loop there, (though I didn't look up what __asm ("btowc") does) I'd suspect a compiler problem. That asm seems to be one of these: http://gcc.gnu.org/onlinedocs/gcc-4.4.2/gcc/Asm-Labels.html It's saying that the function __btowc_alias is actually to be named btowc in the generated assembly code. This enables the inline function named btowc to call an external function also named btowc, by going through that alias. Tricky of them! Running gdb on conftest gets a backtrace like this:

#0  0x080491d1 in btowc (__c=-1) at /usr/include/wchar.h:331
#1  0x08049431 in btowc () at /usr/include/wchar.h:332
#2  main () at conftest.c:276

Where that line 331 is:

331	{ return (__builtin_constant_p (__c) && __c >= '\0' && __c <= '\x7f'

This does not happen with other versions of gcc. I can't tell if it's a gcc bug or a system include file bug or something actually in coreutils, but here's the report anyway. It was pretty unsettling to have configure hang. To decide whether a compiler bug is the answer, make a copy of the conftest.c during the hang, run gcc -c -save-temps on it, and publish the resulting .i and .s files for inspection. The conftest programs are already pretty minimal so it should be easy to determine whether the assembly code correctly corresponds to the preprocessed C code.
Re: btwowc(EOF) hang with gcc 4.4.2
Karl Berry writes: Hi Alan, run gcc -c -save-temps on it, and publish the resulting .i and .s files for inspection. The conftest programs are already pretty minimal so it should be easy to determine whether the assembly code correctly corresponds to the preprocessed C code. I'm afraid my x86 assembler knowledge is near-nil, so it's not easy for me :). I did mean easy for the collective mind of the mailing list. Before I try to make this an official gcc bug report, maybe you could take a look? The attached tar file has the .i and the .s both for -O (which gets a seg fault) and -O2 (which hangs). With no -O option at all, the binary exits (successfully). The presence or absence of -g makes no difference. I threw in the .c and .h for the heck of it. It's definitely a compiler problem. That extern inline asm alias trickery failed to work. (Much effort there to optimize a function that according to its own man page should never be used) It ended up as Andreas Schwab suggested: an infinite tail recursion. -O1 segfaults eventually because the recursion grows the stack to infinite size. -O2 optimized the recursion into a jump, eliminating the stack growth. In the future, .s files are usually better without -g (as long as you're not looking for a bug in the part of the compiler that generates the debugging info). The assembler directives that produce the debugging symbols add a lot of clutter. -- Alan Curry
Re: btwowc(EOF) hang with gcc 4.4.2
Karl Berry writes: It's definitely a compiler problem. That extern inline asm alias trickery The gcc people say that the behavior is correct; not a bug. (I don't understand all of their replies, but the conclusion seems clear.) http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42440 OK, I understood the replies so I'll try to sum up: The oxymoron extern inline used to have one interpretation before the inline keyword was standardized in C99. In that standardization, a different interpretation for extern inline was mandated. The inline/alias trick used by glibc here needs the old interpretation, which should be requested with the gnu_inline attribute. Your version of glibc doesn't specify gnu_inline. So the problem boils down to: your gcc is too new for your glibc. Downgrade one or upgrade the other.
Re: rm - bug or user error?
Michael Webb writes: I am within a directory containing directories dir1 and dir2 and *no* files starting with f. shell rm -rf dir1 dir2 f* rm: No match. [...] I suspect the No match is coming from the command line parsing and not rm itself. However, the message starts with rm. That's just how [t]csh reports non-matching globs: prefixed with the name of the command that wasn't run because of the error. This might help you figure out which line in a script had failed if you had multiple commands with globs. Since rm wasn't ever actually run, it had no influence on the format of the error message. -- Alan Curry
Re: rm - bug or user error?
Jon Stanley writes: Yeah, like Eric said, I think that this is a csh problem rather than a coreutils problem. I would even think that csh is behaving wrongly here - rather than refusing to run rm because the glob didn't match, it should pass the f* straight through to rm to deal with as it pleases, unless you explicitly told the shell to fail (as Eric did in his example). I don't have any standards to back that up though, Eric is the POSIX-citing guy around here :) Any standards for that Eric? csh is not Bourne shell. This is one of the things that csh got right, which Bourne shell had wrong[1]. Sadly Bourne's behavior got the blessing of POSIX and csh didn't. But csh isn't being wrong here, just being csh. [1] Consider, in a Bourne/POSIX style shell, how different the two possible behaviors are for: grep a.*b foo depending on whether the word a.*b matches as a glob or not. People write junk code like this, it works because of the old Bourne shell misfeature of passing non-matching globs straight through, and then much later it mysteriously breaks because a file called a.b has been created. We'd be much better off if non-matching globs had always been treated as errors. -- Alan Curry
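The two behaviors can be compared directly; csh errors out unconditionally, but bash's failglob option (a bash-specific shopt) imitates it, which makes for an easy side-by-side sketch:

```shell
# In an empty directory, f* matches nothing:
cd "$(mktemp -d)"
echo f*     # Bourne behavior: the literal string "f*" is passed along

# csh-style refusal, imitated with bash's failglob option:
bash -c 'shopt -s failglob; echo f*' 2>&1 \
  || echo "with failglob the command is never run at all"
```

The second case is exactly what the original poster saw from csh: rm never executed, and the shell printed the No match diagnostic itself.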
Re: Stty bug?
Tom writes: I'm using Ubuntu 9.10 (32-bit). I'm trying to use an ASR-33 Teletype (uppercase only) Are you trying to write the best message this mailing list has ever seen? Because so far, it is. on ttyS0. I have -U specified in getty for uppercase conversions and stty -F /dev/ttyS0 shows this: speed 110 baud; line = 0; -brkint ixoff iuclc -imaxbel olcuc -iexten xcase As you can see, iuclc is set but no commands work I believe -iexten may be the problem. In your tty1 test, was iexten enabled? Maybe a getty bug, it should be turning on iexten if it turns on iuclc.
Re: Ubuntu stty
Bob Proulx writes: Tom Lake wrote: I have an ASR-33 Teletype on ttyS0 which can only output uppercase characters that I'm trying to use as a serial console. I like it. I already answered this the first time it was sent. The archive has my message: http://lists.gnu.org/archive/html/bug-coreutils/2010-02/msg00020.html It didn't get through to Tom directly, though. His part of the Internet doesn't accept mail from my part. Someone in the middle (Bob?) could maybe tell him that he's missing out, by not being subscribed to the list and not checking the archive. -- Alan Curry
Re: Stty bug?
Tom Lake writes: I believe -iexten may be the problem. In your tty1 test, was iexten enabled? Maybe a getty bug, it should be turning on iexten if it turns on iuclc. I was able to recreate the original problem. Reading Alan's response here I was surprised to see that adding iexten to iuclc did enable the desired behavior. This would not have been required on traditional systems and its combination here isn't obvious to me. There doesn't seem to be very much documentation available about iexten. I am curious how you deduced that adding it would produce the desired behavior? Sorry to report that it didn't work for me. I tried both iexten and -iexten for good measure. It seems like whenever getty is respawned, it changes /dev/ttyS0 to whatever it wants no matter what stty does. I still couldn't

To do a successful test without patching getty, you'd have to do the stty from another terminal after entering the username but before the password. Or make a test account with a password that doesn't have any letters in it. Either way, even with the buggy getty, you should at least get an all-uppercase PASSWORD: prompt after typing an uppercase username. Does the password prompt look OK? (What would it look like if the computer was sending it lowercase ASCII codes? Garbage, I assume, so a readable password prompt is a sign things are working up to that point.) By the way, suggested getty diff:

--- agetty.c.orig	2010-02-12 18:12:46.0 -0500
+++ agetty.c	2010-02-12 18:13:38.0 -0500
@@ -1138,7 +1138,7 @@
 	/* General terminal-independent stuff. */
 	tp->c_iflag |= IXON | IXOFF;		/* 2-way flow control */
-	tp->c_lflag |= ICANON | ISIG | ECHO | ECHOE | ECHOK | ECHOKE;
+	tp->c_lflag |= ICANON | ISIG | ECHO | ECHOE | ECHOK | ECHOKE | IEXTEN;
 	/* no longer  | ECHOCTL | ECHOPRT */
 	tp->c_oflag |= OPOST;		/* tp->c_cflag = 0; */

-- Alan Curry
Re: chmod directory mode extention
seaking1 writes: Hello, I would like to suggest and offer the code to extend chmod in a small way. My extension merely allows for a different mode to be applied to directories than the one applied to all other files. I have looked for this utility but never found it, and being such a useful addition I thought it may be possible to add it to the standard release. What it does: Add an option -d|--dirmode to chmod that will give all directories in the files chmod is told to change the mode specified by the -d argument instead of the other mode. Reason: I have found this to be a necessity. Say for example I have a directory structure filled with data files of some sort and they are of assorted permissions; if I chmod -R 664 foo/ or something to that effect it will of course give all the permissions 664 including directories, hence making them inaccessible. With this I could run chmod -R -d 775 644 foo/ and give the directories the permissions 775. chmod -R ug=rwX,o=rX is pretty close to what you're asking for. The only difference is that the X also adds x permission to files that already have at least one x bit. Your suggestion is more generalized, so not necessarily a bad idea. I just mention this because lots of people overlook the +X option. -- Alan Curry
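The difference between the two approaches can be seen on a throwaway tree (names here are hypothetical). The X perm gives directories and already-executable files the x bit; the fully general per-type form the poster asks for is already expressible with find:

```shell
# Build a tiny tree: one directory, one plain file.
cd "$(mktemp -d)"
mkdir -p foo/sub && touch foo/data

# The X trick: directories become 775, non-executable files 664.
chmod -R ug=rwX,o=rX foo
stat -c '%a %n' foo/sub foo/data

# The general per-type pass (distinct modes for dirs and non-dirs):
find foo -type d   -exec chmod 775 {} +
find foo ! -type d -exec chmod 644 {} +
stat -c '%a %n' foo/sub foo/data
```

Because `=` assigns bits absolutely, the result is independent of the umask.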
bug#5970: regex won't do lazy matching
a g writes: This may be a usage problem, but it does not exist with other regex packages (such as slre) and I can't find anything in the documentation to indicate that the syntax should be different for coreutils. I am using coreutils 8.4 on ubuntu AMD64, version 9.10. I cannot get the coreutils regex matcher to do lazy matching. Here is my code: By lazy do you mean non-greedy? Here is the problem. If you execute: regex_test a[^x]*?a a1a2a The non-greedy quantifiers like *? are not part of standard regex, they are extensions found in perl, and in other packages inspired by perl. -- Alan Curry
bug#5958: Sort-8.4 bug
A full investigation has revealed: This bug was introduced between coreutils 7.1 and 7.2, here:

commit 224a69b56b716f57e3a018af5a9b9379f32da3fc
Author: Pádraig Brady <p...@draigbrady.com>
Date: Tue Feb 24 08:37:18 2009 +0000

    sort: Fix two bugs with determining the end of field
    * src/sort.c: When no specific number of chars to skip is specified
    for the end field, always skip the whole field. Also never include
    leading spaces from next field.
    * tests/misc/sort: Add 2 new tests for these cases.
    * NEWS: Mention this bug fix.
    * THANKS: Add bug reporter.
    Reported by Davide Canova.

In the diff of that commit, an eword++ was removed from the case 'k' section of option parsing, where it did not affect traditional options, and added to the limfield() function, where it takes effect regardless of how fields were specified. So it fixed a -k option parsing bug and added a traditional option parsing bug. And on the way, it removed a comment describing the correct correspondence between the two! The following patch moves the eword++ back to its old location (under the case 'k') but keeps the new test for when it should be applied (echar==0, whether by explicit .0 on the field end specifier or by omission of the field end specifier). This allows the -k bug that was fixed to stay fixed, while undoing the damage to the traditional options. With this patch applied, all the sort tests in make check still pass, including the tests added in the above commit, which I take as a sign that I got it right. And the traditional options are back to working again. I'd suggest the following new test case: printf 'a b c\na c b\n' | sort +0 -1 +2 should output 'a c b\na b c\n' I'd put that in the diff too, but the organization of tests/misc/sort is baffling.
--- coreutils-8.4.orig/src/sort.c	2010-04-20 02:45:35.0 -0500
+++ coreutils-8.4/src/sort.c	2010-04-20 03:12:57.0 -0500
@@ -1460,9 +1460,6 @@
   char *ptr = line->text, *lim = ptr + line->length - 1;
   size_t eword = key->eword, echar = key->echar;
 
-  if (echar == 0)
-    eword++;	/* Skip all of end field. */
-
   /* Move PTR past EWORD fields or to one past the last byte on LINE,
      whichever comes first. If there are more than EWORD fields, leave
      PTR pointing at the beginning of the field having zero-based index,
@@ -3424,6 +3421,8 @@
                   s = parse_field_count (s + 1, &key->echar,
                                          N_("invalid number after `.'"));
                 }
+              if (key->echar == 0)
+                key->eword++;	/* Skip all of end field. */
               s = set_ordering (s, key, bl_end);
             }
           if (*s)

OK now let's not say I haven't done any legwork. -- Alan Curry
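The proposed test case can also be written with the equivalent -k keys (+0 -1 corresponds to -k1,1, and +2 to -k3), which work whether or not the obsolete syntax is enabled on a given system:

```shell
# Field 1 ties on "a", so the comparison falls through to the key
# starting at field 3: "b" sorts before "c", so "a c b" comes first.
printf 'a b c\na c b\n' | sort -k1,1 -k3
```

Expected output: `a c b` then `a b c`, the same ordering the traditional +0 -1 +2 form should produce.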
bug#6007: en_US sorting is completely stupid.
Bob Proulx writes: You don't like it and I don't like it but the-powers-that-be have Who's the power here anyway? Who do we have to impeach? Seriously. The en_US locale is an unmitigated disaster. It's officially called not a bug every time it comes up, which seems to be once a week on this list alone, so what volume of complaints is required to tip the balance to all right it's a damn bug let's fix it? From the name en_US one might guess that it represents the behavior expected by English-speaking users in or from the US. But those users have lived with computers for a generation or two. What they expect is ASCIIbetical. The only people who actually expect phone-book-style sorting are old geezers who remember what a phone book was. Most of them have never used a computer and never will, so why do we (and by we I mean whoever makes the locale rules) bend the default to accommodate them? -- Alan Curry
bug#6007: en_US sorting is completely stupid.
Andreas Schwab writes: Alan Curry pacman...@kosh.dhis.org writes: Who's the power here anyway? You are, actually. Everyone can define locales to behave the way he likes, see localedef(1). I avoid this by not having any locales installed. But that doesn't help all the other victims. From the name en_US one might guess that it represents the behavior expected by English-speaking users in or from the US. But those users have lived with computers for a generation or two. What they expect is ASCIIbetical. Nowadays most people don't know what ASCII is. They may not know how to name it, but they do complain when it isn't used, enough that it's a FAQ. People install a GNU/Linux distribution, pick English from the language menu, and get a set of sorting rules that doesn't make sense. Sorry, should have told the installer you speak C. Donna Summer just doesn't belong between Don Adams and Don Pardo, and everyone knows it. Not a bug? Bah. Not a coreutils bug, but it's a bug. If glibc was in the same bug tracking system with coreutils, reports like this one could be reassigned there. -- Alan Curry
bug#5926: feature request: mv -p to create missing target dir
Bob Proulx writes: As a side comment I don't see the point of: $(which mv) "$@" I can guess the point:

bash$ alias mv='mv -i'
bash$ touch a b
bash$ mv a b
mv: overwrite `b'? ^C
bash$ $(which mv) a b
bash$ ls -l a b
ls: cannot access a: No such file or directory
-rw------- 1 pacman users 0 Apr 24 17:55 b

Silly aliases. -- Alan Curry
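The portable way to bypass a wrapper is `command`, which POSIX defines to suppress function (and alias) lookup; `$(which mv)` only works by accident of pathname lookup. Since aliases don't expand in non-interactive scripts, a function stands in for the alias in this sketch:

```shell
# A function wrapper stands in for an intrusive alias like mv='mv -i':
pwd() { echo wrapped; }
pwd            # runs the wrapper: prints "wrapped"
command pwd    # bypasses it: the real pwd runs
unset -f pwd
```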
bug#6056: base32 output for md5sum sha1sum etc.
In the dark ages before the bug tracker (i.e. November), a message was sent: http://lists.gnu.org/archive/html/bug-coreutils/2009-11/msg00206.html providing an RFC4648 base32 output option for the cryptographic hash utilities. I'm sending this now to 1. endorse the idea 2. get it a bug number so it might be noticed
bug#6056: base32 output for md5sum sha1sum etc.
Jim Meyering writes: tags 6056 + moreinfo I've given all the moreinfo I could. I thought It's a standards-track RFC plus seen in the wild would have been enough. And the applications where it's relevant (Gnutella, Bitzi) are pretty well-known. md5sum ... | perl -anle 'use Convert::Base32;\ $h32 = uc(encode_base32(pack("H40", $F[0]))); print "$h32 $F[1]";' This is a strong argument not to encumber the tool with a new option. Since it's broken, I saw it as an argument the other way. The 16-to-32 converter is just complex enough that attempts to do it freehand will not quite be right. But seeing a GNU maintainer argue against a new option based on the bloat/benefit ratio is a pleasant surprise. From the people who gave us sed -i (that's just a redirect and a mv) and grep -r (why learn to use find when you can just add recursion to every tool). I don't want to fight this trend too hard. -- Alan Curry
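For what it's worth, later coreutils releases make the conversion possible without Perl at all: base32 appeared in 8.25 and basenc (which can decode base16) in 8.31. Assuming those tools are available, the requested output can be assembled like this:

```shell
# Hex digest -> raw bytes -> RFC 4648 base32:
hex=$(printf hello | md5sum | cut -d' ' -f1)
b32=$(printf %s "$hex" | tr a-f A-F | basenc --base16 -d | base32)
echo "$b32"   # base32 of the raw digest bytes
```

The tr step is needed because basenc's base16 decoder expects uppercase hex.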
bug#6104: [Expert] Bug in mv?
Note: I saw this on bug-coreutils, haven't read the whole thread. Gene Heskett writes: On Tuesday 04 May 2010, João Victor Martins wrote: On Tue, May 04, 2010 at 10:36:19PM -0400, Gene Heskett wrote: I tried to mv amanda* /home/amanda/* as root, which I recall I have done successfully several times before. The shell expands * _before_ passing the args to mv. So mv saw all files starting with 'amanda' and all files (besides . hidden ones) in /home/amanda/ as args. It then picked the last one listed (probably /home/amanda/tmp/) as destination. I had two files whose names started with amanda in that directory. I would have assumed it would expand the src pattern of amanda* to match only those It's not the first * that's the problem. The second one (/home/amanda/*) expands to a list of everything that was in /home/amanda (except dotfiles) and that happens before mv is executed. There are several possibilities of what that command can do:

1. /home/amanda contained no files before the move. In that case the /home/amanda/* is passed through literally as the final argument to mv, so mv sees 3 arguments (your 2 files, then /home/amanda/* which doesn't exist) and it fails, because with more than 2 arguments, the last argument must be an existing directory.

2. /home/amanda contained some stuff, and the last item in the expanded list (alphabetically sorted) was not a directory. Same result as #1.

3. /home/amanda contained some stuff, and the last item in the expanded list happened to be a directory (say you have a directory called /home/amanda/tmp): then the list expands, the final argument to mv is an existing directory, so you have success! Your 2 files, plus everything in /home/amanda, get moved into the directory. If this isn't what you meant, you did something wrong. mv just did what it was told.

4. Like #1, but with a nomatch shell option enabled, you get a No match error message.
Your career as a unix wizard isn't complete until you've done something like #3 *on purpose*. -- Alan Curry
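Case 3 can be reproduced harmlessly by putting echo in front of the command, so the expansion is visible and nothing actually moves (file names here are hypothetical):

```shell
# Recreate the layout, then show what mv would actually have received:
cd "$(mktemp -d)"
touch amanda1 amanda2
mkdir -p home/amanda/tmp
echo mv amanda* home/amanda/*
# prints: mv amanda1 amanda2 home/amanda/tmp
# The directory is the last expanded word, so it becomes the target.
```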
bug#6897: date -d '1991-04-14 +1 day' fails
Bob Proulx writes: date -d '1991-04-14 12:00 +1 day' I'm from china by the way, and the time zone I am in and to which the systems were set is GMT+8 (or CST, China Standard Time). Indeed,

TZ=Asia/Shanghai date -d '4/14/1991'
date: invalid date `4/14/1991'
TZ=Asia/Shanghai date -d '4/14/1991 01:00:00'
Sun Apr 14 01:00:00 CDT 1991
TZ=Asia/Shanghai date -d '1/1/1970 GMT + 671558399 sec'
Sat Apr 13 23:59:59 CST 1991
TZ=Asia/Shanghai date -d '1/1/1970 GMT + 671558400 sec'
Sun Apr 14 01:00:00 CDT 1991

According to tzdata, China had DST from 1986 to 1991. This comment in the source file indicates some doubt about correctness:

# From Paul Eggert (2006-03-22):
# Shanks & Pottenger write that China (except for Hong Kong and Macau)
# has had a single time zone since 1980 May 1, observing summer DST
# from 1986 through 1991; this contradicts Devine's
# note about Time magazine, though apparently _something_ happened in 1986.
# Go with Shanks & Pottenger for now. I made up names for the other
# pre-1980 time zones.

Maybe someone who can read Chinese could clear it up by finding the original policy declarations... Please review the FAQ for date. http://www.gnu.org/software/coreutils/faq/#The-date-command-is-not-working-right_002e There might be fewer occurrences of this misunderstanding if we could teach date that -d 4/14/1991 is not actually a request for 4/14/1991 00:00:00, but any time that existed during the day 4/14/1991, or perhaps a more specific the first second of 4/14/1991. Has that been considered and rejected already, or is it just waiting for someone to implement it?
bug#6897: date -d '1991-04-14 +1 day' fails
Paul Eggert writes: On 08/22/10 18:09, Alan Curry wrote: There might be fewer occurrences of this misunderstanding if we could teach date that -d 4/14/1991 is not actually a request for 4/14/1991 00:00:00, but any time that existed during the day 4/14/1991, or perhaps a more specific the first second of 4/14/1991. Has that been considered and rejected already, or is it just waiting for someone to implement it? As far as I know nobody has ever suggested that, and it is a reasonable suggestion. However, it would not fix the problem in general, since in some cases there is no first second of date X, even when X is valid. For example: $ TZ=Pacific/Kwajalein date -d 1993-08-20 date: invalid date `1993-08-20' There's nothing wrong with that error message. It's telling the truth about 1993-08-20 being an invalid date. But TZ=Asia/Shanghai date -d '4/14/1991' says: date: invalid date `4/14/1991' which is a lie. 4/14/1991 is not an invalid date. It made a bad assumption (that midnight was intended, when the user didn't ask for midnight at all) and then reported an error caused by the bad assumption, and didn't even have the courtesy to mention the assumption. Bonus thought: the date command is misnamed. If it actually worked with dates, it wouldn't need to attach an hour, minute, and second to everything. It would understand 4/14/1991 as representing an entire day, and + 1 day added to it would represent the entire next day. But date doesn't work with dates, it works with time_t's. This is not obvious to the casual user. -- Alan Curry
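The "first second of the day" idea can be sketched in shell with a probing loop: try midnight, then step forward until a time is accepted. The helper name is hypothetical, and hourly steps keep it short (a real implementation would probe more finely, and would still come up empty for days skipped entirely, like the Kwajalein example):

```shell
# Probe forward from midnight until date accepts a local time that day.
first_valid() {    # usage: first_valid YYYY-MM-DD
  for h in 00 01 02 03; do
    t=$(date -d "$1 $h:00" '+%F %T' 2>/dev/null) && { echo "$t"; return 0; }
  done
  return 1       # the whole probed range is inside a gap (or skipped day)
}

TZ=UTC0 first_valid 1991-04-14    # UTC has no gaps: 1991-04-14 00:00:00
```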
bug#6949: Uniq command should allow total to be displayed
Miles Duke writes: 'uniq --count' is an excellent tool for summarizing data, but it is missing one useful feature - an overall total. This might be a good idea... It's embarrassing to have to go to excel to bring the totals together. ...but you can't think of any other tool that can add up a bunch of numbers?! You're using dynamite to kill a mosquito. There must be a dozen basic utilities that can do arithmetic. Like awk: ... | uniq -c | awk '{t+=$1}END{print t, "total"}1' -- Alan Curry
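An equivalent expanded form of that awk one-liner, run on sample data (the trailing `1` pattern in the compact version is what prints each input line; `print` does the same job here):

```shell
# Pass each counted line through, then print the grand total at the end:
printf 'a\na\nb\n' | uniq -c \
  | awk '{ t += $1; print } END { print t, "total" }'
# last line of output: 3 total
```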
bug#7182: sort -R slow
Ole Tange writes: I recently needed to randomize some lines. So I tried using 'sort -R'. I was astonished how slow that was. So I tested how slow competing strategies are. GNU sort is two magnitudes slower than unsort and more than one magnitude slower than perl: Never heard of unsort. Why didn't you try shuf(1)? Also, your perl is not valid: $ time perl -e 'print sort { rand() <=> rand() } <>' file real 0m6.621s That comparison function is not consistent (unless very lucky). I would expect sort -R to be faster than sort and faster than perl if not as fast as unsort. How big is your test file? I expect sort(1) to be optimized for big jobs. I bet it would win the contest if you are shuffling a file that's bigger than available RAM.
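shuf is the purpose-built shuffler mentioned above; a quick sanity check shows it emits a true permutation (the same lines, reordered), which the inconsistent rand() comparator is not guaranteed to do:

```shell
# Shuffle, then sort back: the result must match the original input.
cd "$(mktemp -d)"
seq 5 > nums
shuf nums | sort -n | diff - nums && echo "same lines, order aside"
```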
bug#7228: coreutils-8.6 sort-float failure on ppc
Jim Meyering writes: Gilles Espinasse wrote: Just tested 8.6 on linux glibc-2.11.1/gcc-4.4.5 LFS build on x86, sparc and ppc First a good news is that on sparc (32-bits), 8.6 test suite is now passing I didn't report yet a failure on misc/stty which was Failure was + stty -icanon stty: standard input: unable to perform all requested operations Is that consistently reproducible? If so, you might want to investigate, but it's probably not a big deal. I've seen that error message before, and I did investigate. It was caused by glibc's tcsetattr()/tcgetattr() being too clever, trying to support fields that didn't exist in the kernel's termios struct. The kernel struct is arch-specific so it's not surprising that an arch-specific bug would show up here. I've only seen it with speed changes. stty 115200 < /dev/ttyS0 makes the change successfully, but complains. The kernel termios struct may or may not have separate speed fields for input and output, but glibc likes to pretend that they're both there, and somehow stty gets confused by glibc's fakery. strace doesn't give any clues because it shows the real kernel structures. See sysdeps/unix/sysv/linux/{speed,tc[gs]etattr}.c in glibc source for the full ugliness.
bug#7247: readdir obsoleteness?
Ian Martin writes: A message containing only ASCII characters which was nevertheless encoded as quoted-unreadable, with its original newlines senselessly escaped, and then more newlines injected, forming a bricktext with continuation markers. Does yahoo send them out like this or is it a mailing list manager hatchet job? Reformatted for sanity: Hi, just trawling the webpages, I got caught in a loop. The syscalls page states: Don't read man pages you find on the web, unless you're deliberately looking for information on old systems. Up to date man pages for Linux are at ftp.$COUNTRYCODE.kernel.org:/pub/linux/docs/man-pages ... Then there is __NR_readdir corresponding to old_readdir(), which will read at most one directory entry at a time, and is superseded by sys_getdents(). however, on the getdents man page: Description: This is not the function you are interested in. Look at readdir(3) for the POSIX conforming C library interface. This page documents the bare kernel system call interface. You started at readdir(2), you ended at readdir(3). That's not a loop. readdir(3) is the POSIXly portable C-level interface. readdir(2) and getdents(2) are the Linux-specific implementations which you don't need to know about unless you're writing code at or below the libc layer. -- Alan Curry
bug#7433: ls: [manual] description for --directory is insufficient
Eric Blake writes: The wording for -d may not mention it, but the wording at the very beginning of the --help and man page is clear that:

| Usage: ls [OPTION]... [FILE]...
| List information about the FILEs (the current directory by default).

In other words, to correctly predict the behavior of ls -d you must read two pieces of information that are not immediately adjacent to each other, and use a minimal amount of thought to decide whether and how they influence each other.

For people who read documentation all the way through, knowing that a thorough understanding of the available tools will be a long-term benefit, this is not a problem. Let's call these people the smart bears. They'll get the garbage can open easily because they're patient. For people who only skim documentation, and not even that until they have a problem, the obstacle is larger. If there isn't a single sentence that tells them everything they need to know, they're not going to get it. Let's call these people the dumb tourists. They're impatient with the garbage can latch, because they're holding a smelly bag of garbage.

Smart bears see a thick instruction manual and say "Hooray! Proper documentation! I won't have to guess how it works." Dumb tourists see a thick instruction manual and say "Screw that, reading sucks, I can guess how it works." man pages are written by and for smart bears. Dumb tourists don't write documentation. Sometimes they write web pages which they optimistically call documentation.

Making documentation dumb-tourist-friendly inevitably makes it longer, because it has to have a clause for each goal that the reader might want to achieve, instead of just listing the facts and expecting the reader to be able to put them together. The increased length bothers the smart bears since it increases the time required to read the documentation all the way through.
In the case of ls, I suggest that -d is special enough (since it affects how the non-option arguments are used, unlike other ls options) that a little extra length is justified. It would be reasonable to provide 2 separate SYNOPSIS lines, something like this:

SYNOPSIS
       ls [OPTION]... [FILE]...
       ls -d [OPTION]... [FILE]...

DESCRIPTION
       The first form lists the given FILEs, and if any of them are
       directories, the directory contents are listed. If no FILEs are
       given, the contents of the current directory are listed.

       The second form (with -d) lists the given FILEs, but any FILE
       that is a directory will not have its contents listed. With no
       FILEs given, the current directory (not its contents) is listed.

I don't care how you'd translate that to/from --help. I care about man pages, not --help. If that seems like giving too much attention to -d, how about this alternative: add an EXAMPLES section. Dumb tourists love EXAMPLES sections, and smart bears can safely skip them. It's a little bit ridiculous that cat(1) has examples and ls(1) doesn't. ls has a lot more options. And the conflict between -R and -d should be explicitly mentioned. One of them makes the other meaningless, and we should say which one. -- Alan Curry
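[A concrete pair of commands showing the two forms described above; the directory and file names are invented for the demo.]

```shell
# A directory with a single entry.
mkdir -p demo_dir
touch demo_dir/inside

ls demo_dir      # prints: inside     (first form: the contents)
ls -d demo_dir   # prints: demo_dir   (second form: the directory itself)
```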
bug#7450: cp does not check for errno=EISDIR when target is dangling link
Марк writes:

How to reproduce:
$ ln -s non-exist tgt
[...]
$ cp /etc/passwd tgt/
cp: cannot create regular file `tgt/': Is a directory
Novices can not understand this message :)

The same confusing error message also occurs in the simpler case where the target is simply any nonexistent name with a trailing slash.

$ ls -ld tgt
ls: cannot access tgt: No such file or directory
$ cp /etc/passwd tgt/
cp: cannot create regular file `tgt/': Is a directory

strace shows this:

open("tgt/", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0600) = -1 EISDIR (Is a directory)

which I think is just bad kernel behavior. There's no errno (among the classical errno values anyway) which completely expresses "you tried to creat something with a trailing slash", but I'd rather see ENOENT or ENOTDIR than EISDIR. When this open fails, nothing in sight "Is a directory". The problem is that tgt/ _would_ be a directory, but since it doesn't exist and isn't about to be created, "Is a directory" is an overstatement. cp should dodge this issue by never calling creat with a trailing slash. It could supply a meaningful error message instead of one that is derived from an errno. -- Alan Curry
bug#7450: cp does not check for errno=EISDIR when target is
Jim Meyering writes: Thanks for the suggestions. Here's the patch I'm considering: [I first patched copy.c's copy_reg, included at end, but didn't like that as much; the core copying code should not be encumbered like that, and other users of copy.c are not affected.]

Cross-filesystem mv is pretty much the same. If one is confusing enough to justify a change, I think the other is too.

$ cd /tmp
$ touch foo
$ ls -ld $HOME/nosuch
ls: cannot access /home/pacman/nosuch: No such file or directory
$ mv foo $HOME/nosuch/
mv: cannot create regular file `/home/pacman/nosuch/': Is a directory
$

-- Alan Curry
bug#8079: rm command problem/suggestion
Luca Daniel writes: Hi there :-) I have a problem and a suggestion: 1) The problem: I can't find an easy way to remove a type of file through all sub-directories with the GNU tool rm (remove). There is no option to search through all sub-folders, only the current working directory. Back when I used Windows this was easy with the command: del /s *.pdf

You misplace the blame on rm; the problem is that the standard unix shell doesn't have recursive globbing. Doing it in the shell means that all utilities benefit. rm is just one of them. zsh does recursive globbing with a double-asterisk, so that for example rm **/*.pdf would get rid of all files named *.pdf anywhere under the current directory. bash also knows about the ** recursive glob, but I recommend zsh because it has a lot more cool features, like **/*.(pdf|ps)(m+30Lk-500) (recursive directory search, all files named *.pdf or *.ps, whose last modification was more than 30 days ago, with a size less than 500k) -- Alan Curry
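[In bash the same thing needs the globstar option switched on first; a throwaway sketch, with scratch files invented for the demonstration.]

```shell
#!/bin/bash
shopt -s globstar   # enable ** recursive globbing (off by default in bash)

# Scratch tree for the demonstration.
mkdir -p a/b/c
touch top.pdf a/mid.pdf a/b/c/deep.pdf a/keep.txt

# **/*.pdf matches *.pdf at any depth under the current directory.
rm ./**/*.pdf

ls a/keep.txt   # the non-PDF survives; the PDFs are gone
```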
bug#8090: strerror(1) and strsignal(1)?
Bruce Korb writes:

Hi Jim,

On 02/20/11 15:20, Jim Meyering wrote: Bruce Korb wrote: Hi Bruce, [your subject mentions strsignal -- you know you can get a list via env kill --table, assuming you have kill from coreutils?]

What's the installation rate of coreutils-kill vs. procps kill? Debian chooses procps kill (except on Hurd and maybe freebsd-kernel).

I've had that itch many times. Here are some handy bash/perl functions I wrote:

Yep. I know one can get to it via perl. OTOH, _you've_ had that itch many times, Padraig's had that itch many times, and I'd take a wild guess that there have been a few others, too. So it still remains for the itchy folks to drag something around to new places whenever they go to a new environment. Were it in coreutils, it would likely be more easily found.

You guys don't perl-golf well. perl -E'say$!=11', or for older perls, perl -le'print$!=11'

It also fits well with my pet theory that library function names ought to have same-named commands lying about. Thus, if you can remember strerror(3p), then by golly there's a strerror(1), too, with obvious options (none, in this case) and operands.

The important thing is that when you need to use this utility, you report a bug on the program that printed a number instead of calling strerror(3) itself. Error numbers are not a user interface, regardless of Microsoft's attempt to train people otherwise.

Nice. I've copied them into my shell functions directory. I still think strerror(3p) ought to imply a strerror(1) command, but I leave it to you to decide. It's just my preference.

Just as write(2) implies write(1), and time(2) implies time(1). Or something like that. -- Alan Curry
bug#8103: NUL terminated lines
Bjartur Thorlacius writes: On 2/24/11, Jim Meyering j...@meyering.net wrote: Bjartur Thorlacius wrote: Maybe we should modify tac to add the -z option. Would you care to write a patch? It would be redundant, as tac -s $'\0' is equivalent. Note that a $'\0' argument in a shell command line is exactly equivalent to an empty string, since it must be passed from the shell to the program using execve() which takes NUL-terminated strings. There is no way to run a program with an actual NUL byte contained in one of its arguments. execve will stop copying at the NUL, and even if it didn't, the new program receives its arguments in int argc, char **argv form so how is it supposed to know that there's a NUL in there that's not a terminator? This limitation can't be avoided. It's not just a C language thing. The execve interface is based on NUL-terminated strings at the asm level too. If tac -s $'\0' did something different from tac -s '', it could only have been a shell builtin. (Assuming the shell supported the $'...' notation at all) -- Alan Curry
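[The collapse described above is easy to demonstrate; the $'...' form is bash/zsh/ksh syntax.]

```shell
# $'\0' collapses to an empty argument before execve() ever runs:
printf '<%s>\n' $'\0'   # prints: <>

# ...exactly like an explicitly empty string:
printf '<%s>\n' ''      # prints: <>
```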
bug#8102: [head] do not return EXIT_SUCCESS upon premature EOF
Bjartur Thorlacius writes: On 2/23/11, Eric Blake ebl...@redhat.com wrote: On 02/23/2011 11:58 AM, Bjartur Thorlacius wrote: That's because this is not a bug, but a POSIX requirement: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/head.html "When a file contains less than number lines, it shall be copied to standard output in its entirety. This shall not be an error." Indeed. Since it's explicitly mentioned, I assume there's a reason for it. I'd be grateful if someone could point out what the rationale behind the decision is (or better yet, where such information can be found). So should I be using a head-alike for iterating over lines, and would such a utility belong in a GNU package, or is awk the right tool for the job?

Here's what an iterate-over-lines loop normally looks like in a shell script:

while read -r line
do
  something "$line"
done

The idea of using head to control a loop means you are either a newbie who didn't know about read, or you are trying to do something subtly different which I didn't understand. Excuse me if I guessed the wrong one. -- Alan Curry
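[The same loop fed from a file rather than stdin, as a runnable sketch; the file name is arbitrary.]

```shell
# Small input file for the demonstration.
printf 'alpha\nbeta\ngamma\n' > lines.txt

# IFS= and -r keep leading whitespace and backslashes intact.
while IFS= read -r line
do
  echo "got: $line"
done < lines.txt
```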
bug#8231: Bug in the linux command: tail
Eric Blake writes: Besides, we already have the convention that long options that require an argument mean that the associated short option also requires an argument. That is, we are already consistent in writing -n, --lines=K as shorthand for: -n K OR --lines=K

And a stupid convention it is. An equals sign that's distributive over a comma! Anywhere else, the comma is a very-low-precedence operator. The fact that the comma has whitespace on one side, while the equals sign has no adjacent whitespace, provides reinforcing evidence that the equals sign should bind tighter. But it doesn't. The reader's only hope is to infer this GNU anti-readability convention by reading the man page for a command they already know how to use, and then apply that convention when reading about other commands. The Linux man-pages project should take on section 1. -- Alan Curry
bug#8231: Bug in the linux command: tail
Eric Blake writes: On 03/11/2011 10:03 AM, Roger N. Clark wrote: This works on HP-UX. Since we're already dealing with two GNU extensions, I don't see why we can't be nice and make the shorter syntax work the way HP-UX is doing things. Patches welcome! Really? tail -N of multiple files used to work with GNU tail, before someone broke it[1]. I have a local patch that I've been using since the breakage first bit me. Before the breakage, a comment explicitly stated that multiple filename arguments were allowed. That comment was replaced with one explicitly stating the opposite. Looks like an intentional feature removal to me. Patches welcome? How about a revert? [1] commit 99f09784cc98732a440de86bb99a46f11f7355d8 -- Alan Curry
bug#8408: A possible tee bug?
George Goffe writes: Howdy, I have run several scripts and seen this behavior in all cases... tee somescript | tee somescript.log 2>&1 The contents of the log is missing a lot of activity... messages and so forth. Is it possible that there are other file descriptors being used for these messages?

I can't tell what you're trying to do from this incomplete example, but it looks like you're expecting the 2>&1 to do something other than what it's actually doing. It's only pointing the second tee's stderr to wherever its stdout was going. If the above pipeline is run in isolation from an interactive shell prompt, the 2>&1 is accomplishing nothing at all, since stderr and stdout will already be going to the same place (the tty) anyway. tee's stderr will normally be empty; it would only print an error message there if it had trouble writing to somescript.log. Post a more complete description of your intent. -- Alan Curry
bug#8408: A possible tee bug?
George Goffe writes: Alan, Oops. I goofed... My apologies. The example would be this: somescript | tee somescript.log 2>&1. The intent is to capture all the output (stdout and stderr) from somescript. somescript runs several commands that may or may not utilize other FDs. I was hoping to get a better output than what you might get from the script command, which records all the messages + a ton of other things like escapes which are a pain to eliminate. Does this make better sense?

Well, you still have the 2>&1 in the wrong place. If you want it to affect the stderr of the command to the left of the pipe, you have to put it to the left of the pipe.
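[A runnable sketch of the corrected pipeline, with a tiny shell function standing in for somescript; the function and its output lines are invented for the demo.]

```shell
# Stand-in for somescript: one line to stdout, one to stderr.
somescript() {
  echo 'normal output'
  echo 'error output' >&2
}

# 2>&1 sits to the LEFT of the pipe, so stderr joins stdout inside
# the pipe and tee records both streams.
somescript 2>&1 | tee somescript.log

cat somescript.log   # both lines are in the log
```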
bug#8511: Sort error in makefile
Harpal Shergill writes: Hello, I have a makefile which does the following: 1. grab a .gz file from a server using wget -- works fine 2. extract the data from the .gz file to a new file based on a filter using zcat -- works fine 3. sort the data based on a specific field and save it into a new file -- DOES NOT WORK - the command looks like this: sort -t: -k2n inputFile outputFile - this command works perfectly on the command line of cygwin BUT fails through the makefile - Error says: Input file specified two times. I tried to search for this online but couldn't get any info. Can you please help and show me what's wrong? In the makefile I have this command enclosed with the ` character. Any feedback on this would be greatly appreciated.

Your makefile is running the DOS/Windows sort command instead of the GNU/cygwin sort. Use a full path like /whatever/cygwin/bin/sort to make it use the right one. cygwin's bug, if a bug at all... -- Alan Curry
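[One way to confirm which sort is being picked up; the cygwin path shown in the comment is the same placeholder used above.]

```shell
# Which sort does the shell resolve first on PATH?
command -v sort

# In the makefile rule, the fix is an absolute path, e.g.:
#   /whatever/cygwin/bin/sort -t: -k2n inputFile outputFile
```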
bug#8423: Questions about checking out 6.7 using git
Jim Meyering writes: tags 8423 notabug close 8423 thanks That's disappointing. I was looking forward to seeing a response to this question. I also recently tried to find the origin of a bug with git bisect and quickly ended up with an uncompilable mess. If you want to see a complete demonstration, I could make another attempt and log it all. But if you're trying to tell us that checking out old versions from the repository and compiling them shouldn't be expected to work... then you're wrong. -- Alan Curry
bug#8578: 8.12 and 8.10 'ls -dl' appends ' ' (0x20: space) to
Eric Blake writes: On 04/28/2011 12:34 PM, Jason Vas Dias wrote: I do:

$ ls --version | grep '[(]G'
ls (GNU coreutils) 8.12

Thanks for the report.

$ ls -dl /. | od -cx

od -cx is not always the best choice in formatting - it depends on the endianness of your machine since it groups two bytes at a time. I personally like 'od -c -tx1z' better for the type of output you are wanting.

0000000   d   r   w   x   r   -   x   r   -   x   .       2   5   r
             7264    7877    2d72    7278    782d    202e    3532    7220

Did anyone else notice the '.' after the drwxr-xr-x part? I bet that's what's confusing python.

The file mode written under the -l, -g, -n, and -o options shall consist of the following format: "%c%s%s%s%c", entry type, owner permissions, group permissions, other permissions, optional alternate access method flag. The optional alternate access method flag shall be a single space if there is no alternate or additional access control method associated with the file; otherwise, a printable character shall be used.
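[The flag character can be picked out mechanically; on SELinux systems it is '.', '+' when an ACL is present, and a space otherwise.]

```shell
# Column 11 of the long listing is the alternate access method flag.
flag=$(ls -ld / | cut -c11)
printf 'flag: [%s]\n' "$flag"
```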
bug#8587: Curious bug.
Francois Boisson writes: On a debian squeeze amd64. francois@totoche:~$ echo ABCD Directory | tr [:lower:] [:upper:] ABCD DIRECTORY francois@totoche:~$ cd /tmp francois@totoche:/tmp$ echo ABCD Directory | tr [:lower:] [:upper:] tr: construit [:upper:] et/ou [:lower:] mal aligné I can't read that error message but I can see what you did wrong. [:upper:] is seen by the shell as a glob which matches these filenames: : e p r u and likewise [:lower:] matches a different set of single-character filenames. In one directory, you don't have any files named like that. In the other directory, you do. When the glob matches nothing, the shell passes the string [:upper:] or [:lower:] literally as an argument to the command. That's a design flaw in the unix shell from its early days, which nobody has the guts to fix. Use '[:upper:]' and '[:lower:]' to make the shell treat them as literal strings and not globs. Switch to zsh for better diagnostics... % echo ABCD Directory | tr [:lower:] [:upper:] zsh: no matches found: [:lower:] % echo ABCD Directory | tr '[:lower:]' '[:upper:]' ABCD DIRECTORY -- Alan Curry
bug#8604: Linux mime help needed
Eric Blake writes: $ file mmencode mmencode: PA-RISC2.0 shared executable dynamically linked - not stripped If you want to run a binary on a different platform, you have to recompile it from source for that platform. Do you have the source for mmencode? If not, then I don't see how you can expect to migrate to a different operating system and hardware. But again, that's outside the scope of Coreutils.

For the record, mmencode is found in the metamail package, and also comes with elm. Source is still findable, even though it's dropped out of the "packaged by the OS distributor" level of popularity. (Curious header watchers will have noticed I'm an elm user. And yep, I used mmencode to decode the mmencode in the original question.) The coreutils equivalent is base64(1). After a rewrite with mutt, the whole script might be a one-liner. -- Alan Curry
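[The base64(1) round trip, for reference; the sample string is invented.]

```shell
printf 'hello' | base64          # prints: aGVsbG8=
printf 'aGVsbG8=' | base64 -d    # prints: hello
```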
bug#8609: $GZIP doesn't mean what you think it means.
Let me show you what happens if I try to clone coreutils from git and compile in the most straightforward way possible:

% git clone git://git.sv.gnu.org/coreutils
Cloning into coreutils...
remote: Counting objects: 151287, done.
remote: Compressing objects: 100% (37539/37539), done.
remote: Total 151287 (delta 113807), reused 150796 (delta 113449)
Receiving objects: 100% (151287/151287), 26.95 MiB | 767 KiB/s, done.
Resolving deltas: 100% (113807/113807), done.
Script started on Tue May 3 00:40:20 2011
% cd coreutils
% ./bootstrap
./bootstrap: Error: '-9' not found
./bootstrap: See README-prereq for how to get the prerequisite programs
%

On seeing that the first time, I immediately knew what happened and worked around it... then quickly got into trouble trying to git bisect something and forgot about it. Now I've repeated the process (including getting into trouble with git bisect but that's for later) and decided that this bug, though easily worked around, deserves to be reported. bootstrap wrongly assumes that if there's a GZIP environment variable, it must contain the pathname of a gzip program. gzip is a tool we use all the time, so I would have hoped that GNU developers would have read its documentation, but apparently not. The GZIP environment variable is used to pass default options, hence the GZIP=-9 which has been in my environment for a long time. If you even tried putting GZIP=/bin/gzip in the environment, you'd find that gzip no longer works properly, because it acts as if an extra /bin/gzip was given on the command line... and if you did it as root, congratulations, you just gzipped your gzip program. Surely gzip has the authority to define the semantics of the GZIP environment variable, and bootstrap should not be making the unwarranted (and obviously untested) assumption that it means something different. I assume this bug resulted from an over-generalization of the pattern CC=gcc, MAKE=gmake, ...
In the environment of my current login shell, there are 10 environment variables with names that (after tr A-Z a-z) are also programs in my PATH. Of those 10, 3 follow the $GZIP pattern where the value of the environment variable is a list of options for the command. Another 3 fit the pattern COMMAND=/path/to/implementation/of/command. Neither pattern is a reliable predictor of the semantics of an arbitrary environment variable. -- Alan Curry
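[The documented meaning of $GZIP is a default option list, which is easy to confirm; note that newer gzip releases deprecate the variable with a warning on stderr, but still honor options like -9.]

```shell
# GZIP holds default options, not a program path: this compresses at
# level 9 and round-trips cleanly.
echo 'payload' | GZIP=-9 gzip 2>/dev/null | gzip -d
# prints: payload
```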
bug#8643: 'who' command bug
Bob Proulx writes: ding bat wrote: I was using Who to list all users connected to pptpd vpn server with maverick 10.10. I put natty on the computer and now the who command does not list out the vpn users. It only seems to list out local logged in user. any thoughts, this issue is killing me. I can still do 'last |grep ppp' and get joy, but 'w' and 'who' were very nice. [...] I am not an Ubuntu user and do not have a system to test with and so I do not know what programs you would be running when you log in with Ubuntu's Natty. You will need to look at your system and determine what login manager you are using. This is probably gdm but might be gdm3 but possibly one of several others. You will need to determine what terminal program you are using. This is probably gnome-terminal but possibly one of several others. Both of those programs either should (or should not) be logging user login information to utmp. Bob apparently doesn't know what pptpd is. Or what VPN means. Or what PPP is. Or didn't read very carefully. But he's probably right anyway. The bug is more likely to be in pppd than anywhere else. It's weird that it would write to wtmp (for last) but not utmp (for who). Check the config files for recent changes, and if you can't find the cause, find someplace that gives help with pppd. An strace of the pppd process during connection setup could be enlightening. -- Alan Curry
bug#8766: Bug in sha1sum?
Theo Band writes: Hi I'm not sure, but I think I found a bug in sha1sum. It's easy to reproduce with any file that contains a backslash (\) in the name:

$ echo test > test
$ sha1sum test
4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  test
$ mv test 'test\test'
$ sha1sum 'test\test'
\4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  test\\test

I expect the file sha1sum to be the same after renaming the file (a backslash is prepended to the otherwise correct result).

This result violated my expectations too, but it turns out to be a documented feature:

For each FILE, `md5sum' outputs the MD5 checksum, a flag indicating a binary or text input file, and the file name. If FILE contains a backslash or newline, the line is started with a backslash, and each problematic character in the file name is escaped with a backslash, making the output unambiguous even in the presence of arbitrary file names. If FILE is omitted or specified as `-', standard input is read.

(the sha*sum utilities all refer back to md5sum's description) I better go fix all my scripts that rely on /^[0-9a-f]{32} / -- Alan Curry
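[A minimal reproduction of the escaping, with scratch file names invented for the demo.]

```shell
# Same content, two names; the backslashed name gets the escaped form.
echo test > plain
echo test > 'back\slash'

sha1sum plain
# prints: 4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  plain

sha1sum 'back\slash'
# prints: \4e1243bd22c66e76c2ba9eddc1f91394e57f9f83  back\\slash
```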
bug#8938: make timeout and CTRL-C
Pádraig Brady writes: On 26/06/11 20:20, shay shimony wrote:

all:
	timeout 12 sleep 10

Note there is a tab before timeout 12 sleep 10. Then run make in the directory where the file is located, and try to press CTRL-C. Notes: CTRL-Z works. When executing timeout without make, CTRL-C works. When executing make without timeout, CTRL-C works.

Drats. That's because SIGINT is sent by the terminal to the foreground group. The issue is that `make` and `timeout` use much the same method to control their jobs, i.e. they create their own process group so they can terminate all sub-processes.

Are you sure? I see no evidence of that. When I run make with the above makefile, the processes look like this:

 PPID   PID  PGID   SID TTY   TPGID STAT  UID  TIME COMMAND
    1  1451  1451  1451 6     16407 S    1000  0:06 -zsh
 1451 16407 16407  1451 6     16407 S    1000  0:00 make
16407 16408 16408  1451 6     16407 S    1000  0:00 timeout 60 sleep 30
16408 16409 16408  1451 6     16407 S    1000  0:00 sleep 30

The first PGID is the login shell. The second PGID is make, which was put into its own process group by the shell, because the shell has job control enabled. The last PGID is timeout, which put itself into a process group. make never noticed any of them. In the source for GNU make 3.82 there are no calls to setpgrp or setpgid (unless obfuscated from grep). There is the following comment:

/* A termination signal won't be sent to the entire process group, but it means we want to kill the children. */

That's above the handling of SIGTERM, which iterates over child processes and passes along the SIGTERM to them. After that is the handling of SIGINT, which doesn't kill child processes (unless they're remote, which is... news to me that make does remote things) but just waits for them. What seems to be happening is that make *doesn't* create a process group, therefore assumes that when it gets a SIGINT, its children have already gotten it too, and it just waits for them to die.
A child that puts itself into a new process group screws this up (as would kill -2 `pidof make`). I think the answer is that timeout should put itself into the foreground. That way it would get the SIGINT. make wouldn't get it, but wouldn't need to. timeout would exit quickly after SIGINT and make would proceed or abort according to the exit code. -- Alan Curry
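[That timeout puts itself into its own process group is easy to observe from a shell, using procps ps; a short-lived sketch.]

```shell
# Start timeout in the background and compare its PID and PGID.
timeout 5 sleep 1 &
pid=$!
sleep 0.3              # give timeout a moment to call setpgid()

pgid=$(ps -o pgid= -p "$pid" | tr -d ' ')
echo "pid=$pid pgid=$pgid"   # pgid equals pid: its own group leader
wait
```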
bug#8938: make timeout and CTRL-C
Pádraig Brady writes: On 27/06/11 21:12, Alan Curry wrote: What seems to be happening is that make *doesn't* create a process group, therefore assumes that when it gets a SIGINT, its children have already gotten it too, and it just waits for them to die. A child that puts itself into a new process group screws this up (as would kill -2 `pidof make`). Thanks for the analysis Alan. Yes, you're right I think. In any case the important point is that timeout sets itself as group leader, and is not the foreground group.

Right, we have a tree of process groups that goes roughly shell -> make -> timeout, and the one in the middle of the tree is the foreground, receiving tty-based signals.

I think the answer is that timeout should put itself into the foreground. That way it would get the SIGINT. make wouldn't get it, but wouldn't need to. timeout would exit quickly after SIGINT and make would proceed or abort according to the exit code.

I've a version locally here actually that calls tcsetpgrp(), but I discounted that as it's not timeout's place to call that, I think. timeout sets itself as group leader so that it can kill everything it starts, but it shouldn't need to grab the foreground group as the shell (or make) may be starting it in the background etc.

It seems like this is a misuse of process groups, using them as if they were a handle for killing a whole tree of processes. That's not what they're for. Process groups were invented to support job control, which means the only program that was supposed to mess with them was csh. Only the lack of a "kill process tree" primitive (and the fact that you can't even query the process tree easily) tempts us into using process groups as a shortcut. Any non-job-control-aware parent process will have a problem with timeout's behavior. We've already seen what GNU make does.
pmake simply dies of the SIGINT and leaves the child processes lingering (it probably also assumes they got the SIGINT, and doesn't bother waiting for them). In an interactive shell with job control disabled (set +m in most Bourne-ish shells), the behavior is not good there either. dash, bash, and posh all act like GNU make, appearing to ignore the SIGINT. zsh acts more like pmake, printing a new prompt but leaving the timeout and its child running. timeout's pgrp behavior only appears harmless when the parent process is a shell with job control, which expects its children to be in separate process groups. But in that case, timeout doesn't need to put itself in a new process group because the shell has already done so. So I suggest that if you create a process group, you take on the responsibility of behaving like a job control shell in other ways, including managing the foreground group. (An important piece of that is remembering the original value and restoring it before you exit). -- Alan Curry
bug#8938: make timeout and CTRL-C
Pádraig Brady writes: I'm still not convinced we need to be messing with tcsetpgrp(), but you're right in that the disconnect between the timeout process group and that of whatever starts `timeout` should be bridged. I'm testing the attached patch at the moment (which I'll split into 2). It only creates a separate group for the child that `timeout` execs, leaving the timeout process in the original group to propagate signals down. I'll need to do lots of testing with this before I commit.

With this patch the child is guaranteed to not be in the foreground (as far as the tty knows), so it will be getting SIGTTIN and possibly SIGTTOU on tty operations. I don't think there's anything that will make every scenario happy. (Except for a recursive-kill that doesn't use pgrps!) -- Alan Curry
bug#8938: make timeout and CTRL-C
Bob Proulx writes: Pádraig Brady wrote: Paul Eggert wrote: I'd like to have an option to 'timeout' so that it merely calls alarm(2) and then execs COMMAND. This would be simple and fast, and would avoid the problem in question. This approach has its own issues, but when it works it works great, and it'd be a nice option. I agree. It is nice and simple and well understood. The main problem with that is it would only send the signal to the first process, and any processes it started would keep running. Then that is a problem for that parent process, to keep track of its own children. It is a recursive situation. If all processes are well behaved then it works okay. And if you ask about processes that are not well behaved then my response would be to fix them so that they are better behaved.

That sounds reasonable, but then if something is about to be killed by timeout, there's reason to believe it's not behaving well at the moment. -- Alan Curry
bug#8938: make timeout and CTRL-C
shay shimony writes: With this patch the child is guaranteed to not be in the foreground (as far as the tty knows), so it will be getting SIGTTIN and possibly SIGTTOU on tty operations. You may need to correct me. In practice we see that the timed-out program performs writes to the terminal successfully, though it belongs to a different group than the foreground (in my case make's group is in the foreground and the timeout+compiler/test group is in the background, and all output of the compiler and test seems to appear correctly on the terminal). And regarding read, I think it makes sense enough that users will not use timeout for interactive programs that wait for input from the user. So maybe the fact that the timed-out program will not be able to get SIGTTIN and SIGTTOU is not such a disaster?

Notice that I wrote "possibly" before SIGTTOU. There was a reason for that. A background process that writes to the tty will get SIGTTOU if stty tostop is in effect. This is a user preference thing. You can set it if you get annoyed by processes writing to the terminal after you backgrounded them expecting them to be quiet. It's not enabled by default. If the process ignores SIGTTOU, the write will proceed, overriding the user's expressed preference. SIGTTIN is more forceful. There's no stty flag to turn it off, and ignoring it results in EIO. Keyboard input always belongs exclusively to the foreground job. For completeness I'll also mention that SIGTTOU will also be sent to a background process that attempts to change the tty settings, even if tostop is not enabled. -- Alan Curry
bug#8938: make timeout and CTRL-C
Pádraig Brady writes: Given the above setsid make example (which hangs for 10s ignoring Ctrl-C), I'm leaning towards `make` needing to be more shell like, or at least forward the SIGINT etc. to the job, and not assume jobs run in the foreground group.

I'm a little worried that you're focusing too much on make, which is just one way to demonstrate the problems of process group abuse. This simple shell script:

#!/bin/sh
timeout 12 sleep 10

is also nonresponsive to ^C for the same reason as the original makefile. Are you going to argue that the shell is doing something wrong there too? -- Alan Curry
bug#9102: timeout 0 FOO should timeout right away
Paul Eggert writes: sleep 0 sleeps for zero seconds, and timeout 0 FOO should timeout in zero seconds as well. Currently, it doesn't; it times out in an infinite number of seconds. I see why, from the internals (alarm (0) is a special call intended to cancel alarms). However, 'timeout' shouldn't be exposing those internals to users; it should behave like 'sleep' does, as that's more consistent. What's the difference between running a command with a 0 second timeout and not running the command at all? It could be killed before it even gets scheduled. -- Alan Curry
bug#9531: md5sum: confusing documentation for file type output
Rüdiger Meier writes: Or in other words you can never validate a ' ' md5sum unless you know about the platform which calculated it.

Not exactly. It works transparently if the appropriate translation was done at the time of the transfer from the original platform to the current platform (e.g. FTP'ed in text mode, not binary mode). If you transferred a text file in binary mode, you have something that's bitwise identical, but semantically different, so md5sum is right to complain. Well, at least the above would apply if you believed that having a text file/binary file distinction is a good idea at all. Which I don't. -- Alan Curry
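The bitwise-identical-or-not point is easy to demonstrate: the same text carried with Unix versus DOS line endings hashes differently. A small Python sketch:

```python
import hashlib

# The same line of text, once with a Unix line ending and once with DOS CRLF:
unix_text = b"hello world\n"
dos_text = b"hello world\r\n"

unix_md5 = hashlib.md5(unix_text).hexdigest()
dos_md5 = hashlib.md5(dos_text).hexdigest()
print(unix_md5 == dos_md5)  # False: semantically the same text, different digests
```

This is exactly why a text-mode transfer (which rewrites the line endings) preserves verifiability while a binary-mode transfer of a "text" file does not.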
bug#9620: dd: bogus behavior when interrupted
Pádraig Brady writes: BTW that ^C being displayed (started around Fedora 11 time (2.6.30)) is very annoying, especially when inserted in the middle of an ANSI code. I mentioned that previously here: http://mail.linux.ie/pipermail/ilug/2011-February/106723.html

I've been annoyed by that too. So annoyed that I patched my kernel to get rid of it. It was added between 2.6.24 and 2.6.25. Here's the commit message:

|commit ec5b1157f8e819c72fc93aa6d2d5117c08cdc961
|Author: Joe Peterson j...@skyrush.com
|Date: Wed Feb 6 01:37:38 2008 -0800
|
|    tty: enable the echoing of ^C in the N_TTY discipline
|
|    Turn on INTR/QUIT/SUSP echoing in the N_TTY line discipline (e.g. ctrl-C
|    will appear as ^C if stty echoctl is set and ctrl-C is set as INTR).
|
|    Linux seems to be the only unix-like OS (recently I've verified this on
|    Solaris, BSD, and Mac OS X) that does *not* behave this way, and I really
|    miss this as a good visual confirmation of the interrupt of a program in
|    the console or xterm. I remember this fondly from many Unixs I've used
|    over the years as well. Bringing this to Linux also seems like a good way
|    to make it yet more compliant with standard unix-like behavior.
|
|    [a...@linux-foundation.org: coding-style fixes]
|    Cc: Alan Cox a...@lxorguk.ukuu.org.uk
|    Signed-off-by: Andrew Morton a...@linux-foundation.org
|    Signed-off-by: Linus Torvalds torva...@linux-foundation.org

And here's what I use to kill it (committed to my own git tree which is exported to no one and has been seen by nobody but me until now):

commit 0b76f0a49a52ac37fb220f1481955426b6814f86
Author: Alan Curry pac...@kosh.dhis.org
Date: Wed Sep 22 16:35:01 2010 -0500

    The echoing of ^C when a process is interrupted from tty may be more like
    what the real unixes do, but this is a case where Linux was better. Put it
    back the way it was.
    When a command's output ends with an incomplete line, the shell can do
    one of two things, both of them bad: engage its command line editor with
    the cursor in the wrong column, or force the cursor to the first column
    before printing the prompt, which obliterates the incomplete line, hiding
    actual program output. The echo of ^C immediately followed by process
    death is an instance of this generally bad "command output ends with
    incomplete line" behavior.

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index c3954fb..70f5698 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1194,10 +1194,12 @@ send_signal:
 	}
 	if (I_IXON(tty))
 		start_tty(tty);
+#if 0 /* This echoing is a sucky new feature. --Pac. */
 	if (L_ECHO(tty)) {
 		echo_char(c, tty);
 		process_echoes(tty);
 	}
+#endif
 	if (tty->pgrp)
 		kill_pgrp(tty->pgrp, signal, 1);
 	return;

-- Alan Curry
bug#9788: chown gets permission denied
Richard Woolley writes: When trying to change ownership of the files in a directory, I mistakenly had the settings wrong in the command, so I got the following:

$ ls -l
total 16
drw-rw-r--  4 user proj1 4096 Sep 28 14:23 doc/
drw-rw-r-- 24 user proj1 4096 Sep 28 14:27 modules/
drw-rw-r--  3 user proj1 4096 Sep 28 14:23 project/

Your first problem is that you've got some directories here with read permission but no x permission. In that situation, this happens:

$ ls -l project
total 0
?- ? ? ? ?? compile.conf
?- ? ? ? ?? myproject.conf
?- ? ? ? ?? novas_fli.so

ls can read the directory, getting the filenames, but the lack of x permission prevents it from getting any other information. First chmod u+x doc modules project, then see what you get from ls -l on them. -- Alan Curry
bug#9939: Problems with the SIZE description in man pages for ls
abdallah clark writes: I was wondering if you received my very detailed account of the issues I found with the ls -l --block-size=SIZE command. It's been about a week since I sent it, so I wasn't sure what was happening.

I looked over that message and prepared a reply explaining the things that you had misunderstood. Then I tried running your examples and realized that I didn't understand some of them either. According to my understanding, several of the behaviors you observed are bugs. So I deleted my reply and decided to wait along with you for someone else to explain it all. Since that hasn't happened yet, I'll go ahead and cover the main point: You're interested in altering the block size used in the ls output, but you haven't investigated what portions of the output are affected by block size. There are 3 instances of the word "block" in ls(1). 2 of them are in the description of the options that change the block size: --block-size and -k. The 3rd instance is under the only option that actually makes use of the block size: -s.

A quick demonstration of -k working. First I have to set POSIXLY_CORRECT because the default block size when not in POSIXLY_CORRECT mode is already 1K, so -k is normally a no-op.

$ POSIXLY_CORRECT=1 ; export POSIXLY_CORRECT
$ ls -s /bin/ls
224 /bin/ls
$ ls -sk /bin/ls
112 /bin/ls

Since the -l output is not defined in terms of block size, ls -l and ls -lk will produce exactly the same output.

$ ls -l /bin/ls
-rwxr-xr-x 1 root root 107124 Feb 8 2011 /bin/ls
$ ls -lk /bin/ls
-rwxr-xr-x 1 root root 105 Feb 8 2011 /bin/ls

Oops. Well, I know they used to produce the same output. And I think they still should, and this is a bug. Anyone?

On Wed, Nov 2, 2011 at 11:01 AM, Paul Eggert egg...@cs.ucla.edu wrote: [snip] Quote what you're replying to, and put your reply in logical order with it. -- Alan Curry
bug#10016: ls -lk is wrong
I mentioned this already in the bug#9939 thread, but nobody replied and it's really a separate issue so here's an independent report. This behavior:

$ ls -l /bin/ls
-rwxr-xr-x 1 root root 107124 Feb 8 2011 /bin/ls
$ ls -lk /bin/ls
-rwxr-xr-x 1 root root 105 Feb 8 2011 /bin/ls

is awful. -k should not have any effect on the ls -l field that reports st_size. It is only supposed to possibly affect the reporting of st_blocks by -s and the "total" line at the start of a full directory listing. I won't make any claims about what --block-size should do, but -k comes from BSD and it should act like BSD. -- Alan Curry
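The st_size/st_blocks distinction underlying this report can be shown directly from stat. A Python sketch (st_blocks is counted in 512-byte units by POSIX convention; the allocated size is filesystem-dependent, so only the byte count is asserted):

```python
import os
import tempfile

# Create a 1000-byte file and compare the two "sizes" stat reports.
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 1000)
os.fsync(fd)
os.close(fd)

st = os.stat(path)
size_bytes = st.st_size           # what ls -l prints: the exact byte count
alloc_bytes = st.st_blocks * 512  # what ls -s reports, scaled by the block size

print(size_bytes)  # 1000, regardless of any block-size setting
os.unlink(path)
```

Only the second number is "derived from st_blocks", so only it should ever be rescaled by -k or --block-size.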
bug#10016: ls -lk is wrong
Jim Meyering writes: I'm thinking of making -k comply, but letting any block-size specification (via --block-size= or an envvar) override that to give the behavior we've seen for the last 9 years.

Wow, look what I stirred up. If it's been like this for 9 years, it's been broken for 9 years. As I said originally, BSD is the standard that matters here. It doesn't matter when or even whether POSIX blessed the -k option. Everywhere except GNU, this is simple. The size field of the ls -l output is not defined in terms of blocks, so the block size setting doesn't affect it. Numbers derived from st_blocks are reported in units of blocks, and others aren't. If you're going to define --block-size to have this effect, then you really need to document it as being an option that does 2 separate things:

1. sets the size of a block
2. alters the definition of the -l format

-- Alan Curry
bug#10021: [PATCH id] Add error-checking on GNU
Ludovic Courtès writes: OTOH, on POSIX-conforming systems (which includes GNU/Linux, so it may be the majority of systems in use), -1 may well be a valid UID/GID.

That's a bizarre statement.

    3.428 User ID
    A non-negative integer that is used to identify a system user. When the
    identity of a user is associated with a process, a user ID value is
    referred to as a real user ID, an effective user ID, or a saved
    set-user-ID.

chown(2) uses (uid_t)-1 and (gid_t)-1 as the "don't change" special values. So does setreuid(2)/setregid(2). setuid(-1) isn't documented as special. Trying it out, it seems to be treated as equivalent to setuid(1). Not what I expected, but it doesn't really support your "-1 is a valid uid" theory. -- Alan Curry
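The "don't change" sentinel is visible from Python as well: os.chown documents -1 as leave-unchanged, mirroring chown(2)'s (uid_t)-1/(gid_t)-1. A quick sketch:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
before = os.stat(path)

# -1 maps to the (uid_t)-1/(gid_t)-1 "don't change" sentinels,
# so this call succeeds and alters nothing.
os.chown(path, -1, -1)

after = os.stat(path)
unchanged = (before.st_uid, before.st_gid) == (after.st_uid, after.st_gid)
print(unchanged)  # True

os.close(fd)
os.unlink(path)
```

That a no-op chown succeeds for any caller is exactly what makes -1 unusable as an ordinary UID at this interface.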
bug#10136: Can't view some strange characters in some of the man pages
Harold Raulston writes: Hi, Could you tell me what encoding I need to use to view your man pages? I've tried Unicode, Western, Western ISO, but still get some unreadable characters in the EXAMPLES (I've just looked at the find and du commands so far): =C3=A2=E2=82=AC=C3=A2=E2=82=AC=E2=84=A2 linuxcommand find1 can't display read BTW, I'm using Win7 Pro English, IE9. All latest updates. I have the same problem in Chrome...

man pages are read with the man program. HTML is Not The Way. [c3 a2 e2 82 ac c3 a2 e2 82 ac e2 84 a2] is what you get when you start with U+2019 RIGHT SINGLE QUOTATION MARK in UTF-8, then misinterpret it as windows-1252 and convert it to UTF-8 again. We were *so* unfortunate when we didn't have all these extra kinds of quotation marks. -- Alan Curry
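The decoding described above can be reproduced mechanically. A Python sketch of one round of the UTF-8 → windows-1252 → UTF-8 mangling (it produces the c3 a2 / e2 82 ac / e2 84 a2 byte groups seen in the garbage):

```python
# U+2019 RIGHT SINGLE QUOTATION MARK, encoded to UTF-8, misread as
# windows-1252, and re-encoded to UTF-8:
original = "\u2019"
once = original.encode("utf-8")                        # e2 80 99
mangled = once.decode("windows-1252").encode("utf-8")  # each byte re-expanded

print(once.hex())     # e28099
print(mangled.hex())  # c3a2e282ace284a2
```

Each original byte (0xe2, 0x80, 0x99) becomes the multi-byte UTF-8 encoding of the windows-1252 character at that position (â, €, ™), which is why the text triples in length instead of displaying.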
bug#10281: change in behavior of du with multiple arguments (commit
Paul Eggert writes: Perhaps this is a bug in POSIX, of course, but there is a good argument for why GNU du behaves the way it does: you get useful behavior that you cannot get easily with the Solaris du behavior.

Remind us again... the useful behavior is that du -s returns a column of numbers next to a column of names, and the numbers don't necessarily have any individual meaning relevant to the adjacent names, but you can add them up manually and get something that is a correct total for the group. Meanwhile, if you wanted the total for the group, you would have used -c and not had to add them up manually. Why not let the -c total be correct *and* the -s individual numbers also be correct for the names they are next to? Like this:

$ mkdir a b ; echo hello > a/a ; ln a/a b/b ; du -cs a b
8	a
8	b
12	total

The fact that the numbers on the left don't add up means there is less redundancy in the output. Each number actually tells me something I can't derive from the others. There is higher information content. This is good, not bad. -- Alan Curry
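The accounting being debated is a dedup-by-inode walk: each (device, inode) pair is charged only once across all arguments, so a hard-linked file lands under whichever argument is scanned first. An illustrative Python model (files only, for brevity; this is a sketch of the behavior, not du's actual code):

```python
import os
import tempfile

def du_blocks(paths):
    """Charge each (st_dev, st_ino) only once across all arguments."""
    seen = set()
    total = 0
    for p in paths:
        for dirpath, dirnames, filenames in os.walk(p):
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    total += st.st_blocks
    return total

root = tempfile.mkdtemp()
a, b = os.path.join(root, "a"), os.path.join(root, "b")
os.mkdir(a)
os.mkdir(b)
with open(os.path.join(a, "a"), "w") as f:
    f.write("hello\n")
os.link(os.path.join(a, "a"), os.path.join(b, "b"))

# The shared file is charged once, so du_blocks([a, b]) equals either
# directory's standalone size -- which is why the per-name numbers
# in du -s output need not add up to the -c total.
print(du_blocks([a, b]) == du_blocks([a]) == du_blocks([b]))  # True
```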
bug#10281: change in behavior of du with multiple arguments (commit
Paul Eggert writes: For example, suppose I have a bunch of hard links that all reside in three directories A, B, and C, and I want to find out how much disk space I'll reclaim by removing C. (This is a common situation with git clones, for example.) With GNU du, I can run du -s A B C and the output line labeled C will tell me how much disk space I'll reclaim. There's no easy way to do this with Solaris du. The straightforward method would be to simply the directory you intend to remove and keep track of the discrepancy between st_nlink and how many links you've seen. I admit that this straightforward method isn't implemented in any standard tool, but your way involves extra work by both du, which must traverse all the other directories which might share files with the target directory; and the user, who must somehow amass that list of directories ahead of time. As a creative improvised use of pre-existing tools it's a good example, but as a justification for an intentional feature, it's just too inefficient. -- Alan Curry
bug#10281: change in behavior of du with multiple arguments (commit
Paul Eggert writes: On 12/16/11 18:36, Alan Curry wrote: The straightforward method would be to simply the directory you intend to remove and keep track of the discrepancy between st_nlink and how many links you've seen. Sorry, I can't parse that. But whatever it is, it sounds like you're talking about what one could do with a program written in C, not with either GNU or Solaris du.

Yes, I'm saying that du is just not the tool for this job, although you've managed to twist it to fit. The "predict free space after rm -rf foo" operation can be done without searching other directories and without requiring the user to specify a list of other directories that might contain links. What you do with du is kludgy by comparison.

[...] Of course I'd never want to do that in an actual link farm: it's tricky and brittle and could mess up currently-running builds. But the point is that GNU du is not being inefficient here, any more than Solaris du is.

By comparison to a proper tool which doesn't do any unnecessary traversals of extra directories, your use of du is slow and brittle (if the user forgets an alternate directory containing a link, the result is wrong) and has only the slight advantage of already being implemented. Here's a working outline of the single-traversal method. I wouldn't suggest that du should contain equivalent code. A single-purpose perl script, even without pretty output formatting, feels clean enough to me. Since I've gone to the trouble (not much) of writing it, I'll keep it as ~/bin/predict_rm_rf for future use.

#!/usr/bin/perl -W
use strict;
use File::Find;

@ARGV or die "Usage: $0 directory [directory ...]\n";

my $total = 0;
my %pending = ();
File::Find::find({wanted => sub {
    my ($dev,$ino,$nlink,$blocks) = (lstat($_))[0,1,3,12];
    if(-d _ || $nlink==1) { $total += $blocks; return; }
    if($nlink == ++$pending{$dev.$ino}) {
        delete $pending{$dev.$ino};
        $total += $blocks;
    }
}}, @ARGV);
print "$total blocks would be freed by rm -rf @ARGV\n";
__END__

-- Alan Curry
bug#10349: tail: fix --follow on FhGFS remote file systems
Bob Proulx writes: Jim Meyering wrote: Are there so many new remote file systems coming into use now? That are not listed in /usr/include/linux/magic.h? The past can always be enumerated. The future is always changing. It isn't possible to have a complete list of future items. It is only possible to have a complete list of past items. The future is not yet written. Between past and future is the present, i.e. the currently running kernel. Shouldn't it return an error when you use an interface that isn't implemented by the underlying filesystem? Why doesn't this happen? -- Alan Curry
bug#10355: Add an option to {md5,sha*} to ignore directories
Bob Proulx writes: severity 10355 wishlist tags 10355 + notabug wontfix moreinfo thanks Erik Auerswald wrote: Gilles Espinasse wrote: I was using a way to check md5sum on a lot of files using

for myfile in `cat ${ALLFILES}`; do if [ -f /${myfile} ]; then md5sum /$myfile >> ${ALLFILES}.md5; fi; done

... You could use find $DIR -type f to list regular files only. Yes. Exactly. The capability you ask for is already present.

Do you suppose we can convince GNU grep's maintainer to follow this philosophy?

$ mkdir d
$ touch d/foo
$ grep foo *
$

It opens and reads, gets EISDIR, and intentionally skips printing it. Grr. But wait, there's a -d option with 3 alternatives for what to do with directories! ...and none of the choices is just "print the EISDIR" so I'll know if I accidentally grepped a directory. -- Alan Curry
bug#10363: /etc/mtab - /proc/mounts symlink affects df(1) output for
jida...@jidanni.org writes:

Filesystem           1K-blocks   Used Available Use% Mounted on
rootfs                 1071468 287940    729100  29% /
/dev/disk/by-uuid/551e44e1-2cad-42cf-a716-f2e6caf9dc78
                       1071468 287940    729100  29% /

(I'm replying only on the issue of the duplicate mount point. Someone else can tackle the long ugly name.) The one with rootfs as its device is the initramfs, which you automatically get with all recent kernels. Even if you aren't using an initramfs, there's an empty one built into the kernel which gets mounted as the first root filesystem. The real root gets mounted on top of that.

So this is a special case of a general problem with no easy solution: what should df do when 2 filesystems are mounted at the same location? It can't easily give correct information for both of them, since the later mount obscures the earlier mount from view. If there's a way for df to get the correct information for the lower mount, I don't know what it would be. If you have a process with a leftover cwd or open fd in the obscured filesystem, you can use that. But generally you won't.

But maybe we could do better than reporting incorrectly that the lower mount has size and usage identical to the upper mount! At least df could print a warning at the end if it has seen any duplicate entries. Perhaps there is some way it could figure out which one is on top, and print a bunch of question marks as the lower mount's statistics. If df is running as root, it might be able to unshare(2) the mount namespace, unmount the upper level, and then statfs the mount point again to get the correct results for the lower level. That won't work in all cases (even in a private namespace you can't unmount the filesystem containing your own cwd) and it does nothing for you if you're not root, but still... it would be a cool bonus in the cases where it does work. As a special case, rootfs should probably be excluded from the default listing, since the initramfs is not very interesting most of the time.
It could still be shown with the -a option, although it would always have the wrong statistics. Or if you really want to be impressive, default to showing the initramfs if and only if it is the only thing mounted on / - so you can run df within the initramfs before the real root is mounted and get the right result. Or... (brace yourself for the most bold idea yet)... can you imagine a kernel interface that would *cleanly* give access to obscured mount points? Comments on any of the above? Do the BSDs have any bright ideas we can steal, or is their df as embarrassingly bad at handling obscured mount points as ours? -- Alan Curry
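The warn-on-duplicates idea suggested above could start from nothing more than a scan of the mount table. An illustrative Python sketch, driven by sample /proc/self/mounts-style text rather than the real file (the sample devices and the warning wording are invented for the example):

```python
# Detect mount points that appear more than once; the later entry is
# the upper (visible) mount, the earlier one is obscured.
def duplicate_mount_points(mounts_text):
    seen = {}
    dups = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        dev, mnt = fields[0], fields[1]
        if mnt in seen:
            dups.append((seen[mnt], dev, mnt))  # (lower dev, upper dev, path)
        else:
            seen[mnt] = dev
    return dups

sample = """\
rootfs / rootfs rw 0 0
/dev/sda1 / ext4 rw,errors=remount-ro 0 0
proc /proc proc rw 0 0
"""

for lower, upper, mnt in duplicate_mount_points(sample):
    print(f"warning: {mnt} is mounted twice ({upper} obscures {lower})")
```

Getting *correct statistics* for the obscured mount would still need one of the heavier tricks discussed above (a stashed fd, or unshare plus unmount); this only identifies which lines of df output cannot be trusted.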
bug#10456: bug in du
Lubomir Mateev writes: root@thor:/# fdisk -l Disk /dev/hda: 15.0 GB, 15000330240 bytes ... 9.5T /usr/lib

I'm going to guess filesystem corruption causing a file in /usr/lib (not a subdirectory) to have the wrong block count. Do ls -Ssr /usr/lib and see if you get a big surprise at the end. Unmount and fsck it to fix, if I'm right. -- Alan Curry
bug#11246: Is this a bug in tee?
Adrian May writes:

ad@pub:~/junk$ echo abcde | tee >(tr a 1) | tr b 2
a2cde
12cde

I'd have expected 1bcde instead of 12cde. It seems like the tr b 2 is acting early on the stream going into tr a 1. This is a ubuntu server 10.04 machine. Adrian.

The shell sets up the pipeline, and your shell is doing it stupidly. With zsh, you'd get the correct result:

% echo abcde | tee >(tr a 1) | tr b 2
a2cde
1bcde

ksh and bash recently copied the process substitution feature from zsh, and they haven't got it right yet. -- Alan Curry
bug#11667: problem with command date
amanda sabatini writes: Hi, The following command does not work with these specific dates: 1986-10-25; 1987-10-25; 1989-10-15; 1992-10-25; 1991-10-20; 1995-10-15; 2006-11-05. date +%d --date=1986-10-25

The date command never actually works on dates alone. There is always a time attached to its calculations, even when it's not necessary for the output format you requested. When you don't specify a time with the --date option, the command guesses that you meant 00:00:00. That turns out to be a bad guess in this case, since 00:00:00 didn't exist on those days. All of those are dates on which the Brazil/East time zone shifted into daylight saving time, jumping from 23:59:59 the previous day to 01:00:00 on the day you mentioned. You can avoid this problem by adding 12:00:00 to the requested date.

$ TZ=Brazil/East date +%d --date=1986-10-25
date: invalid date `1986-10-25'
$ TZ=Brazil/East date +%d --date="1986-10-25 12:00:00"
25

-- Alan Curry
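The nonexistent-midnight effect shows up in any timezone-aware library, not just date(1). A Python sketch using zoneinfo (America/Sao_Paulo is the modern tzdata name for the old Brazil/East alias; this assumes the system's tzdata is installed and carries the 1986 transition):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/Sao_Paulo")

# Midnight on 1986-10-25 never existed in this zone: the clock jumped
# from 23:59:59 straight to 01:00:00. A round trip through UTC exposes
# that the requested wall time can't be represented.
naive = datetime(1986, 10, 25, 0, 0)
wall = naive.replace(tzinfo=tz)
round_trip = wall.astimezone(timezone.utc).astimezone(tz)
print(round_trip.replace(tzinfo=None) != naive)  # True: the wall time didn't survive

# Noon is safe, which is why appending 12:00:00 to the date fixes things.
noon = datetime(1986, 10, 25, 12, 0).replace(tzinfo=tz)
print(noon.astimezone(timezone.utc).astimezone(tz).replace(tzinfo=None) == noon.replace(tzinfo=None))  # True
```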
bug#11950: cp: Recursively copy ordered for maximal reading speed
Michael writes: Hello, After coding several backup tools there's something in my mind since years. When 'cp' copies files from magnetic harddisks (commonly called after their adapter or bus - SATA, IDE, and the like, i'm not talking about solid state) recursively, it seems to pick up the files in 'raw' order, just as the disk buffer spit them out (like 'in one head move'). Or so. It does not resemble any alphabetical order, for example, it does not even stay within the same parent folder (flingering hither and forth, as the files come in). [grumble at User-Agent: claws-mail.org: One line per paragraph isn't good mail formatting!] It's called directory order. It used to be simply order of creation of files, with deletions creating gaps that could be filled by later creations with same-length or shorter names. But on most new filesystems, directories are stored in a non-linear structure so that lookups in a large directory don't have to scan through every name. For ext2/ext3/ext4, run tune2fs -l on the block device and look for the dir_index option. If you're copying files onto a filesystem with dir_index enabled, the order in which cp creates them should have little effect on the directory's layout afterward. If you're not using dir_index on the destination filesystem, there's your problem! Enable dir_index and all directory lookups will be fast. None of this has anything to do with where the actual data blocks of the file will be allocated. There's no way to control that. If you think that the second file created is going to be adjacent to the first file created... that's never been guaranteed. Filesystem block allocators are way more mysterious than that. If you really think there's something to be gained here, prove it: start with a directory with a lot of files but no subdirectories. 
Do an alphabetical-order copy like this:

$ mkdir other_directory ; cp ./* other_directory

(The glob returns the names in sorted order so this gives you the creation order you want, unlike cp -r.) Then get it all out of cache so the read test will hit the disk as much as possible:

$ sync ; echo 3 > /proc/sys/vm/drop_caches

And read back the files:

$ cd other_directory ; time cat ./* > /dev/null

Now repeat, but using cp -r to create the other directory so the files get copied in the source directory order. And repeat again, but using

$ find . -type f -exec cat '{}' + > /dev/null

instead of the cat ./* (the glob will cat the files in sorted order, the find will use directory order). If there are any significant differences in the times, and dir_index is enabled, you're onto something. With dir_index disabled, you should get worse times all around, but not a lot worse if the files are big enough that the time spent reading their contents overshadows the time spent on directory lookups. -- Alan Curry
bug#12019: join command - wrong column moved to start of line with
Eric Blake writes: On 07/21/2012 12:20 PM, Jean-Pierre Tosoni wrote: Hello Maintainer, I am using join v8.5 from debian squeeze. Now, the command: join -v 2 -1 2 -2 3 a b produces

==== wrong output ====
zzz222 zzz111 keyZ zzz333

I tried reproducing this with coreutils 8.17:

$ cat a b
axx111 keyX axx222
ayy111 keyY ayy222
xxx111 xxx222 keyX xxx333
zzz111 zzz222 keyZ zzz333
$ join -v2 -1 2 -2 3 a b
keyZ zzz111 zzz222 zzz333

but I get the expected order. I don't see a specific mention of a fix for this in NEWS, so I have to wonder if this might be a bug in a debian-specific patch. Can you do some more investigating, such as compiling upstream coreutils to see if the problem still persists for you?

It's not a Debian-specific problem. I can reproduce the bug with unaltered coreutils 8.9. It was apparently fixed by accident as a side effect of some other work on the join program.

commit d4db0cb1827730ed5536c12c0ebd024283b3a4db
Author: Pádraig Brady p...@draigbrady.com
Date: Wed Jan 5 11:52:54 2011 +

    join: add -o 'auto' to output a constant number of fields per line

d4db0cb1827730ed5536c12c0ebd024283b3a4db can be cherry-picked and applied to older coreutils to fix the bug. I tested this with upstream 8.9 and Debian's 8.5; both applied with fuzz but worked correctly. -- Alan Curry
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is
Bob Proulx writes: Jim Meyering wrote: Could you be thinking of some other rm? Coreutils' rm has rejected that for a long time: ... POSIX requires rm to reject any attempt to delete an explicitly specified . or .. argument (or any argument whose last component is one of those): Hmm... Wow. I decided to check HP-UX 11.11, a now rather old release from twelve years ago in 2000, the oldest easily available to me, and got this:

$ /usr/bin/rm -rf .
rm: cannot remove .. or .

So I guess GNU coreutils is in good company with traditional Unix systems! It has definitely been that way for a long time.

Linux has the ability to actually remove a directory that is empty but still referenced as the cwd of some process. This ability is non-traditional (my fuzzy memory says it showed up some time in the 2.2 or 2.4 era). It's worth considering whether this change should be reflected by a relaxation of rm's traditional behavior. rm -rf $PWD, meaning basically the same thing as rm -rf ., works, and leaves you in a directory so empty that ls -a reports no . or .. entries, and no file can be created in the current directory. (open and stat and chdir still work on . and .. though. They're magic.) -- Alan Curry
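The empty-ghost-directory state is easy to reproduce from Python on Linux. A sketch using os.rmdir on an empty cwd (the same kernel operation rm would perform last; Linux-specific, since traditional Unices refuse to remove a directory in use as a cwd):

```python
import errno
import os
import shutil
import tempfile

# Linux-specific: remove a directory that is still our cwd.
base = tempfile.mkdtemp()
doomed = os.path.join(base, "doomed")
os.mkdir(doomed)
os.chdir(doomed)
os.rmdir(doomed)                 # succeeds on Linux even though it's our cwd

ghost_listing = os.listdir(".")  # []: no entries, not even "." or ".."
try:
    open("x", "w")
    create_errno = None
except OSError as e:
    create_errno = e.errno       # ENOENT: nothing can be created here

print(ghost_listing, create_errno == errno.ENOENT)

os.chdir(base)                   # escape the ghost directory before cleanup
shutil.rmtree(base)
```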
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is
Jim Meyering writes: Alan Curry wrote: rm -rf $PWD, meaning basically the same thing as rm -rf ., works, and leaves If you use that, in general you would want to add quotes, in case there are spaces or other shell meta-characters: rm -rf "$PWD"

Well, when I do it I'm in zsh, which has fixed that particular Bourne shell design error. -- Alan Curry
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
Eric Blake writes: Indeed, reading the original V7 source code from 1979: http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/rm.c [...] shows that _only_ .. was special; . was attempted in-place and didn't fail until the unlink(".") after the directory itself had been emptied. It wasn't until later versions of the code that . also became special.

I also decided to look around there, and found some of the turning points: Up to 4.2BSD, the V7 behavior was kept. (http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/bin/rm.c) rm -rf . was forbidden in 4.3BSD (26 years ago). http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD/usr/src/bin/rm.c The removal of dir/. (and dir/..) was not forbidden until Reno. http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-Reno/src/bin/rm/rm.c

	cp = rindex(arg, '/');
	if (cp == NULL)
		cp = arg;
	else
		++cp;
	if (isdot(cp)) {
		fprintf(stderr, "rm: cannot remove `.' or `..'\n");
		return (0);
	}

Maybe the classical behavior stuck around longer in the more SysV-ish Unices. The Ultrix-11 3.1 tree on TUHS from 1988 has a rm that looks very much like V7, but I can't find anything to compare it to until OpenSolaris. Did POSIX force BSD to change their rm in 1988? I think it's more likely that POSIX simply documents a restriction that BSD had already added. Either way the latest POSIX revisions certainly can't be blamed. -- Alan Curry
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
Linda Walsh writes: So far no one has addressed when the change in '-f' went in NOT to ignore the non-deletable dir . and continue recursive delete,

In the historic sources I pointed out earlier (4.3BSD and 4.3BSD-Reno) the -f option is not consulted before rejecting removal of . so I don't think the change you're referring to is a change at all. -f never had the effect you think it should have. -- Alan Curry
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
Linda Walsh writes: Alan Curry wrote: Linda Walsh writes: So far no one has addressed when the change in '-f' went in NOT to ignore the non-deletable dir . and continue recursive delete, In the historic sources I pointed out earlier (4.3BSD and 4.3BSD-Reno) the -f option is not consulted before rejecting removal of . so I don't think the change you're referring to is a change at all. -f never had the effect you think it should have. If I was using BSD, I would agree. --- But most of my usage has been on SysV compats Solaris, SGI, Linux, a short while on SunOS back in the late 80's, but that would have been before it changed anyway.

SGI is dead, Sun is dead, the game's over, we're the winners, and our rm has been this way forever.

For all I know it could have been a vendor add-in, but that's not the whole point here. Do you want to support making . illegal for all gnu utils for addressing content?

I don't think "addressing content" is a clearly defined operation, no matter how many times you repeat it. Consistency between tools is a good thing, but consistency between OSes is also good, and we'd be losing that if any change was made to GNU rm's default behavior. Even OpenSolaris has the restriction: see lines 160-170 of http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/rm/rm.c

I think you'll find many more people against the idea and wondering why it's in 'rm' and why -f doesn't really mean ignore all the errors it can and why that one should be specially treated. Of course they also might wonder why rm doesn't follow the necessary algorithm for deleting files -- and delete contents before dying issuing an error for being unable to delete a parent. Which might also raise why -f shouldn't be usable to silence permission or access errors as it was designed to.

Look, I agree it's not logical or elegant. But we have a standard that all current Unices are obeying, and logic and elegance alone aren't enough to justify changing that.
A new option that you can put in an alias is really the most realistic goal. -- Alan Curry
bug#12421: Issue of the cp on Ubuntu 10.10
owen.z...@alitech.com writes: Dear Sir, A strange issue happens when I use the cp tool on two directory. [snip - summary: after recursive cp, some file has the wrong size] I don't have any good ideas about the cause of this problem, but since I didn't see anyone else replying, I'll suggest some investigation techniques. Run cp --version so we know how far back in history we should look for similar bugs. cmp the two versions of the file to see if the short one is just truncated, or if there are other differences. Run df -T on the source and destination. If it's reproducible, run strace -o cptrace cp ... and publish the cptrace for others to look at. (If the files being copied are private, the names and contents will be in the trace so you will have to inspect it yourself.) -- Alan Curry
bug#12478: cat SEGV when I press Ctrl-Alt-SysRq-1 on text console
Rafal W. writes: $ cat /dev/zero ^\Quit (core dumped) Steps to reproduce: 1. Switch to any text console (it doesn't happen in X). 2. Login. 3. Run: cat /dev/zero 4. Press: Ctrl-Alt-SysRq-1 (or any number, except letters :) What's that supposed to do? Ctrl isn't normally used with SysRq. 5. You'll see: ^\Quit (core dumped)

The ^\ character generates a QUIT signal (the same way ^C generates INT), and death with core dump is the default response to SIGQUIT. Ctrl-4 is an alternate way of typing Ctrl-\ so this is all perfectly normal for a key combination involving Ctrl and 4. By adding SysRq into the mix I don't know what exactly you accomplished. Maybe you confused the keyboard. Most keyboards don't have every key wired separately, and weird combinations can send events for keys that weren't pressed. To investigate further, try running 'stty -isig' to disable signal generation, then 'cat > /dev/null' or maybe 'od -c' and type your key combinations. Ctrl-D should still work for EOF to get you out; EOF is not a signal, so it's not disabled by stty -isig.
bug#12494: 0 exit status even when chmod fails
Georgiy Treyvus writes: Finally I had him show me the mount options of the relevant partitions. Many I recognized. Some I did not. I started researching those I did Did you notice this one?: Mount options for fat (Note: fat is not a separate filesystem, but a common part of the msdos, umsdos and vfat filesystems.) [...] quiet Turn on the quiet flag. Attempts to chown or chmod files do not return errors, although they fail. Use with caution! If you're getting the quiet behavior without the quiet mount option, I'd say that's a kernel bug. -- Alan Curry
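[Archive note: the effective mount options — including "quiet", if it is in force — can be read straight out of /proc/mounts on any Linux system. The mount point below is just an example; substitute the FAT partition from the report.]

    # Print filesystem type and effective options for one mount point; a
    # vfat mount carrying the quiet flag would show "quiet" in the options.
    awk '$2 == "/" { print $3, $4 }' /proc/mounts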
bug#12494: 0 exit status even when chmod fails
Sven Joachim writes: On 2012-09-24 08:37 +0200, Alan Curry wrote: Georgiy Treyvus writes: Finally I had him show me the mount options of the relevant partitions. Many I recognized. Some I did not. I started researching those I did Did you notice this one?: Mount options for fat (Note: fat is not a separate filesystem, but a common part of the msdos, umsdos and vfat filesystems.) [...] quiet Turn on the quiet flag. Attempts to chown or chmod files do not return errors, although they fail. Use with caution! If you're getting the quiet behavior without the quiet mount option, I'd say that's a kernel bug. Actually, it's the default unless you're using Linux 2.6.25. This kernel reported an error to the caller, but since that broke rsync[1,2], 2.6.26 reverted to the previous behavior of silently ignoring chmod attempts which do not work on FAT filesystems[3]. This bug report should probably be closed. If the mount man page disagrees with the kernel, it's still a bug in the man page at least. (Also, the rest of the world needs to work around extra stupidity because of rsync?) -- Alan Curry
bug#12478: cat SEGV when I press Ctrl-Alt-SysRq-1 on text console
Rafal W. writes: But if Control-4 is sending QUIT signal, why: Control-1 does kill the process? I've checked again and actually it's not even about the number. When I press only: Control-SysRq it kills the process as well. Sometimes it happens on press, sometimes on release.

Is your SysRq key also the PrtSc key? It will be if your keyboard is a descendant of the IBM PC/AT design. With Alt, it's the SysRq key. Without Alt, it's the PrtSc key. So if your Control-SysRq combination doesn't include Alt, then it's really Control-PrtSc and you should call it that instead of Control-SysRq, which is just confusing. For other keys, the interpretation of modifiers (including Alt) is done in software. The PrtSc/SysRq key is the only one in which a distinction is made in hardware: PrtSc and SysRq are different scancodes. This specialness probably influenced the decision to use SysRq as a magic key for talking to the Linux kernel.

Now, on to why you got your SIGQUIT. Well, the default keymap for the Linux console generates ^\ when you press PrtSc. That's not a reason, that's just a fact. I don't know the reason. The Ctrl-4 thing is, I believe, a matter of accurate vt100 emulation. At least it's part of a neat pattern. Ctrl-2 through Ctrl-8 generate all the control codes that aren't ^A through ^Z alphabeticals, in numerical order:

    key     byte  echoprt  ASCII name
    Ctrl-2    0     ^@     NUL
    Ctrl-3   27     ^[     ESC
    Ctrl-4   28     ^\     FS
    Ctrl-5   29     ^]     GS
    Ctrl-6   30     ^^     RS
    Ctrl-7   31     ^_     US
    Ctrl-8  127     ^?     DEL

Notice that one of them, Ctrl-6 for ^^, actually makes sense: Ctrl-^ is Ctrl-Shift-6, after all. Perhaps the others were simply built around that one as a logical extension.

Oops, I got sidetracked. Why does PrtSc generate ^\ on the Linux console? I don't know. Looking at the historical source code, it seems that it has been this way since Linux-0.99.10 (June 7, 1993), in which the keyboard driver was massively overhauled to support loadable keymaps.
In 0.99.9 there is this:

    /* Print screen key sends E0 2A E0 37 and puts the
       VT100-ESC sequence ESC [ i into the queue, ChN */
    puts_queue("\033\133\151");

So in conclusion, the PrtSc ^\ mapping snuck in as part of a large patch that wasn't supposed to change any defaults, but did. Accident... or sabotage? Insert your conspiracy theory here. History says Risto Kankkunen did the loadable keymap patch, so that's who to blame. ChN appears to be:

    * Some additional features added by Christoph Niemann (ChN), March 1993

Whatever the reason behind this annoying ^\, fixing it isn't hard:

    # It's too easy to hit PrtSc by accident. mapping it to ^\ hurts!
    loadkeys <<'EOF'
    keycode 99 = VoidSymbol
    EOF

I've had that in my system startup for a long time. Actually it's a bit more complicated since I have a few other keys I like to remap, but the comment is exactly as I wrote it at least 15 years ago. (I don't hit PrtSc by accident much since I got my Happy Hacking keyboard!) VoidSymbol makes the key do nothing at all when pressed. I suppose you could map it to ESC [ i like it used to be in 0.99.9 if you feel like you must right this historical wrong. The remapping has no effect on the usability of the magic SysRq functions, because they magically bypass the remapping table.
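[Archive note: the octal escapes in that puts_queue call decode to exactly the ESC [ i sequence the kernel comment describes, which is easy to verify with od.]

    # \033, \133 and \151 are octal for ESC, '[' and 'i' -- the VT100
    # "print screen" escape sequence mentioned in the 0.99.9 comment.
    printf '\033\133\151' | od -An -c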
bug#12478: cat SEGV when I press Ctrl-Alt-SysRq-1 on text console
Rafal W. writes: So in example if I want to check all currently held Locks with SysRq-D (which doesn't work anyway), so: When I press SysRq-D, I've KSnapshot popping up. In the text console it doesn't work at all. ksnapshot sounds like something that might respond to a PrtSc keypress. This is a sign that you aren't using Alt, so what you've really done is PrtSc-D. Didn't I tell you already to stop using SysRq to describe key combinations that don't include Alt? WITHOUT ALT IT IS NOT A SYSRQ KEY. Got that yet? Reread it until you do. When I press Control-SysRq-D, my session is getting logout. Well, Ctrl-D is EOF and PrtSc+D is a meaningless combination (as meaningless as pressing D and Q at the same time, it's anyone's guess which will take precedence). When I press Control-Alt-SysRq-D my processes are killed. Too many keys there, I can't guess what they're all doing. Get rid of the Control. And make sure your kernel has CONFIG_LOCKDEP, otherwise the SysRq+D function is disabled. Also, based on the Subject line, you think SEGV is a synonym for core dump. Stop thinking that. Nothing segfaulted. SIGSEGV is one of many signals that can cause a core dump. SIGQUIT is another one. -- Alan Curry
bug#13912: Feedback on coreutils 8.13
the generic --with-PACKAGE and --without-PACKAGE lines are included in the help output. Their presence seems to imply that the list of valid values for PACKAGE is open-ended, but then you immediately get a complete list. I think the help would be more helpful if those top 2 lines were deleted. The same goes for the enable/disable section. (I also think the distinction between with/without and enable/disable is something that isn't helpful to anyone but the people maintaining the autoconf scripts. If there was an upper layer that made --enable an alias for --with and --disable an alias for --without, the users would probably be grateful for it.) So, the specific things you tried were wrong because: Initially tried ./configure --enable-md5sum but this gave configure: WARNING: unrecognized options: --enable-md5sum md5sum isn't a feature you can enable/disable (--enable-FEATURE). The help output lists them all. and proceeded anyway. When `make' was run, many errors were reported, concerning expr.c (see next section). Since these concern the expr command which was not really needed, tried repeating ./configure using --without-PACKAGE ./configure --without-expr expr isn't a package you can with/without (--without-PACKAGE). The help output lists them all. It's great that after all this trouble you've held on to your optimistic belief that there must be some way of configuring just the program you want, if only you can just find the right syntax. Sadly, it just isn't so. Ideally, make src/md5sum or make -C src md5sum would work, but the Makefiles in coreutils aren't quite good enough for it to work out of the box. Some dependencies are missing. But if your first attempt got far enough to blow up on expr.c, make -C src md5sum might actually work afterward. 3. Some problems with configure Retried make, redirecting the output to a log file. The errors in expr were more extensive than realised before. 
The first error is expr.c:54:18: error: gmp.h: No such file or directory It seems that configure has made an incorrect decision about the availability of gmp, which is not available (but is placed ready to be installed along with the gcc sources. It had previously been established that it was a Prerequisite). Noted that config.status has D[HAVE_GMP]= 1 It sounds like configure found your not-yet-installed gmp and tried to use it, with disastrous results. This is the part of the bug report where you should include your config.log, so we can see exactly how that HAVE_GMP became 1. And I won't be surprised if it turns out to be a bug that's already fixed in 8.21. and the expr.c source tests this. It seems that configure has incorrectly decided that gmp is available, and expr.c fails to find the header, and all other errors arise from this. Since the expr.c source allowed for the test failing, it seemed possible to proceed without gmp. So config.status was modified so that D[HAVE_GMP]= 0 Editing a config.status by hand? That sure shows bravery and determination. I'm quitting here. The rest of the story needs to be read by someone who actually knows MacOS. -- Alan Curry
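[Archive note: there are less drastic ways to override a misdetection than hand-editing config.status. This is a sketch only: ac_cv_header_gmp_h is the conventional Autoconf cache variable for an AC_CHECK_HEADERS([gmp.h]) probe, and it is an assumption that this coreutils release uses that probe — grep config.log to confirm the exact variable name before relying on it.]

    # Re-run configure telling it gmp.h is absent, instead of patching the
    # generated config.status afterward.  The cache-variable name below is
    # an assumption (standard for an AC_CHECK_HEADERS([gmp.h]) test);
    # config.log shows the name this release actually uses.
    ./configure ac_cv_header_gmp_h=no
    # Then try building only the program that is actually wanted:
    make -C src md5sum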