Re: bsearch utility

2005-07-15 Thread Jim Meyering
Paul Eggert [EMAIL PROTECTED] wrote:
 I like the idea of implementing look in coreutils, and doing it
 right.  look was in Unix Version 7, and it is a handy utility.
...
 I'd prefer that coreutils look have long options that are consistent
 with sort, even if its short options are different for historical
 reasons.

I too like the idea -- and agree about long options.


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: new coreutil? shuffle - randomize file contents

2005-07-15 Thread Jim Meyering
Frederik Eaton [EMAIL PROTECTED] wrote:
 Attached is a second patch, which contains a ChangeLog entry and some
 formatting changes as requested by Jim.

Can you update your patch to be relative to coreutils-CVS,
  http://savannah.gnu.org/cvs/?group=coreutils
rather than to the aging 5.2.1?

Also, formatting should adhere to the gnu coding standards.
A good way to start is by using GNU indent with the default
settings.

shred also tries to obtain a random seed.
It'd be nice (eventually) to have both programs
use the same mechanism.

Please use FIXME rather than XXX to mark bits of
code that need more attention.

Regarding this:
   error (SORT_FAILURE, 0, _("%s: invalid field specification `%s'"),
          _(msgid), spec);
-  abort ();
+  abort (); // XXX is this ever reached? need comment if it is

That code is never reached, because the preceding error
call invokes `exit (SORT_FAILURE)'.  The abort is to pacify
gcc -Wall.
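
A minimal sketch of the same pattern, for illustration only.  The names
report and parse_digit are hypothetical and are not from sort.c; report
stands in for error (), which exits when given a nonzero status.  The
compiler cannot see that, so without the trailing abort () it may warn
that control reaches the end of the function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for error (): print MSG and, if STATUS is
   nonzero, exit with that status.  Nothing tells the compiler that
   this function may not return.  */
static void
report (int status, char const *msg)
{
  assert (msg != NULL);
  fprintf (stderr, "demo: %s\n", msg);
  if (status != 0)
    exit (status);
}

/* Hypothetical caller: return the value of the decimal digit C,
   or exit with a diagnostic if C is not a digit.  */
static int
parse_digit (char c)
{
  if ('0' <= c && c <= '9')
    return c - '0';
  report (EXIT_FAILURE, "invalid digit");
  /* Not reached, since report exits when its status is nonzero.
     The call only pacifies the compiler.  */
  abort ();
}
```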

Regarding this:
+  char *randseed = 0;
please use `NULL', instead:
  char *randseed = NULL;

I've just noticed that this would make sort
use a file in $HOME (~/.gnupg/entropy).  That
dependency on $HOME would be a first for coreutils.
I'm reluctant to pull in all of that EGD-related code
just to get a default seed in some environments.
Do any of you have a feel for how important (or how
widely available) EGD support is in practice?




Re: bsearch utility

2005-07-15 Thread Sorav Bansal

Jim Meyering wrote:


I too like the idea -- and agree about long options.
 

Great. I would like to send you my implementation for code review. I 
would be able to send you the implementation by sometime next week.


Sorav




Re: new coreutil? shuffle - randomize file contents

2005-07-15 Thread Paul Eggert
Thanks for working on this.  You've gotten further than anyone else
has!  Some quick comments:

Frederik Eaton [EMAIL PROTECTED] writes:

 Is there a script for making a patch with all the right files excluded
 by the way?

Not yet.  That's on the list of things to do.  The fix will be to remove
those files from the repository, and have a bootstrap script that generates
them.  Then cvs diff will do the right thing.

 shred also tries to obtain a random seed.
 It'd be nice (eventually) to have both programs
 use the same mechanism.

 I did not realize this. Yes, perhaps it would.

That sounds right to me as well.

 I do not have any idea, but it would be good to have random seed
 functionality somewhere standard-ish. It would be nice if it were in
 libc but that's an area I don't want to tackle.

It's a reasonable thing to put in coreutils/lib.  We can then put it
into gnulib.

 Also, I didn't personally think it was necessary to do anything but
 look at the time and process ID for a default seed - I only added the
 /dev/random stuff because Paul Eggert seems to think that security
 is very important here

If sort -R is used for anything security-relevant, then it will be
important, yes.

 - and once /dev/random is referenced then I figured compatibility
 with other kinds of systems would become an issue

No, the method used by shred is good enough.  We don't need to worry
about EGD any more.  It's an obsolete hack.  You might be better off
starting with the code in shred.
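
For illustration, a minimal sketch of seeding without EGD.  This is not
the code in shred; get_seed is a hypothetical name, and the fallback mix
of time and process ID is an assumption that is unsuitable for anything
security-relevant.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Fill SEED[0..LEN-1].  Return 1 if every byte came from /dev/urandom,
   0 if the weak time-and-pid fallback had to supply some of them.  */
static int
get_seed (unsigned char *seed, size_t len)
{
  size_t got = 0;
  int fd;

  assert (seed != NULL);
  fd = open ("/dev/urandom", O_RDONLY);
  if (0 <= fd)
    {
      while (got < len)
        {
          ssize_t n = read (fd, seed + got, len - got);
          if (n <= 0)
            break;
          got += n;
        }
      close (fd);
    }
  if (got == len)
    return 1;

  /* Weak fallback: stretch the time and process ID over the rest.  */
  {
    unsigned long int mix = ((unsigned long int) time (NULL)
                             ^ ((unsigned long int) getpid () << 16));
    for (; got < len; got++)
      {
        mix = mix * 1103515245UL + 12345UL;
        seed[got] = (mix >> 16) & 0xff;
      }
  }
  return 0;
}
```

A caller that needs real unpredictability could treat a return value of
0 as an error instead of accepting the fallback.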

 I would like not to spend much more time on this, by the way,

Alas, more work needs to be done.  Perhaps we can recruit someone else
to look into it?  I will put it on my list of things to do as well,
but it's a pretty long list

+   * src/sort.c: add functions init_salt, get_hash; 'init_salt' is
+   used to provide seeding, and 'get_hash' calculates the hash value
+   by which a key is sorted

Please use the ChangeLog style recommended by the GNU coding standards
<http://www.gnu.org/prep/standards/html_node/Change-Logs.html>.
For example, the first line should start with
"* src/sort.c (init_salt, get_hash):".
It's important to log every externally-visible identifier whose meaning
has changed.

Also, each external symbol (function, macro, variable) should have a comment
explaining what it does.  Currently I'm at a bit of a loss trying to
figure out what things do, so my comments will be limited.

+#ifndef _CHECKSUM_H
+#define _CHECKSUM_H 1
+
+#include <sys/types.h>
+#include <config.h>
+#include "sha1.h"
+#include "md5.h"

This seems overkill to me.  sort is just using md5, right?

+  int len = strlen(str);

There should be a space before the (.  There are several other instances
of that.

+  return (len*2) >= ops->bits;

Please put spaces around the *.  The parentheses aren't needed here.

Also, we prefer <= to >=.  This is a programming style rule that I
learned from D. Val Schorre.  It's an application of Leibnitz's
criterion for notation: textual order should reflect numeric order.
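
A small sketch of that rule, with hypothetical names (in_range and
covers_bits are not from the patch).  Using only < and <= keeps the
operands in increasing order from left to right, so a range test reads
like the inequality it encodes.

```c
#include <assert.h>
#include <stddef.h>

/* Return nonzero if LO <= X <= HI.  Only < and <= are used, so the
   operands appear left to right in increasing numeric order.  */
static int
in_range (long int lo, long int x, long int hi)
{
  assert (lo <= hi);
  return lo <= x && x <= hi;
}

/* Hypothetical rewrite of a "len * 2 >= bits" test in the same style:
   the smaller quantity goes on the left.  */
static int
covers_bits (size_t len, size_t bits)
{
  return bits <= len * 2;
}
```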

+  --seed    use specified seed for random number generator\n\

--seed has an operand, so it should be mentioned here.

+  void *ctx = alloca(digops.ctx_size);

What are the bounds on digops.ctx_size?  If it can be large (more than
a few kilobytes) then we shouldn't rely on alloca, due to stack-overflow
detection issues.
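
One common alternative, sketched here under assumptions: use a
fixed-size stack buffer for small requests and malloc for large ones.
The 4096-byte limit and the name with_scratch are hypothetical; the
function just fills the scratch area and sums it so that there is
something to observe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed bound on stack use; anything larger goes to the heap.  */
enum { STACK_LIMIT = 4096 };

/* Use SIZE bytes of scratch space: fill them with FILL and store the
   sum of the bytes in *SUM.  Return 1 on success, 0 if memory could
   not be allocated.  */
static int
with_scratch (size_t size, unsigned char fill, unsigned long int *sum)
{
  unsigned char stackbuf[STACK_LIMIT];
  unsigned char *buf = stackbuf;
  size_t i;

  assert (sum != NULL);
  if (STACK_LIMIT < size)
    {
      buf = malloc (size);
      if (buf == NULL)
        return 0;
    }

  memset (buf, fill, size);
  *sum = 0;
  for (i = 0; i < size; i++)
    *sum += buf[i];

  if (buf != stackbuf)
    free (buf);
  return 1;
}
```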

+  else if (key->random_hash)
+   {
+ int dig_bytes = digops.bits/8;
+ char diga[dig_bytes];
+ char digb[dig_bytes];
+ get_hash(texta, lena, diga);
+ get_hash(textb, lenb, digb);
+ diff = memcmp(diga, digb, sizeof(diga));
+   }

It should be possible to combine -R with -b, -d, -f, -g, -i, -M, -n,
and -r.  For example, sort -nR should compute the same hash for 1.0
and 01.0 that it computes for 1.00.  Currently, though, the -R is
silently ignored in this case.  Conversely, sort -MR should compute
the same hash for " Jan" that it does for "Jan"; but here, the -M is
silently ignored.
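
A rough sketch of one way to get that behavior: normalize the key
according to the other ordering options first, then hash the normalized
form.  Everything here is an assumption for illustration.  A salted
64-bit FNV-1a hash stands in for the patch's MD5 digest, only the -n
case is handled (via strtod), and <stdint.h> is C99.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Fold LEN bytes at DATA into the FNV-1a hash state H.  */
static uint64_t
fnv1a (uint64_t h, void const *data, size_t len)
{
  unsigned char const *p = data;
  size_t i;
  for (i = 0; i < len; i++)
    {
      h ^= p[i];
      h *= UINT64_C (1099511628211);
    }
  return h;
}

/* Hash KEY under SALT.  If NUMERIC, hash the parsed value rather than
   the text, so that equal numbers get equal hashes.  */
static uint64_t
key_hash (char const *key, uint64_t salt, int numeric)
{
  uint64_t h = fnv1a (UINT64_C (14695981039346656037), &salt, sizeof salt);
  assert (key != NULL);
  if (numeric)
    {
      double d = strtod (key, NULL);
      if (d == 0)
        d = 0;                  /* Fold -0.0 into 0.0.  */
      return fnv1a (h, &d, sizeof d);
    }
  return fnv1a (h, key, strlen (key));
}

/* Compare keys A and B by their salted hashes; return -1, 0 or 1.  */
static int
random_compare (char const *a, char const *b, uint64_t salt, int numeric)
{
  uint64_t ha = key_hash (a, salt, numeric);
  uint64_t hb = key_hash (b, salt, numeric);
  return (hb < ha) - (ha < hb);
}
```

With the numeric flag set, "1.0", "01.0" and "1.00" all parse to the
same double and therefore compare equal, which is the property asked
for above.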


   else if (key->month)
diff = getmonth (texta, lena) - getmonth (textb, lenb);
   /* Sorting like this may become slow, so in a simple locale the user

@@ -1986,7 +2038,7 @@ badfieldspec (char const *spec, char con
 {
   error (SORT_FAILURE, 0, _("%s: invalid field specification %s"),
 _(msgid), quote (spec));
-  abort ();
+  abort (); // inserted to avoid a compiler warning
 }

Please don't use //-style comments, as we can't assume C99 yet.
Also, please prefer declarative sentences in comments, and put
the comments on separate lines before the code they describe,
e.g.:

  /* Avoid a compiler warning.  */
  abort ();


+  char *randseed = 0;

This should be type char const *, not char *.  Note that we prefer
the const after the char.  Also, the 0 should be NULL.
 
+ randseed = strdupa(optarg);

We can't assume strdupa exists.  But you don't need to copy optarg; just
use it without copying it.
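
A minimal sketch of using optarg directly, assuming a long-only --seed
option.  SEED_OPTION and parse_seed are hypothetical names, not taken
from the patch.  The returned pointer refers into argv, so no copy is
made and nothing needs freeing.

```c
#include <assert.h>
#include <getopt.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical value for a long option with no short equivalent.  */
enum { SEED_OPTION = CHAR_MAX + 1 };

/* Return the operand of --seed from ARGV, or NULL if the option was
   not given.  The result points into ARGV itself.  */
static char const *
parse_seed (int argc, char **argv)
{
  static struct option const long_options[] =
  {
    {"seed", required_argument, NULL, SEED_OPTION},
    {NULL, 0, NULL, 0}
  };
  char const *randseed = NULL;
  int c;

  assert (argv != NULL);
  optind = 1;
  while ((c = getopt_long (argc, argv, "", long_options, NULL)) != -1)
    if (c == SEED_OPTION)
      randseed = optarg;
  return randseed;
}
```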

+  -R, --random    sort by random hash of keys\n\

I 

Re: new coreutil? shuffle - randomize file contents

2005-07-15 Thread Frederik Eaton
Hi,

Attached is a third patch.

Is there a script for making a patch with all the right files excluded
by the way? cvs diff produces a huge amount of unrelated output
because of files that are both in the repository and touched by
configure, and it doesn't list new files. And diff doesn't seem to
have an --include option to match its --exclude...

  Attached is a second patch, which contains a ChangeLog entry and some
  formatting changes as requested by Jim.
 
 Can you update your patch to be relative to coreutils-CVS,
   http://savannah.gnu.org/cvs/?group=coreutils
 rather than to the aging 5.2.1?

Done.

 Also, formatting should adhere to the gnu coding standards.
 A good way to start is by using GNU indent with the default
 settings.

I did this for some things. Maybe if you see other problems you could
run GNU indent before running 'cvs commit'?

 shred also tries to obtain a random seed.
 It'd be nice (eventually) to have both programs
 use the same mechanism.

I did not realize this. Yes, perhaps it would.

 Please use FIXME rather than XXX to mark bits of
 code that need more attention.
 
 Regarding this:
error (SORT_FAILURE, 0, _("%s: invalid field specification `%s'"),
 _(msgid), spec);
 -  abort ();
 +  abort (); // XXX is this ever reached? need comment if it is
 
 That code is never reached, because the preceding error
 call invokes `exit (SORT_FAILURE)'.  The abort is to pacify
 gcc -Wall.
 
 Regarding this:
 +  char *randseed = 0;
 please use `NULL', instead:
   char *randseed = NULL;
 
 I've just noticed that this would make sort
 use a file in $HOME (~/.gnupg/entropy).  That
 dependency on $HOME would be a first for coreutils.
 I'm reluctant to pull in all of that EGD-related code
 just to get a default seed in some environments.
 Do any of you have a feel for how important (or how
 widely available) EGD support is in practice?

I do not have any idea, but it would be good to have random seed
functionality somewhere standard-ish. It would be nice if it were in
libc but that's an area I don't want to tackle.

Also, I didn't personally think it was necessary to do anything but
look at the time and process ID for a default seed - I only added the
/dev/random stuff because Paul Eggert seems to think that security
is very important here - and once /dev/random is referenced then I
figured compatibility with other kinds of systems would become an
issue so I just copied the whole bit from libgcrypt. If you want to
change that stuff, go ahead.

I would like not to spend much more time on this, by the way, it's
been about 7 months now, I would like for us to try to at least figure
out what should be in a preliminary version, make sure that the
command line interface is suitable, etc., somewhat soon. For instance,
if the random stuff stays, then I'll need some guidance on what is the
best way to update configure.ac (unless you want to do it). Also, I'm
not sure what kind of documentation will need to be updated. I was
hoping for more of this kind of feedback, so we can minimize the
number of email back-and-forths.

Thanks,

Frederik
diff --exclude CVS --exclude '*.in' --exclude configure --exclude '*.1' 
--exclude '*.info' --exclude '*~' --exclude '*.m4' --exclude 'autom4te*' 
--exclude config.hin --exclude getdate.c --exclude stamp-vti --exclude '*.texi' 
--exclude dircolors.h --exclude wheel.h --exclude tests --exclude wheel-size.h 
--exclude false.c -ruNp coreutils-cvs/ChangeLog coreutils-cvs-modified/ChangeLog
--- coreutils-cvs/ChangeLog 2005-07-14 01:03:08.0 +0100
+++ coreutils-cvs-modified/ChangeLog 2005-07-15 14:38:25.0 +0100
@@ -1,3 +1,17 @@
+2005-07-14  Frederik Eaton [EMAIL PROTECTED]
+
+   Add --random, --seed options to 'sort' to get shuffle-like
+   functionality.
+
+   * src/sort.c: add functions init_salt, get_hash; 'init_salt' is
+   used to provide seeding, and 'get_hash' calculates the hash value
+   by which a key is sorted
+   * src/checksum.h: stuff to make it possible to switch between
+   different hash algorithms at runtime
+   * src/randseed.h:
+   * src/randseed.c: read a fixed-length random seed from entropy
+   devices. adapted from libgcrypt
+
 2005-07-13  Paul Eggert  [EMAIL PROTECTED]
 
* Version 5.3.1.
diff --exclude CVS --exclude '*.in' --exclude configure --exclude '*.1' 
--exclude '*.info' --exclude '*~' --exclude '*.m4' --exclude 'autom4te*' 
--exclude config.hin --exclude getdate.c --exclude stamp-vti --exclude '*.texi' 
--exclude dircolors.h --exclude wheel.h --exclude tests --exclude wheel-size.h 
--exclude false.c -ruNp coreutils-cvs/src/checksum.h 
coreutils-cvs-modified/src/checksum.h
--- coreutils-cvs/src/checksum.h 2004-09-22 21:11:10.0 +0100
+++ coreutils-cvs-modified/src/checksum.h 2005-07-15 15:17:44.0 +0100
@@ -1,3 +1,11 @@
+#ifndef _CHECKSUM_H
+#define _CHECKSUM_H 1
+
+#include <sys/types.h>
+#include <config.h>
+#include "sha1.h"
+#include 

patched nohup in response to today's Austin Group minutes

2005-07-15 Thread Paul Eggert
I installed this:

* doc/coreutils.texi (nohup invocation): POSIXLY_CORRECT no longer
affects nohup's behavior.  Input is redirected from /dev/null.
* src/nohup.c (main): Don't worry about POSIXLY_CORRECT.  Today's
Austin Group Minutes says that the GNU behavior will be put
forward as proposed text for a future revision.

Index: NEWS
===
RCS file: /fetish/cu/NEWS,v
retrieving revision 1.301
diff -p -u -r1.301 NEWS
--- NEWS 11 Jul 2005 18:20:05 -0000 1.301
+++ NEWS 15 Jul 2005 21:53:01 -0000
@@ -186,9 +186,8 @@ GNU coreutils NEWS  
   ls no longer outputs an extra space between the mode and the link count
   when none of the listed files has an ACL.
 
-  If stdin is a terminal, nohup now closes it and then reopens it with an
-  unreadable file descriptor.  (This step is skipped if POSIXLY_CORRECT is set.)
-  This prevents the command from tying up an OpenSSH session after you logout.
+  If stdin is a terminal, nohup now redirects it from /dev/null to
+  prevent the command from tying up an OpenSSH session after you logout.
 
   stat -f -c %S outputs the fundamental block size (used for block counts).
   stat -f's default output format has been changed to output this size as well.
Index: doc/coreutils.texi
===
RCS file: /fetish/cu/doc/coreutils.texi,v
retrieving revision 1.271
diff -p -u -r1.271 coreutils.texi
--- doc/coreutils.texi 11 Jul 2005 18:20:34 -0000 1.271
+++ doc/coreutils.texi 15 Jul 2005 21:53:02 -0000
@@ -12615,6 +12615,13 @@ out.  Synopsis:
 nohup @var{command} [@var{arg}]@dots{}
 @end example
 
+If standard input is a terminal, it is redirected from
+@file{/dev/null} so that terminal sessions do not mistakenly consider
+the terminal to be used by the command.  This is a @acronym{GNU}
+extension; programs intended to be portable to non-@acronym{GNU} hosts
+should use @samp{nohup @var{command} [@var{arg}]@dots{} </dev/null}
+instead.
+
 @flindex nohup.out
 If standard output is a terminal, the command's standard output is appended
 to the file @file{nohup.out}; if that cannot be written to, it is appended
@@ -12627,14 +12634,6 @@ regardless of the current umask settings
 If standard error is a terminal, it is redirected to the same file
 descriptor as the (possibly-redirected) standard output.
 
-@vindex POSIXLY_CORRECT
-If standard input is a terminal, it is closed so that terminal
-sessions do not mistakenly consider the terminal to be used by the
-command.  To avoid glitches in poorly-written programs standard input
-is then reopened with an innocuous file descriptor that cannot be read
-from.  However, these steps are skipped if @env{POSIXLY_CORRECT} is
-set since @acronym{POSIX} requires standard input to be left alone.
-
 @command{nohup} does not automatically put the command it runs in the
 background; you must do that explicitly, by ending the command line
 with an @samp{&}.  Also, @command{nohup} does not change the
Index: src/nohup.c
===
RCS file: /fetish/cu/src/nohup.c,v
retrieving revision 1.31
diff -p -u -r1.31 nohup.c
--- src/nohup.c 3 Jul 2005 07:18:48 -0000 1.31
+++ src/nohup.c 15 Jul 2005 21:53:02 -0000
@@ -97,12 +97,8 @@ main (int argc, char **argv)
   usage (NOHUP_FAILURE);
 }
 
-  /* If standard input is a tty, replace it with a file descriptor
- that exists but gives you an error if you try to read it.  POSIX
- requires nohup to leave standard input alone, but that's less
- useful in practice as it causes a "nohup foo & exit" session to
- hang with OpenSSH.  */
-  if (!getenv ("POSIXLY_CORRECT") && isatty (STDIN_FILENO))
+  /* If standard input is a tty, replace it with /dev/null.  */
+  if (isatty (STDIN_FILENO))
     fd_reopen (STDIN_FILENO, "/dev/null", O_WRONLY, 0);
 
   /* If standard output is a tty, redirect it (appending) to a file.
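
For readers without the coreutils tree at hand, here is a rough sketch
of what replacing a descriptor with /dev/null involves, using only
open and dup2.  redirect_to_devnull is a hypothetical name and is not
the fd_reopen helper used in the patch; the open flags are left to the
caller (the patch passes O_WRONLY, so reads from the new stdin fail
rather than return end of file).

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Make FD refer to /dev/null, opened with FLAGS.
   Return 0 on success, -1 on failure.  */
static int
redirect_to_devnull (int fd, int flags)
{
  int nullfd;

  assert (0 <= fd);
  nullfd = open ("/dev/null", flags);
  if (nullfd < 0)
    return -1;
  if (nullfd != fd)
    {
      int result = dup2 (nullfd, fd);
      close (nullfd);
      if (result < 0)
        return -1;
    }
  return 0;
}
```

A caller following the patch would guard the call with
isatty (STDIN_FILENO) and pass STDIN_FILENO as the descriptor.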




Re: more gcc warnings

2005-07-15 Thread Paul Eggert
Eric Blake [EMAIL PROTECTED] writes:

 If all that changed was the addition or subtraction of O_APPEND,

For coreutils we don't need to worry about this.  We can assume that
if freopen (NULL, ...) is being called, then the call is either
freopen (NULL, "rb", stdin) or freopen (NULL, "wb", stdout).

 On cygwin, fcntl(F_SETFL) currently doesn't change binary vs. text,
 so that requires setmode(O_BINARY) if the mode included 'b', and it
 would be easy to add our !isatty filter here.

Good, that makes it sound easy.

 But because we do not know whether mode "w" would have opened the
 current file in binary or text mode,

Surely we don't need to worry about that.  We can simply invoke
setmode to change the mode to O_BINARY if the file is not a terminal.

 if (O_BINARY)
   {
 FILE *tmp = freopen (NULL, "rb", stdin);
 if (tmp) stdin = tmp;
   }

No, that won't work: portable code can't assign to stdin.  But I don't
think we need to worry about this if we adopt the solution mentioned
above.
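
The setmode approach mentioned above might look roughly like this.  It
is a sketch under assumptions: set_binary_mode is a hypothetical name,
O_BINARY is defined as 0 where the host lacks it, and on DOS-like hosts
setmode is assumed to be declared in <io.h>.  On POSIX hosts the
function does nothing.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_BINARY
# define O_BINARY 0
#endif
#if O_BINARY
# include <io.h>                /* Assumed home of setmode.  */
#endif

/* Put FD into binary mode unless it is a terminal.  Return 0 on
   success or when nothing needs doing, -1 if setmode fails.  */
static int
set_binary_mode (int fd)
{
  assert (0 <= fd);
#if O_BINARY
  if (!isatty (fd) && setmode (fd, O_BINARY) == -1)
    return -1;
#endif
  return 0;
}
```

Because this changes the mode of the existing descriptor, nothing is
closed or reassigned, which sidesteps the problem of assigning to stdin.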

 Even if newlib (the provider of freopen() for both cygwin and mingw)
 updates freopen() to actually implement file access changes where
 possible, we still need to deal with the fact that a failure will
 close the underlying file descriptor.

Is failure still possible under DOS, under the above assumptions?  If
not, then let's not worry about it.  Otherwise, is it OK if we simply
ignore the failure, and let later uses of the stream report an error?

 Hopefully we are safe so long as we limit ourselves to just use
 freopen(NULL, "rb", stdin), freopen(NULL, "wb", stdout), and
 freopen(non_null, any_mode, any_file), as is the current case in CVS.

Yes, that's the basic idea.




autoreconf vs bootstrap script (was: new coreutil? shuffle - randomize file contents)

2005-07-15 Thread Bob Proulx
Paul Eggert wrote:
 Frederik Eaton writes:
  Is there a script for making a patch with all the right files excluded
  by the way?
 
 Not yet.  That's on the list of things to do.  The fix will be to remove
 those files from the repository, and have a bootstrap script that generates
 them.  Then cvs diff will do the right thing.

Doesn't 'autoreconf --install' do everything that one would want in a
bootstrap script these days?  And if not shouldn't it?

Bob ...who thinks that bootstrap scripts are obsolete now...




Re: bsearch utility

2005-07-15 Thread Andrew D Jewell
I realize I'm nobody, but I would really like to see look(1) have 
exactly the same -k options as sort().


And because the sort keys can be over multiple columns, look() would 
need some way to specify more than one column worth of sort key.
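
To make the idea concrete, here is a minimal sketch of the search at
the heart of such a utility: a lower-bound binary search over lines
already sorted in byte order.  look_first is a hypothetical name, and
this version compares whole-line prefixes only; a -k aware version
would compare the extracted key fields instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the first element of LINES[0..N-1] that begins
   with PREFIX, or N if there is none.  LINES must be sorted in
   strcmp order.  */
static size_t
look_first (char const *const *lines, size_t n, char const *prefix)
{
  size_t plen;
  size_t lo = 0;
  size_t hi = n;

  assert (lines != NULL && prefix != NULL);
  plen = strlen (prefix);
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (strncmp (lines[mid], prefix, plen) < 0)
        lo = mid + 1;
      else
        hi = mid;
    }
  return (lo < n && strncmp (lines[lo], prefix, plen) == 0) ? lo : n;
}
```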


Just sharing my thoughts,
adj


At 11:59 PM -0700 7/14/05, Sorav Bansal wrote:

Jim Meyering wrote:


I too like the idea -- and agree about long options.

Great. I would like to send you my implementation for code review. I 
would be able to send you the implementation by sometime next week.


Sorav





chmod RFE

2005-07-15 Thread Dave Yost
The chmod on Mac OS X (which is probably the same as one of the BSDs) has 
better functionality and a much better man page.  GNU should catch up.


Dave

