Re: Head too greedy

2007-01-02 Thread Pádraig Brady
John Summerfield wrote:
 [EMAIL PROTECTED] ~]$ yes stuff | head -300 | cat -n | (head -2;tail -2)
  1  stuff
  2  stuff
 [EMAIL PROTECTED] ~]$ 
 
 I presume this arises because head's reading ahead (if not head, then
 glibc on head's behalf), and when head's printed enough lines it simply
 closes its files (or maybe just exit()s).

yep.

 I don't see when this behaviour might actually be desired. I'd like to
 see its behaviour changed so that head consumes no more lines than it
 will report. (I note the man page is silent on what should happen, no
 surprise there).

It works as you expect only for seekable (rewindable) file descriptors.
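
A minimal sketch of the seekable case, assuming GNU coreutils and a
POSIX shell. The temporary file and the use of seq to produce
distinguishable lines are illustrative choices, not part of the
original report:

```shell
# Write 300 numbered lines to a regular (seekable) file.
tmp=$(mktemp)
seq 1 300 > "$tmp"

# head and tail inherit the same open file description from the subshell.
# On a seekable input, head is expected to seek back to just past the
# last line it printed, so tail still finds the end of the file.
result=$( (head -n 2; tail -n 2) < "$tmp" )
printf '%s\n' "$result"

rm -f "$tmp"
```

With GNU head this should print lines 1, 2, 299 and 300. Feeding the
same data through a pipe instead of a redirect gives no such guarantee.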

 If you think that the current unpredictable behaviour is sometimes
 desirable, then could we please have something, maybe --nobuffer, to
 turn it off?

Have a look at the stdio input buffering problems here:
http://www.pixelbeat.org/programming/stdio_buffering/

Note that the patch referenced on that page will have no effect
on head, as head doesn't use stdio to read data.

Pádraig.


_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: Head too greedy

2007-01-01 Thread Eric Blake

According to John Summerfield on 1/1/2007 5:58 AM:
 In this example, the results are what I desire. However,
 [EMAIL PROTECTED] ~]$ yes stuff | head -300 | cat -n | (head -2;tail -2)
  1  stuff
  2  stuff
 [EMAIL PROTECTED] ~]$ 
 
 I presume this arises because head's reading ahead (if not head, then
 glibc on head's behalf), and when head's printed enough lines it simply
 closes its files (or maybe just exit()s).

This behavior is allowed by POSIX.  If stdin is seekable, POSIX requires
that head (and any other utility that consumes partial input) rewind to
the last character consumed before exiting; but when stdin is a pipe,
there is no such guarantee, and POSIX even states explicitly that compliant
programs must not rely on the amount of data read ahead by the first
program in the pipeline.  Programs are more efficient with buffering, so
we do not want to change head to turn buffering off; and there are just too
many programs that could be told to disable buffering to add something
like --unbuffered to each of them.  That said, there has been a proposal
on this list in the past to write a new wrapper utility (somewhat like
nohup or nice are wrapper utilities) that disables buffering of the
std{in,out} of the wrapped program.  If someone were to write such a
program, it would be the perfect candidate for use in the scenario you
described.
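
Until such a wrapper exists, one way to avoid depending on read-ahead
in a pipeline is to let a single process consume the whole stream. The
sketch below is an assumption and not something proposed in this
thread: first_and_last is a hypothetical helper name, and it relies
only on POSIX awk:

```shell
# Print the first N and the last N lines of stdin using one reader,
# so nothing is lost to another program's input buffer.
first_and_last() {
    awk -v n="$1" '
        NR <= n { print; next }        # first N lines: print immediately
        { buf[NR % n] = $0 }           # later lines: keep only the last N
        END {
            for (i = NR - n + 1; i <= NR; i++)
                if (i > n) print buf[i % n]   # skip lines already printed
        }
    '
}

result=$(seq 1 300 | first_and_last 2)
printf '%s\n' "$result"
```

The reported pipeline could then be written as
yes stuff | head -n 300 | cat -n | first_and_last 2
which no longer depends on how much the first reader buffers.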

--
Don't work too hard, make some time for fun as well!

Eric Blake [EMAIL PROTECTED]

