Robert McKay wrote:
> Synopsis of the problem:
>
> When you use head to read a certain number of lines out of a pipe,
> sometimes it eats more data than you ask it to.
>
> For example:
>
> [EMAIL PROTECTED] ~]$ ( echo hello; echo world; ) | ( head -1 > /dev/null; cat)
> [EMAIL PROTECTED] ~]$ ( echo hello; sleep 1; echo world; ) | ( head -1 > /dev/null; cat)
> world
>
> So why does the first command print nothing, while the second command
> prints "world"?
>
> Well, in the first case "hello" and "world" were both immediately
> available to be read, and head -1 read them both into its buffer before
> discovering that it should have stopped at the first newline.
>
> In the second case, the sleep 1 in the middle causes head's read() call
> to return early (because no more data was immediately available to be
> read), and head realized that it already had a newline and exited,
> leaving the "world" for cat to read.
>
> What to do about this? head can't unread data that it has already read,
> so the only solution is for it to read the input one byte at a time,
> constantly checking for a \n (newline). Is this inefficient? Yes, of
> course it is, but it would also be very useful. I'm not proposing that
> this be made the default behavior, simply that a new option be added to
> support it.
>
> I eagerly await your flames :-)
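The byte-at-a-time approach proposed above can be sketched with bash's read builtin, which deliberately reads one byte per read() from non-seekable input so it never consumes past the newline. This is a hedged illustration of the technique, not the proposed head option; the function name read_one_line is made up for the example.

```shell
# Sketch, assuming bash: consume exactly one line from stdin, one byte
# at a time, so everything after the first newline stays in the pipe
# for the next reader.
read_one_line() {
  local c
  while IFS= read -r -n1 c; do
    # read stores an empty string when it stops at the newline delimiter
    [ -z "$c" ] && return 0
  done
}

# Unlike head -1 on a pipe, this leaves "world" for cat:
out=$( ( echo hello; echo world ) | { read_one_line; cat; } )
echo "$out"
```

Because each read() fetches a single byte, no data beyond the newline is ever pulled into the consumer's buffer, which is exactly the trade of correctness for syscall overhead described in the mail.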
Note head will work as expected if the input is a file, i.e. it does an lseek back over the data it didn't process. I've documented stdin buffering issues here:
http://www.pixelbeat.org/programming/stdio_buffering/
There is a patch there that I'm thinking of cleaning up and applying to all appropriate coreutils.

Pádraig.

_______________________________________________
Bug-coreutils mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-coreutils
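The lseek behaviour mentioned above (required of standard utilities by POSIX when stdin is seekable) can be demonstrated with a regular file instead of a pipe; this sketch assumes GNU head and a temporary file created with mktemp:

```shell
# Sketch: with a seekable file on stdin, head reads a full buffer but
# seeks back to just past the last byte it processed, so a following
# reader in the same redirection picks up at line 2.
tmp=$(mktemp)
printf 'hello\nworld\n' > "$tmp"
out=$( { head -n1 > /dev/null; cat; } < "$tmp" )
rm -f "$tmp"
echo "$out"
```

This is why the surprising over-read only bites on pipes and other non-seekable inputs: there is simply no way to "put back" bytes read from a pipe.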
