Today I was trying to filter my Apache access.log with some coreutils
and was annoyed by the default buffering applied by glibc.
I was running `tail -f ~/access.log | cut ... | uniq` but only got
output once cut had written more than 4K to stdout.

So how can this be controlled? Each app could add an extra option
(see grep --line-buffered for example), but that doesn't
seem general, and it requires duplicating both logic and documentation
in each application. The ideal, IMHO, would be to
add the config logic to glibc (where it would have to be controlled
with environment variables). There seems to be resistance to that though:
http://sources.redhat.com/ml/bug-glibc/1999-09/msg00041.html

Anyway, whether it's implemented in libc or in the application (coreutils lib),
I think the config interface should be the same: environment
variables with something like the following format:
    BUF_X_=Y
Where X = the fd number
and Y = 0 for unbuffered, 1 for line buffered and >1 for a specific
buffer size.

So for my particular problem I could do:

tail -f ~/access.log | BUF_1_=1 cut ... | uniq

Would this be a useful addition to coreutils lib, called from the
appropriate apps?
Or better still, could this possibly be added to glibc?

cheers,
Pádraig.

p.s. $ rpm -q fedora-release gcc glibc
fedora-release-4-2
gcc-4.0.0-8
glibc-2.3.5-10

p. p.s. Note glibc changes the buffering automatically for stdout only, like:
    if (isatty(fileno(stdout))) setlinebuf(stdout);
Also it always leaves stdin buffered and stderr unbuffered.

p.p.s. setvbuf(stdin, (char*) NULL, _IOFBF, 12345) is not honoured,
which is fair enough as buf==NULL.
However, 0 is returned, which suggests it was honoured?

p.p.p.s. reads on stdin default to 4096 bytes on my system
even though BUFSIZ is defined as 8192?


_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils
