/dev/fd/0 Considered Harmful
           Patrick ("No good deed goes unpunished") Powell
               <papowell _AT_ astart _dot_ com>

Preface:

    After writing the first draft of this article,  I started looking at various
    archives for discussion about /dev/fd/0 support in various operating systems.

    In the foomatic-developer mailing lists there was a discussion
    about the differences between using /dev/fd/0 and '-' as input
    file specifications for GhostScript and other programs.

    This issue strikes again. Read on and weep, gnash your teeth, or laugh
    at me as you find suitable.  And then look at the code that you have
    been writing and weep, gnash your teeth, or scream in rage that you have
    fallen into the trap of /dev/fd/0.

Introduction:

    I have just spent the last couple of days trying to discover
    why part of the LPRng printing system died after an upgrade to
    some of the components.  The cause was finally traced to the
    use of /dev/fd/0 for a pathname instead of the classic '-' for
    some utilities.  To say that this was a surprise is, shall we
    say,  a tremendous understatement, and it led to exploring the
    depths of the source code of various programs, including
    GhostScript, a2ps, enscript, and others where a '-' argument
    for a pathname means 'read from fd 0' (i.e. - stdin).

What is the purpose of the /dev/fd/0 stuff?

    Let's look at a typical example,  say,  the 'file' utility.  It expects a
    command line such as:

     file /tmp/file

    It will open /tmp/file, read its contents, and determine the file
    type.  Sometimes you would like to determine the file type of the
    output of a pipe:

     perl run_script | file 

    But file,  bless its little heart,  may REQUIRE a path.  So, /dev/fd/0
    to the rescue:

     perl run_script | file  /dev/fd/0

    Through the magic of the Operating System,
       fd = open("/dev/fd/0",...)
    will have the same effect (broadly speaking) as
       fd = dup(0)
    and fd will be a 'duplicate' of fd 0.

The Dark Side of /dev/fd/0

    On the surface, /dev/fd/0 appears to be harmless and a Good Idea.
    But let's look at another convention.  If no input files are
    specified, then input will be taken from STDIN (fd 0).  If there
    is a path specified,  then the file will be opened and
    input read from it.  By convention,  '-' will ALSO stand for
    reading from stdin.

   /* Good and reasonable implementation but screws up when /dev/fd/0 passed */

   if( path && strcmp(path,"-") ){
       close(0);   /* covers the case where fd 0 is not open,  note the
                      lack of error checking,  which is deliberate here */
       if( (fd = open(path,....)) == -1 ){
         Die("cannot open '%s' - %s", path, strerror(errno) );
       } else if( fd ){
         Die("open '%s' returned FD %d", path, fd );
       }
   }

   Let's see what happens here.  FD 0 is closed.  The Operating
   System takes the usual actions associated with closing the file
   descriptor, probably updating the internal process structure.
   Then we open 'path'.  If path is /dev/fd/0,  we are now asking
   for a duplicate of a descriptor that is no longer open,  so we
   expect the open to fail.  But some programs behave as if it
   did not.  Why?

   /* Evil and poor implementation */
   if( path && strcmp(path,"-") ){
       close(0);
       open(path,....)
   }
   ...
   if( read(0,...) < 0 ){
       /* treat as EOF */
   }

   As you see, the brutal assumption is made that the file open
   will always succeed, or, if it fails, that we get -1, which will
   cause a -1 value to be returned when fd 0 is used with read().
   You would suspect that nobody would write code like that, right?
   Ummm... let's change the subject really fast.  And stop using grep
   to look at my code,  you suspicious person, you.

   Say you have a simple PostScript file:

   %!PS-Adobe-3.0
   /Courier findfont 200 scalefont setfont
   72 300 moveto
   (1) show showpage


    % gs --help
    AFPL Ghostscript 8.13 (2003-12-31)

    % gs /tmp/one.ps
    <you get page output>
    % gs /dev/fd/0 </tmp/one.ps
    AFPL Ghostscript 8.14 (2004-02-20)
    Copyright (C) 2004 artofcode LLC, Benicia, CA.  All rights reserved.
    This software comes with NO WARRANTY: see the file PUBLIC for details.
    Loading NimbusMonL-Regu font from /usr/local/share/ghostscript/fonts/n022003l.pfb... 2267236 903936 1456484 168258 1 done.
    >>showpage, press <return> to continue<<

    <you get no page output>
    % gs - </tmp/one.ps
    <you get page output>

    Umm... interesting.  Very interesting.  It appears that AFPL ghostscript
    has a /dev/fd/0 related problem.

    So let's try GNU Ghostscript:

    # gs /tmp/one.ps
        GNU Ghostscript 7.07 (2003-05-17)
        Copyright (C) 2003 artofcode LLC, Benicia, CA.  All rights reserved.
        This software comes with NO WARRANTY: see the file PUBLIC for details.
        Loading NimbusMonL-Regu font from /usr/local/share/ghostscript/fonts/n022003l.pfb... 2079048 716830 1622424 335985 0 done.
        >>showpage, press <return> to continue<<

    <page displayed>

        GS>quit
        # gs - </tmp/one.ps
        GNU Ghostscript 7.07 (2003-05-17)
        Copyright (C) 2003 artofcode LLC, Benicia, CA.  All rights reserved.
        This software comes with NO WARRANTY: see the file PUBLIC for details.
        Loading NimbusMonL-Regu font from /usr/local/share/ghostscript/fonts/n022003l.pfb... 2079048 716823 1622424 333651 0 done.

    <NO page displayed>

        # gs /dev/fd/0 </tmp/one.ps
        GNU Ghostscript 7.07 (2003-05-17)
        Copyright (C) 2003 artofcode LLC, Benicia, CA.  All rights reserved.
        This software comes with NO WARRANTY: see the file PUBLIC for details.
        Loading NimbusMonL-Regu font from /usr/local/share/ghostscript/fonts/n022003l.pfb... 2079048 716830 1622424 335979 0 done.
        >>showpage, press <return> to continue<<

    <NO page displayed>

    So it appears that this (and most likely previous) version(s) suffer from
    this problem.

What is the right way to do this?

   /* Pure in spirit and thinks only clean thoughts Method
    * This handles '-' and /dev/fd/0 but at the cost of having to do a bit
    * more work.  Note that this handles /dev/stdin as well.
    */
    if( path && strcmp(path,"-") ){
       if( (fd = open(path,....)) == -1 ){
         Die("cannot open '%s' - %s", path, strerror(errno) );
       }
       /* fd 0 is still open here,  so fd is normally nonzero;
          move it down to fd 0 and release the extra descriptor */
       if( fd != 0 ){
          if( dup2(fd,0) == -1 ){
            Die("dup2 failed - %s", strerror(errno) );
          }
          if( close(fd) == -1 ){
            Die("close failed - %s", strerror(errno) );
          }
       }
    }

    This last example is the robust way of dealing with this problem.
    Note that you do the open() first,  while fd 0 is still open,  and
    only then dup2() the new file descriptor onto fd 0.

Why did you close fd 0?

    The problem really starts when fd 0 is closed.  These problems could
    be avoided if fd 0 is never closed or is closed only after reading it.

    However,  this now opens a whole slew of issues.  What if you have
      lpr /dev/fd/0 /dev/fd/0
    i.e. - you expect to open a path multiple times and get multiple copies
    of the document printed.

    The best answer to this is 'undefined behavior'.

    Perhaps avoiding using '-' is the best approach.

Why didn't you RTFM?

    Let's see what the man pages say:

    FD(4)                  FreeBSD Kernel Interfaces Manual                  FD(4)
    NAME
         fd, stdin, stdout, stderr -- file descriptor files
    DESCRIPTION
         The files /dev/fd/0 through /dev/fd/# refer to file descriptors which can
         be accessed through the file system.  If the file descriptor is open and
         the mode the file is being opened with is a subset of the mode of the
         existing descriptor, the call:
               fd = open("/dev/fd/0", mode);
         and the call:
               fd = fcntl(0, F_DUPFD, 0);
         are equivalent.

         Opening the files /dev/stdin, /dev/stdout and /dev/stderr is equivalent
         to the following calls:
               fd = fcntl(STDIN_FILENO,  F_DUPFD, 0);
               fd = fcntl(STDOUT_FILENO, F_DUPFD, 0);
               fd = fcntl(STDERR_FILENO, F_DUPFD, 0);
         Flags to the open(2) call other than O_RDONLY, O_WRONLY and O_RDWR are
         ignored.

    FILES
         /dev/fd/#
         /dev/stdin
         /dev/stdout
         /dev/stderr

    SEE ALSO
         tty(4)

    FCNTL(2)                  FreeBSD System Calls Manual                 FCNTL(2)
    NAME
         fcntl -- file control
    LIBRARY
         Standard C Library (libc, -lc)
    SYNOPSIS
         #include <fcntl.h>
         int
         fcntl(int fd, int cmd, ...);
    DESCRIPTION
         The fcntl() system call provides for control over descriptors.  The argu-
         ment fd is a descriptor to be operated on by cmd as described below.
         Depending on the value of cmd, fcntl() can take an additional third argu-
         ment int arg.
         F_DUPFD    Return a new descriptor as follows:
                        o   Lowest numbered available descriptor greater than or
                            equal to arg.
                        o   Same object references as the original descriptor.
                        o   New descriptor shares the same file offset if the
                            object was a file.
                        o   Same access mode (read, write or read/write).
                        o   Same file status flags (i.e., both file descriptors
                            share the same file status flags).
                        o   The close-on-exec flag associated with the new file
                            descriptor is set to remain open across execve(2) sys-
                            tem calls.

    So, it appears that when the 'open(/dev/fd/0)' is done, then since fd 0
    is closed, you should get -1 returned,  and experimentally, you do:

 % cat opentest.c
    /*
       test open on /dev/fd/0
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <errno.h>


    int main( int argc, char *argv[], char *envp[] )
    {
        int fd;
        close(0);
        fd = open("/dev/fd/0",O_RDONLY);
        if( fd == -1 ){
            fprintf(stderr,"open /dev/fd/0 failed - %s\n", strerror(errno) );
            exit(1);
        }
        return(0);
    }

 Linux and FreeBSD: 

 % make opentest
 % opentest </etc/hosts
 open /dev/fd/0 failed - Bad file descriptor

   So, this looks like the same behavior. 

   As for LINUX, here is the information from the 'man proc' (RedHat 9 release):

             fd     This is a subdirectory containing one entry for each file
                     which the process has open, named by its file descriptor,
                     and which is a symbolic link to the actual file  (as  the
                     exe  entry  does).  Thus, 0 is standard input, 1 standard
                     output, 2 standard error, etc.

                     Programs that will take a filename, but will not take the
                     standard  input,  and which write to a file, but will not
                     send their output to standard output, can be  effectively
                     foiled this way, assuming that -i is the flag designating
                     an input file and -o is the flag  designating  an  output
                     file:
                     foobar -i /proc/self/fd/0 -o /proc/self/fd/1 ...
>>                   and  you  have a working filter.  Note that this will not
>>                   work for programs that seek on their files, as the  files
>>                   in the fd directory are not seekable.

                     /proc/self/fd/N is approximately the same as /dev/fd/N in
                     some UNIX and  UNIX-like  systems.   Most  Linux  MAKEDEV
                     scripts  symbolically  link  /dev/fd to /proc/self/fd, in
                     fact.


   OK, so seek should fail.  Right?  Says so in the documentation.  So, unbeliever that
   I am,  I will try an experiment.

 % cat seektest.c
    /*
       test seek on /dev/fd/0
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <errno.h>


    int main( int argc, char *argv[], char *envp[] )
    {
        int fd;
        fd = open("/dev/fd/0",O_RDONLY);
        if( fd == -1 ){
            fprintf(stderr,"open /dev/fd/0 failed - %s\n", strerror(errno) );
            exit(1);
        }
        if( lseek(fd,0,SEEK_SET) == (off_t)(-1) ){
            fprintf(stderr,"lseek /dev/fd/0 failed - %s\n", strerror(errno) );
            exit(1);
        }
        fprintf(stderr,"lseek /dev/fd/0 succeeded\n" );
        return(0);
    }


 % make seektest
 % cat /etc/hosts |seektest
 lseek /dev/fd/0 failed - Illegal seek
 % seektest </etc/hosts
 lseek /dev/fd/0 succeeded

     We try this on FreeBSD and we get:

 % make seektest
 % cat /etc/hosts |seektest
 lseek /dev/fd/0 failed - Illegal seek
 % seektest </etc/hosts
 lseek /dev/fd/0 succeeded

   So at least the FreeBSD and Linux systems agree in this behavior.


 Ummm... so much for believing the documentation.

 Even when you RTFM the FM may not be correct.

Summary:

 Many of the existing utilities that expect to have input on FD 0 (stdin)
 and are passed /dev/fd/0 as a file parameter appear to fail or have 0 length
 input.  This appears to be caused by the application closing fd 0 and then
 opening /dev/fd/0.

 Avoid the use of /dev/fd/0 as a command line path unless you are sure that the
 implementors of the software will open and close the file descriptors correctly.

 Also, perhaps avoiding the use of '-' and /dev/fd/0 and sticking to the default
 for reading from stdin is the best method.

Patrick Powell                 Astart Technologies
[EMAIL PROTECTED]            6741 Convoy Court
Network and System             San Diego, CA 92111
  Consulting                   858-874-6543 FAX 858-751-2435
LPRng - Print Spooler (http://www.lprng.com)
