On Oct 4, 2006, at 12:41 AM, J. Milgram wrote:
Also why the pipe-to-cat approach works ... the second redirection
applies to the second command (cat) and doesn't overwrite the stdout
redirection in the first (echo) ... (?)
echo foo 1>&2 | cat > /dev/null
Yes, the redirection of cat's handle 1 impacts only the cat
command. But see below for more explanation about piping.
In the following example from my earlier message, my lack of
time for proofreading caught me; see the correction below:
But pipe redirections occur first, so
in this example:
echo foo 1>&2 | cat > /dev/null
the table for the `echo` command starts like this:
0 --> IN (from terminal)
1 --> pipe to cat
2 --> ERR (to terminal)
Then the `1>&2` redirection takes place and you get:
0 --> IN (from terminal)
1 --> pipe to cat
2 --> ERR (to terminal)
This should be:
0 --> IN (from terminal)
1 --> ERR (to terminal)
2 --> ERR (to terminal)
(I'd forgotten to change handle 1, but it doesn't
sound like it confused you.)
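If you want to convince yourself of the corrected table, you can capture the outer stderr in a file (the file name err.txt is just my choice) and see that `foo` survives even though cat's stdout is thrown away:

```shell
#!/bin/sh
# Run the pipeline in a subshell whose stderr goes to err.txt.
# echo's handle 1 has been pointed at handle 2 (stderr), so "foo"
# bypasses the pipe to cat entirely and lands in err.txt.
( echo foo 1>&2 | cat > /dev/null ) 2> err.txt

cat err.txt     # prints: foo
```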
I think the following explanation might make things clearer.
Before a process is spawned, the shell has to set up
three file handles for the process: STDIN (handle 0),
STDOUT (handle 1), and STDERR (handle 2). When there
are no redirections, these handles are typically
connected to the terminal, but this is only because
the shell itself was spawned with its own STDIN, STDOUT,
and STDERR connected to the terminal. You can see this,
for example, by using the `-c` option to spawn a new
shell and run a command. If you run this:
bash -c 'cat'
Then a new instance of bash will be spawned, and it will
execute the command `cat`. The cat command will inherit
the definition of STDIN that the new bash instance uses
(which it, in turn, inherited from your interactive
shell where you typed the above command). The above
command will sit waiting for input from the terminal,
so you can type "hello" and a control-D (for end-of-file)
to provide input to `cat` like this:
[EMAIL PROTECTED]:~/tmp/redirect$ bash -c 'cat'
hello
hello
[EMAIL PROTECTED]:~/tmp/redirect$
Note: the first `hello` was what I typed, and the second one
comes from the output of `cat`.
Now if you redirect the STDIN of your `bash -c`, you change
what the `cat` command will inherit:
[EMAIL PROTECTED]:~/tmp/redirect$ cat input.txt
This is some input
[EMAIL PROTECTED]:~/tmp/redirect$ bash -c 'cat' < input.txt
This is some input
[EMAIL PROTECTED]:~/tmp/redirect$
Notice that in this case, the `cat` command was connected
to input.txt because it inherited this definition for
the STDIN handle from the `bash -c`.
For a single process invoked at the shell prompt, Bash
allows its own definitions for STDIN, STDOUT, and STDERR
to be inherited by the new process. This happens before
any redirections are considered, so the table of handles
for the new process looks like this:
0 ==> bash_STDIN
1 ==> bash_STDOUT
2 ==> bash_STDERR
where, as mentioned before, `bash_STDIN`, `bash_STDOUT`, and
`bash_STDERR` are typically file handles opened to the terminal.
Redirections on the command line are then processed left-to-right,
and the table is modified accordingly. For the following
command:
echo foo > output.txt
The table is modified by the `> output.txt` to change handle 1
to be an open handle to the file `output.txt`:
0 ==> bash_STDIN
1 ==> open("output.txt", "w")
2 ==> bash_STDERR
Since the `echo` command always writes to handle 1, its output gets
written into the `output.txt` file.
The file redirection operator, `>`, redirects handle 1 by default.
To make it redirect another handle, the handle number is prepended.
So `1> output.txt` is the same as the plain `> output.txt`. To
redirect STDERR for the `echo` command, you'd use `2> output.txt`.
Thinking of `>` as a shortcut for `1>` helps me remember the
syntax for the more complicated redirections discussed below.
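Here's a quick demonstration of the three forms (the file names are just mine; I use a `bash -c` to generate something on handle 2):

```shell
#!/bin/sh
echo out > stdout.txt                    # same as 1> stdout.txt
bash -c 'echo oops >&2' 2> stderr.txt    # redirect handle 2 instead

cat stdout.txt    # prints: out
cat stderr.txt    # prints: oops
```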
To redirect a file handle to point to the contents of another
existing file handle, the syntax `n>&m` is used, where the
handle `n` will be assigned the current value of the handle
`m`. So when you do the following:
echo foo 1>&2
The handle table changes from the original:
0 ==> bash_STDIN
1 ==> bash_STDOUT
2 ==> bash_STDERR
to this:
0 ==> bash_STDIN
1 ==> bash_STDERR
2 ==> bash_STDERR
Handle 1 now points to `bash_STDERR`, and nothing points
to `bash_STDOUT` anymore.
I like to think of the redirections as having two parts.
The first part (`n>`) is like `n =`, because an assignment
is being made to handle `n`. The second part (`&m` or `output.txt`)
is the value being assigned. For plain files, the shell simply
opens the file and assigns the handle to `n`; for the `&m` case,
it's the current value of handle `m` that's assigned to `n`.
I like to think of it like the C address-of operator `&`, and
that it's getting what handle `m` points to (some open file)
and assigning it to `n`. It's just a weird thing that helps
me remember.
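You can watch the `1>&2` reassignment happen by capturing the two streams into separate files (file names mine):

```shell
#!/bin/sh
# After 1>&2, echo's handle 1 points at the same place as handle 2,
# so "foo" ends up in err.txt and out.txt stays empty.
bash -c 'echo foo 1>&2' > out.txt 2> err.txt

cat err.txt     # prints: foo
```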
In this example:
echo foo 1>&2 > /dev/null
The handle table progresses as follows.
Initially:
0 ==> bash_STDIN
1 ==> bash_STDOUT
2 ==> bash_STDERR
after `1>&2`:
0 ==> bash_STDIN
1 ==> bash_STDERR
2 ==> bash_STDERR
after `> /dev/null`:
0 ==> bash_STDIN
1 ==> open("/dev/null", "w")
2 ==> bash_STDERR
The original `bash_STDOUT` is still lost, and the `echo`
command's STDOUT handle is connected to `/dev/null`,
squelching the output.
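To verify that the output really is squelched, capture both streams (file names mine) and observe that nothing comes out on either one:

```shell
#!/bin/sh
# Inside the bash -c, handle 1 is first pointed at handle 2, then
# reassigned to /dev/null, so "foo" never reaches either file.
bash -c 'echo foo 1>&2 > /dev/null' > out.txt 2> err.txt

# Both out.txt and err.txt end up empty.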
Pipes add a new dimension to the problem. For a pipeline
like the following:
P1 | P2 | ... | Pn
The processes are connected together such that for each consecutive
pair `Pi` and `Pj` (where j = i+1), the STDOUT of `Pi` is connected
through a pipe (see `man 2 pipe`) to the STDIN of `Pj`. Suppose
`Pipe_i_j_write` is the write handle of the pipe between `Pi`
and `Pj`, and `Pipe_i_j_read` is the
read handle. Bash sets up the initial file handles for these
processes *before considering other file redirections*. This
means that the tables look like this:
Process P1 (before redirections):
0 ==> bash_STDIN
1 ==> Pipe_1_2_write
2 ==> bash_STDERR
Process P2 (before redirections):
0 ==> Pipe_1_2_read
1 ==> Pipe_2_3_write
2 ==> bash_STDERR
...
Process Pn (before redirections):
0 ==> Pipe_(n-1)_n_read
1 ==> bash_STDOUT
2 ==> bash_STDERR
All of the pipes are connected first, then the redirections
come into play. So if `P1` were `echo > /dev/null`,
the table for `P1` would change to:
Process P1 (after redirections):
0 ==> bash_STDIN
1 ==> open("/dev/null", "w")
2 ==> bash_STDERR
Note that the connection to the pipe between `P1` and `P2`
is lost.
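You can see the lost connection directly; because the redirection reassigns handle 1 after the pipe is wired up, `cat -n` receives nothing:

```shell
#!/bin/sh
# Handle 1 is first connected to the pipe, then reassigned to
# /dev/null, so the pipe carries no data and cat -n prints nothing.
echo hello > /dev/null | cat -n
```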
The pipeline always connects one command's handle 1 to the next
command's handle 0. If you
want to pipe handle 2 to the next process, you need to
perform a redirection.
Using Rob's `t.sh` script as a test:
[EMAIL PROTECTED] ~]$ cat t.sh
#!/bin/sh
echo This is stderr >&2
echo This is stdout
We can construct a pipeline that passes everything written to handle 2
through the pipeline to `cat -n` (just to have something that munges
the data):
[EMAIL PROTECTED]:~/tmp/redirect$ ./t.sh 2>&1 | cat -n
1 This is stderr
2 This is stdout
[EMAIL PROTECTED]:~/tmp/redirect$
Notice that both "This is stderr" and "This is stdout"
make it to the `cat -n` process because they now have
line numbers. The handle map for the `t.sh` invocation
starts as this:
0 ==> bash_STDIN
1 ==> Pipe_1_2_write (connects to `cat -n`)
2 ==> bash_STDERR
then handle 2 is redirected to the same place handle 1 goes:
0 ==> bash_STDIN
1 ==> Pipe_1_2_write (connects to `cat -n`)
2 ==> Pipe_1_2_write (connects to `cat -n`)
If you don't want to pass handle 1 output from `t.sh`, you
can redirect handle 1 to `/dev/null`:
[EMAIL PROTECTED]:~/tmp/redirect$ ./t.sh 2>&1 > /dev/null | cat -n
1 This is stderr
[EMAIL PROTECTED]:~/tmp/redirect$
Now, only handle 2's output gets through the pipe. Notice that
order matters: if the redirections are swapped, nothing gets
out:
[EMAIL PROTECTED]:~/tmp/redirect$ ./t.sh > /dev/null 2>&1 | cat -n
[EMAIL PROTECTED]:~/tmp/redirect$
Trace through the handle redirections to see why.
Initial table:
0 ==> bash_STDIN
1 ==> Pipe_1_2_write (connects to `cat -n`)
2 ==> bash_STDERR
after `> /dev/null`:
0 ==> bash_STDIN
1 ==> open("/dev/null", "w")
2 ==> bash_STDERR
after `2>&1`:
0 ==> bash_STDIN
1 ==> open("/dev/null", "w")
2 ==> open("/dev/null", "w")
Both handles end up pointing to the bit bucket.
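To see all three orderings side by side, here is a sketch that recreates Rob's t.sh inline (the heredoc setup is mine) and runs each pipeline in turn:

```shell
#!/bin/sh
# Recreate the test script.
cat > t.sh <<'EOF'
#!/bin/sh
echo This is stderr >&2
echo This is stdout
EOF
chmod +x t.sh

./t.sh 2>&1 | cat -n               # both lines, numbered
./t.sh 2>&1 > /dev/null | cat -n   # only the stderr line
./t.sh > /dev/null 2>&1 | cat -n   # nothing at all
```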
Hopefully this makes things a little clearer. Sorry for
the long ramble - I just couldn't decide where to stop :-)
Michael Henry