On Tue, Feb 13, 2024 at 09:35:11AM -0600, David Wright wrote:
> On Tue 13 Feb 2024 at 07:15:48 (-0500), Greg Wooledge wrote:
> > On Mon, Feb 12, 2024 at 11:01:47PM -0600, David Wright wrote:
> > > … but not much. For me, "standard output" is /dev/fd/1, yet it seems
> > > unlikely that anyone is going to use >&1 in the manner of the example.
> > 
> > Standard output means "whatever file descriptor 1 points to".  That
> > could be a file, a pipe, a terminal (character device), etc.
> 
> Why pick on 1?

It's the definition.  Standard input is FD 0, standard output is FD 1,
and standard error is FD 2.

> . It demonstrates the shell syntax element required (&) in order to
>   avoid truncating the file, rather than shred overwriting it.

You are confused.  You're making assumptions about shell syntax that
are simply not true.

> > > >    A FILE of ‘-’ denotes standard output.  The intended use of this is
> > > >    to shred a removed temporary file.  For example:
> > > > 
> > > >       i=$(mktemp)
> > > >       exec 3<>"$i"
> > > >       rm -- "$i"
> > > >       echo "Hello, world" >&3
> > > >       shred - >&3
> > > >       exec 3>-

> Ironic that it truncates a file, and then immediately warns against
> truncating a file instead of shredding it.

No.  This is not what it does (if we fix the bug).

Let me write out the example again, but with the bug fixed, and then
explain what each line does, because apparently there is a LOT of
misunderstanding here.

  i=$(mktemp)
  exec 3<>"$i"
  rm -- "$i"
  echo "Hello, world" >&3
  shred - >&3
  exec 3>&-

The first line runs mktemp(1), which is a GNU coreutils program that
creates a temporary file, and then writes its name to standard output.
The shell syntax grabs that name and stores it in the variable "i".

So, after line 1, we have an (empty) temporary file, which was created
by a child process that has terminated.  We have its name in a variable.
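
If you want to convince yourself of that, here is a minimal sketch that checks what mktemp left behind.  The "state" variable is only for illustration and is not part of the original example.

```shell
# Create the temporary file and capture its name, as in line 1.
i=$(mktemp)

# -f: it is a regular file.  ! -s: its size is zero.
if [ -f "$i" ] && [ ! -s "$i" ]; then
  state="regular file, empty"
else
  state="unexpected"
fi

echo "$i: $state"

# Clean up by name, since nothing holds it open in this sketch.
rm -- "$i"
```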

Creation of temporary files works a little bit differently in shell
scripts than it does in regular programs.  In most other languages,
you would call a library function that creates the temporary file
(keeping it open), optionally unlinks it, and returns the open file
descriptor to you for use.  But you can't do that in a shell script
that needs an external program to do the file creation.  So we have this
slightly clumsy approach.

The second line opens this file for reading and writing, and ensures
that file descriptor 3 points to it.  It's important to understand that
while "exec 3>$i" would have truncated the file's contents, "exec 3<>$i"
does not.  Of course, there wasn't any content to truncate, since it was
created empty, but that's not the important part.  The important part
is that this FD is opened for read+write, allowing the temporary file
to be used for storage *and* retrieval.  We aren't doing any retrieval
in this example, but it could be done, with specialized tools.

The third line unlinks the file from the file system.  However, the shell
still has an open file descriptor which points to the file.  Therefore,
the file is still accessible through this FD.  Its inode is not recycled,
and any blocks containing file content are not marked for reuse.

This "unlink before using" technique is traditional on Unix systems.
It allows you to bypass setting up an exit handler to clean up the
temporary file.  Once the open file descriptor is closed, the file
system will mark the inode and any blocks as ready for reuse.  Even if
the script is killed by SIGKILL, that cleanup will still happen.
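
A small sketch of the same idea, assuming nothing beyond mktemp and a POSIX-ish shell.  The variable names (tmp, name_exists, write_ok) are made up for the demonstration.

```shell
tmp=$(mktemp)
exec 3<>"$tmp"      # open it read+write on FD 3
rm -- "$tmp"        # unlink it; the name is gone from the file system

# The name no longer exists...
if [ -e "$tmp" ]; then name_exists=yes; else name_exists=no; fi

# ...but the open file descriptor still accepts data.
if echo "still writable" >&3; then write_ok=yes; else write_ok=no; fi

exec 3>&-           # closing the FD is what finally releases the file
echo "name exists: $name_exists, write via FD 3 succeeded: $write_ok"
```

There is no exit handler and no second rm here.  Once FD 3 is closed (or the shell dies), there is nothing left to clean up.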

The fourth line writes some content via the open file descriptor 3.  At
this point, our unlinked file now has data in it.  Presumably this data
is super private, and we don't want anyone to be able to recover it.
When the script exits, the open file descriptor will close, and the file
system will mark the file's blocks as reusable, but it won't *actually*
reuse them until something else comes along and claims them.  Until
then, the data is still sitting on the disk.  That's the situation shred
is designed for.

The fifth line calls shred(1), instructing it to destroy the content
that's in the unlinked file.  Since the file is unlinked, it has no name,
and therefore shred can't be *given* a name.  However, we have a file
descriptor that points to it.  So, what we *can* do is point standard
output to the file (that's what >&3 does), and then tell shred to destroy
the file that's pointed to by stdout.

Shred will determine the size of the file, then write data to the file,
rewind, write data again, etc.  On a traditional hard drive, that will
overwrite the original private information.  On modern devices (SSDs
with wear leveling, for example), or on file systems that don't
overwrite data in place, it may not.
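
Here is a rough way to watch that happen.  This sketch assumes GNU shred is installed, and it deliberately does NOT unlink the file first, purely so that we can look at the contents by name afterward.  The variable "survived" is hypothetical.

```shell
f=$(mktemp)
exec 3<>"$f"
echo "Hello, world" >&3

# Point stdout at the file and ask shred to overwrite "standard output".
shred - >&3
exec 3>&-

# The original text should no longer be found in the file.
if grep -q "Hello, world" "$f"; then
  survived=yes
else
  survived=no
fi

rm -- "$f"
echo "original text still present: $survived"
```

This only shows that the file's contents, as seen through the file system, were replaced.  It says nothing about what the underlying storage device kept.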

Finally, the sixth line closes file descriptor 3.  Doing this frees the
file's inode and blocks, allowing them to be reused by a future program.

The key to understanding this example is a firm grasp of the shell's
redirection syntax and semantics.

Let's start with what people normally see:  3>filename

That opens "filename" for writing with truncation, and ensures that FD 3
points to it, so the script can do things to it by referencing FD 3.
If the file doesn't exist, it'll be created.  If it does exist, it'll be
truncated.

Next, we have this guy:   3<>filename

This opens "filename" for reading and writing, without truncation.  If
the file exists, it'll be opened, but not truncated.  If the file doesn't
exist, it'll be created.
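
The difference between those two is easy to see with a file that already has something in it.  A minimal sketch, with made-up variable names, using wc -c to report the size in bytes:

```shell
tmp=$(mktemp)
printf 'old content\n' > "$tmp"      # 12 bytes

exec 3<>"$tmp"                       # read+write, no truncation
size_after_rw=$(wc -c < "$tmp")
exec 3>&-

exec 3>"$tmp"                        # write-only, truncates on open
size_after_w=$(wc -c < "$tmp")
exec 3>&-

rm -- "$tmp"
echo "after 3<>: $size_after_rw bytes, after 3>: $size_after_w bytes"
```

The file keeps its content after the 3<> open, and is empty after the 3> open, even though nothing was ever written through FD 3.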

Third, we've got this one:   >&3

This is a shorthand for 1>&3 (the 1 is implied if missing), which is a
file descriptor duplication.  What this does is find where FD 3 is
currently pointing, and make FD 1 *also* point there.

So when we run a command like:    echo "Hello, world" >&3

What that does is find where FD 3 is pointing, make FD 1 (stdout) also
point there, and then run an echo command, which writes to stdout.
The end result is that echo puts some content into wherever FD 3 is
pointing.
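
A short sketch of that, using a named file so the result can be read back.  The names "out" and "content" are only for illustration.

```shell
out=$(mktemp)
exec 3>"$out"                 # FD 3 now points to the file

echo "Hello, world" >&3       # implied 1: stdout is duplicated from FD 3
echo "second line" 1>&3       # the same thing, written out in full

exec 3>&-
content=$(cat "$out")
rm -- "$out"

printf '%s\n' "$content"
```

Both lines end up in the file, because in each case echo's stdout was pointing wherever FD 3 was pointing.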

This duplication is also used in the shred command:   shred - >&3

Again, this command finds out where FD 3 is pointing, makes FD 1 (stdout)
point to the same place, and then runs "shred -" which asks shred
to overwrite the file that's pointed to by stdout.

Finally, we have this guy:   3>&-

This closes file descriptor 3.  It doesn't have to be 3<>&- even though
FD 3 was originally opened for bidirectional access.  3>&- is sufficient
no matter how the FD was opened.
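
If you want to see the effect of the close, here is a minimal sketch.  The "before" and "after" variables are hypothetical, and the shell's error message for the failed redirection is thrown away on purpose.

```shell
tmp=$(mktemp)
exec 3<>"$tmp"                # opened for read+write

# While FD 3 is open, duplicating it onto stdout works.
if { echo "one" >&3; } 2>/dev/null; then before=ok; else before=failed; fi

exec 3>&-                     # close it; 3<&- would do the same

# After the close, there is nothing for >&3 to duplicate.
if { echo "two" >&3; } 2>/dev/null; then after=ok; else after=failed; fi

rm -- "$tmp"
echo "before close: $before, after close: $after"
```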
