Re: cp: odd behaviour; does not preserve symlink

Rob Landley Tue, 16 Jun 2009 13:37:01 -0700

On Tuesday 09 June 2009 13:34:33 Denys Vlasenko wrote:
> On Tue, Jun 9, 2009 at 11:12 AM, Cristian Ionescu-Idbohrn<[email protected]> wrote:
> > this is what the gnu-tools do:
> >
> >        $ mkdir abc
> >        $ cd abc
> >
> > create a dangling symlink:
> >
> >        $ ln -sf /tmp/foo
> >        $ ls -l
> >        lrwxrwxrwx 1 me users 8 Jun  9 10:35 foo -> /tmp/foo
> >        $ ls -l /tmp/foo
> >        ls: cannot access /tmp/foo: No such file or directory
> >
> >        $ cp /etc/resolv.conf foo
> >        cp: not writing through dangling symlink `foo'
> >        $ touch foo
> >        $ ls -l /tmp/foo
> >        -rw-r--r-- 1 me users 0 Jun  9 10:44 /tmp/foo
> >
> > /etc/resolv.conf is written through the symlink:
> >
> >        $ cp /etc/resolv.conf foo
> >        $ ls -l /tmp/foo
> >        -rw-r--r-- 1 me users 125 Jun  9 10:46 /tmp/foo
> >        $ ls -l
> >        lrwxrwxrwx 1 me users 8 Jun  9 10:35 foo -> /tmp/foo
> >
> > --- busybox ----------------------------------------
> >
> > BusyBox v1.14.1 (2009-06-08 17:34:46 CEST) multi-call binary
> >
> >        $ mkdir abc
> >        $ cd abc
> >        # ln -sf /tmp/foo
> >        # ls -l
> >        lrwxrwxrwx    1 root     root            8 Jun  9 08:50 foo -> /tmp/foo
> >        # ls -l /tmp/foo
> >        ls: /tmp/foo: No such file or directory
> >
> > 'cp' does not refuse to write through dangling symlink; overwrites the
> > symlink with a file:
> >
> >        # cp /etc/resolv.conf foo
> >        # echo $?
> >        0
> >        # ls -l
> >        -rw-r--r--    1 root     root           71 Jun  9 08:54 foo
> >
> >
> > 'cp' does not preserve the symlink even if it's _not_ a dangling
> > symlink:
> >
> >        # rm foo
> >        # touch /tmp/foo
> >        # ls -l /tmp/foo
> >        -rw-r--r--    1 root     root            0 Jun  9 08:58 /tmp/foo
> >        # ln -s /tmp/foo
> >        # ls -l
> >        lrwxrwxrwx    1 root     root            8 Jun  9 08:58 foo -> /tmp/foo
> >        # cp /etc/resolv.conf foo
> >        # echo $?
> >        0
> >        # ls -l
> >        -rw-r--r--    1 root     root           71 Jun  9 08:58 foo
> >
> > Anyone else seeing this?
>
> Yes. It's logical. cp *copies files*. IOW: it *creates a copy
> of an existing file*. Copy of a file should be a file.


Denys is on record as wanting the busybox copy command to work differently than 
the usual linux copy command, because he believes surprising developers with 
unexpected behavior somehow has security benefits.

  http://lists.busybox.net/pipermail/busybox/2008-March/064814.html

The busybox cp command won't break a hard link if the destination is a hard
link, but symlinks are treated differently, for no apparent reason.
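
To make the two behaviours concrete, here is a minimal Python sketch. It is
not the busybox or coreutils code, and the function names are made up for
illustration. Opening the destination in place follows a symlink and keeps
hard links intact; unlinking it first always leaves a fresh regular file.

```python
import os

def copy_in_place(src, dest):
    """Write src's bytes into whatever dest resolves to.

    open() follows symlinks and truncates the existing inode, so a
    symlink stays a symlink and hard links keep sharing the data.
    """
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        fout.write(fin.read())

def copy_replace(src, dest):
    """Remove the directory entry first, then create a new regular file.

    lexists() is true even for a dangling symlink, so the link itself
    is removed rather than followed.
    """
    if os.path.lexists(dest):
        os.unlink(dest)
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        fout.write(fin.read())
```

Note that copy_in_place() on a dangling symlink would create the link's
target, which is the case gnu cp refuses with "not writing through dangling
symlink".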

I vaguely recall writing up a long email when this discussion first came up, 
and then giving up and deleting it and just implementing cp in toybox instead.

Personally if I was going to make wildly divergent behavior I'd have a config 
option for it, which Denys could set to his default in defconfig and everybody 
else in the world could override to what people actually expect...

> I know that POSIX and friends do not do that. I do not know
> why they chose to do stupid things and have security risks
> instead of prescribing that cp is a copy operation.

I remember back under Solaris I used to do something like:

  cp /dev/fd0 /dev/fd1

So that isn't a copy operation?  It's not copying files, it's copying disks.  
But it worked.

When you "cp disk.img /dev/hda2" you're copying contents, and the target isn't 
a file, but it's expected to work.  The target can be a named pipe too.

You've also decided that copying the _contents_ of files doesn't count as 
copying, and should be done by the "cat" command.  (In which case it's _not_ 
done by the cat command, it's actually done by shell redirection.)  I note 
that "cat" is short for "concatenate", and thus logically you shouldn't be 
able to use that on only _one_ file...

> If you want to dump bytes into an arbitrary entry in a directory,
> the natural way is "cat >dest".

You can also do it with dd, or tee, or with things like sed and awk and grep 
in NOP mode (yes, binutils ./configure actually _DID_ use awk as a nop copy, 
and it broke us...)

The ability to use any of those isn't an argument for preventing _others_ from
being able to do it.

I tried not to assume I knew better than other people with root access, or 
that 20 years of standard behavior should be changed because obviously people 
can't handle it.  I admit I split catv off from cat, but that was based on a
paper one of the original Unix developers (Rob Pike of Bell Labs) wrote back
in 1983, and boils down to choosing not to implement an option we didn't
already have...

*shrug*  It's Denys's project, I can always patch my own copy.  But I'd like 
to make it clear I disagree with this call.

P.S.  When I did cp in toybox, I found 8 gazillion fun little edge cases, such 
as these:

  http://landley.net/notes-2008.html#18-02-2008

Or this one from Feb 16:

  Ignoring it all and banging on cp at the moment.  Already found a test
  case that drives busybox 1.2.2 nuts.  (Busybox has probably fixed it by now,
  but I'm trying to avoid _having_ that problem to begin with.  The problem
  being that -r automatically assumes -d for all levels past the first, so
  if you do "mkdir sub; ln -s .. sub/sub; cp -r sub sub2" it doesn't fill up
  your hard drive.  Which means that dirtree_add_node() _does_ always need
  to use lstat(), regardless of what the top level logic is doing.)
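
A rough Python illustration of why the tree walk needs lstat() below the top
level. This is a sketch only: list_tree() is a hypothetical stand-in, not
toybox's dirtree_add_node(). With lstat() the "sub/sub -> .." link is
recorded as a link and never descended into.

```python
import os
import stat

def list_tree(path, follow_symlinks=False):
    """Return every path below `path`, depth first.

    By default each entry is examined with lstat(), so a symlink that
    points at a directory is listed but not recursed into.
    """
    found = []
    for name in sorted(os.listdir(path)):
        child = os.path.join(path, name)
        st = os.stat(child) if follow_symlinks else os.lstat(child)
        found.append(child)
        if stat.S_ISDIR(st.st_mode):
            found.extend(list_tree(child, follow_symlinks))
    return found
```

Passing follow_symlinks=True on the "mkdir sub; ln -s .. sub/sub" tree keeps
descending through sub/sub/sub/... until the kernel gives up resolving the
path, which is the runaway case described above.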

or this bit from February 20 talking about security implications:

  And still poking at cp -p, and there's a security hiccup.  There's no
  obvious way to close the hole between create, chown, and chmod.  The problem
  is really directories.  First of all, mkdir is a syscall so we can't
  atomically open it and then get a filehandle to it and be sure the object we
  opened is the one we just created.  Secondly, the directory has to have the
  write bit set in order to create anything _in_ the directory, so if we
  recursively copy a read only directory we have to leave the write bit set,
  and then go back and switch it off later for -p.

  I'd like to be able to use the same codepath for directories as for files,
  and the race-free way to do this for files is with fchmod() and
  fchown()...

  Ok.  I'm cheating.  I can use open(name,0) to get a filehandle to a
  directory (which I can't read from, but which I can call fchown() on). 
  There's still a race condition between directory creation and chown, but
  that's inherent in mkdir not returning a filehandle and I can mitigate it a
  bit so I at least make sure I apply the permissions to a _directory_ and not
  a file.  And a race condition on setting the timestamp doesn't have obvious
  security implications.
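
The approach in those notes can be sketched in Python roughly as follows.
This is an illustration under assumptions, not the toybox code: the function
names are hypothetical, and O_DIRECTORY is an addition that the notes do not
mention (they only describe open(name,0)); it makes the kernel refuse
anything that is not a directory.

```python
import os

def create_dir(path):
    """Create a writable directory and return a descriptor for it.

    The window between mkdir() and open() is still a race, as the notes
    say; O_DIRECTORY only guarantees that what we opened is a directory.
    """
    os.mkdir(path, 0o700)
    return os.open(path, os.O_RDONLY | os.O_DIRECTORY)

def finish_dir(fd, mode, uid=-1, gid=-1):
    """Apply the final owner and mode through the descriptor, then close it.

    Call this only after the directory has been populated, so a
    read-only source directory can still be filled in first.
    A uid or gid of -1 leaves that id unchanged.
    """
    try:
        os.fchown(fd, uid, gid)  # chown first: it may clear setuid/setgid bits
        os.fchmod(fd, mode)
    finally:
        os.close(fd)
```

Because both calls act on the descriptor rather than the name, swapping the
path for something else after the open() does not redirect the chown or
chmod.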

Doing _anything_ as root requires being very careful.  If you want to start 
checking cp for security implications: that's a can of worms. :)

Oh, and -H and -L are _not_ the same thing, they differ when you're using -r.  
-H means follow symlinks in top level files explicitly listed on the command
line, but not symlinks you find in directories you've recursed into.  -L
means follow symlinks both on the command line and in the recursed 
directories.  (Admittedly this could lead to endless loops and fill up your 
drive, which is why the gnu cp has tests for "cannot copy cyclic symbolic 
link".  You could probably just cheese your way out of the edge case and just 
preserve the symlink status of any file with ".." as a path component which
points to a directory...)
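
The -H versus -L rule fits in a few lines. This is a hypothetical helper for
illustration, not code from any cp implementation; depth 0 stands for a path
named on the command line, and "P" stands for cp's -P (never follow).

```python
import os

def should_follow(path, depth, mode):
    """Decide whether a recursive copy should dereference `path`.

    mode is "P" (never follow), "H" (follow command-line arguments
    only) or "L" (follow everywhere). depth is 0 for command-line
    arguments and greater than 0 inside a recursed directory.
    """
    if not os.path.islink(path):
        return False  # nothing to follow
    if mode == "L":
        return True
    if mode == "H":
        return depth == 0
    return False
```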

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds
