> From: Reimar Döffinger [mailto:[email protected]] > > On 24.06.2015, at 01:49, Craig McQueen <[email protected]> > wrote: > > >> From: Peter Korsgaard [mailto:[email protected]] On Behalf Of Peter > >> > >>>>>>> "Peter" == Peter Korsgaard <[email protected]> writes: > >> > >>>>>>> "Matt" == Matt Johnston <[email protected]> writes: > >>>> I see what you mean. I'll update scp to OpenSSH's latest - >> > >>>> there are a > >> few changes to be merged. > >> > >>> FYI, openssh scp has a similar problem here: > >> > >>> open("vendor", O_RDONLY|O_NONBLOCK) = 3 > >>> fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0 > >>> fcntl(3, F_GETFL) = 0x8800 (flags > >> O_RDONLY|O_NONBLOCK|O_LARGEFILE) > >>> fcntl(3, F_SETFL, O_RDONLY|O_LARGEFILE) = 0 > >>> write(6, "C0444 4096 vendor\n", 18) = 18 > >>> read(3, "0x8086\n", 4096) = 7 > >>> read(3, "", 4089) = 0 > >>> write(6, > >> "0x8086\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., > >> 4096) = > >> 4096 > >> > >>> This is with openssh 6.7p1-6. > >> > >> But thinking further about it, I guess that's just an artifact of the > >> protocol and there isn't any way around it. > > > > I guess the question is, should the copied file's size be that reported in > > the > directory listing, or what is actually successfully read from the file? > > > > If I use 'cp' to copy a sysfs file, then the copy contains the bytes > > actually > read. E.g.: > > > > # ls -al /sys/class/net/eth0/address > > -r--r--r-- 1 root root 4096 Jun 23 23:43 > > /sys/class/net/eth0/address > > # cp /sys/class/net/eth0/address . > > # ls -al > > drwxr-xr-x 1 root root 4096 Jun 23 23:43 . > > drwxr-xr-x 1 root root 4096 Jun 23 06:51 .. > > -r--r--r-- 1 root root 18 Jun 23 23:43 address > > > > So with 'cp' I get a file length of 18 (the bytes successfully read), not > > 4096. > > > > As a user, I expect 'scp' to do the same thing. Does the protocol make that > difficult somehow? > > Well, as I understand it the file size is sent first. > > If you cannot trust stat, you'd have read the whole file first and even > buffer it > (in case it changes - though I guess that race condition also exists with > current > stat usage).
So in that case, 'scp' wouldn't work well with: * Files that are appended or truncated while they're being 'scp'ed. * sysfs files. * procfs files. * Named pipes. * char devices. It does sound as though the protocol was designed only with static regular files in mind. That's a pity. > I see a few things that could (and probably should) be done: > 1) if it works better, use lseek to end to get file size instead of stat Not sure if that would work better. > 2) when read fails, zero the buffer (so at least there are only extra 0s, not > 'random' data) That seems important, otherwise you could have a security issue vaguely like POODLE. > 3) if the file size is less than some limit (64kB? - would be easier to make > this > the same as the normal buffer size, 4kB is a bit small anyway) do not rely on > stat but just read the whole file to get the size without race conditions. > Could > be done by just always before anything else reading 64kB and if the read is > short just skip anything else. That would fix some use-cases, but perhaps make other use-case failures more obscure and surprising. > 4) print a warning when read fails (if the protocol allows it) Warnings are good as long as there is a human running it who can read it; not so good if it's a component of a larger software system. > Sorry for the long text, especially as I have no intention to actually work > on it. > Hope someone is motivated to fix it :) It would be great if the situation could be improved. It is certainly best for users to be able to trust essential system tools like scp. Having file size sent in the protocol up-front does sound like an awkward protocol design, which cramps the solution space, unfortunately. -- Craig McQueen
