On Mon, 13 Nov 2017, Warner Losh wrote:
Add notes about overlapping copies.
Add notes to each of these that specifically state that results are
undefined if the strings overlap. In the case of memcpy, we document
the overlapping behavior on FreeBSD (pre-existing). For str*, it is
left unspecified, however, since the default (and x86) implementations
do not handle overlapping strings properly.
PR: 223653
Sponsored by: Netflix
Modified:
head/lib/libc/string/memcpy.3
head/lib/libc/string/strcat.3
head/lib/libc/string/strcpy.3
C standards and POSIX have much better wording.
Modified: head/lib/libc/string/memcpy.3
==============================================================================
--- head/lib/libc/string/memcpy.3 Mon Nov 13 16:53:36 2017
(r325764)
+++ head/lib/libc/string/memcpy.3 Mon Nov 13 17:04:44 2017
(r325765)
@@ -54,6 +54,11 @@ bytes from string
.Fa src
to string
.Fa dst .
+If
+.Fa src
+and
+.Fa dst
+overlap, the results are not defined.
.Sh RETURN VALUES
The
.Fn memcpy
The bugs here are:
- src and dst are pointers. Overlap for pointers makes no sense except
for pointers to single objects but the pointers here are not for that.
Standards use the correct description "objects that overlap".
- style bug: verboseness in the nonsense, by naming the non-objects and
then having to mark up their names. A full description would be much
longer, especially for non-counted copies. For strcpy(), the relevant
obects are the array of char beginning at .Fa src and ending at its
terminating NUL, and the array of char beginning at .Fa dst with the
same number of elements as the source array. Standards omit all details
and say simple "objects". memcpy() only has a count so it is simpler.
strncpy() has a count in addition to the NUL terminators, so it is more
complicated.
- use of the wrong term "results". The correct term is "behavior". Results
are are worst implementation-defined in Standards, since it is almost
impossible for a result to be undefined without the whole behaviour also
being undefined. These functions have only 1 result. In Standard wording,
function results are return values, not the whole behavior of the function.
- use of the non-technical term "not defined". Standards use the technical
term "undefined".
The Standards wording is "If copying takes place between objects that
overlap, the behaviour is undefined".
Modified: head/lib/libc/string/strcat.3
Modified: head/lib/libc/string/strcpy.3
- as above, except "undefined" is spelled correctly
Grepping for "overlap" in draft C99 shows the following other functions
with overlap giving undefined behavior:
snprintf, sprintf, sscanf, vsnprintf, vsprintf,
mbstowcs, wcstombs,
strxfrm, strftime.
all functions in <wchar.h> where overlap might occur, except when
explicitly specified (seems to be only explicitly specified for
wmemmove)
The term "overlap" occurs in 193 section 3 man pages in -current (counting
links as 1 each). It isi still never used for any of the above functions.
It is used in the following related man pages:
- bcmp.3. This use is nonsense. Overlap doesn't cause any problems for
bcmp().
- bcopy.3 and memmove.3. This use is not nonsense, but its wording is much
worse than in Standards. It says "The .Fn function copies .Fa len bytes
from string .Fa src to string .Fa dst. The two strings may overlap.".
But these functions don't even copy strings, and "may overlap" gives no
details. Standards say "The .Fn function copies .Fn n characters from
the object pointed to be .Fa s2 into the object pointed to by .Fa s1.
[Complicated 5-line sentence saying that the copying acts as if it is
done through a temporary object that doesn't overlap.]". I think standards
say somewhere, but man pages say nowhere, that copying between
non-overlapping objects has the defined behaviour of copying the bytes
(u_char's) it is done by assignment of bytes or a function like memmove()
(copying of general objects is more complicated since it may skip padding
or otherwise fail to preserve the representation). This is needed so that
the 5-line sentence doesn't need to be much longer.
- memcpy.3. It was already documented in the BUGS section that memcpy() is
implemented as bcopy() and therefore the "strings" "may overlap" [then
portability considerations due to this]. This is buggier than before:
- old bug: only the MI implementation of memcpy() clearly implements
memcpy() as bcopy(). The amd64, i386 and arm MD implementations
do the same. mips is backwards for bcopy() and implements it as
memmove(). mips has 2 separate memcpy*.S files, one for plain arm
and one for xscale. The plain arm one is smaller than memmove.S,
so presumably doesn't support overlapped copies. The xscale one
is much larger, so has space for overlapped copies and many optimizations
that no longer work. I don't know what it does. (The i386 and amd64
ones have a few small optimizations that stopped working 25 years ago.
At least they are short.)
- new bug: the addition is inconsistent with the BUGS section.
- strlcpy.c. This uses a combination of the C99 wording for strcpy() and
buggy wording involving the pointer names (for both lcat and lcpy):
If the .Fa src and .Fa dst strings overlap, the behavior is undefined.
Now it is even better to not give the details about the objects, since
at least for dst, the object isn't the string pointed to by dst --
it is the char array of length MIN(dstsize, strlen(dst) + 1). Similarly
for old strncat() and strncpy() and not so old snprintf(). For the
standard 'n' functions, the object is clearly limited by the dstsize arg
and there is no problem in theory or practice if the dst string overlaps
beyond the part that is written to, so it is just a bug in the
specification = man page of the 'l' functions to have a stricter limit
on overlap.
Grepping for "overlap" in draft POSIX.1-2001 shows the following additional
functions with overlap giving undefined behavior:
bcopy: POSIX duplicates the C99 wording for memcpy, but for bcopy it
has much the same bugs as this change: it uses "the area pointed to
be .Fa s1", etc.
memccpy, swab
independent details for functions in <wchar.h> (but POSIX combines the
details for at least sprintf() and snprintf()).
Bruce
_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"