On Wed, Jul 05, 2017 at 07:12:28PM -0400, Ted Unangst wrote:
> Olivier Antoine wrote:
> > Hi all,
> >
> > Recently a bug has been identified in Tor:
> >
> > https://trac.torproject.org/projects/tor/ticket/22789
> >
> > As comments were made, questions were raised about the use of strtol(3),
> > the different interpretations of the standard and their implementation.
> >
> > To summarize, the question revolves around the processing of strings in
> > base=16 and with the optional prefix '0x'.
> >
> > l = strtol("0xquux", &rest, 16);
> >
> > Produce
> > l=0 rest=0xquux on OpenBSD
> > l=0 rest=xquux on Linux
> >
> > Do specialists of the standard or developers have an opinion on this point
> > of detail?
> > Is there a defined behavior?
>
> My opinion is that well written code would avoid feeding ambiguous strings to
> strtol. Today it's 0xquux and tomorrow it's 0xaquux and now you have a
> problem.
>
> But, let's read
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/strtol.html
>
> It's actually unclear IMO. But I don't see anything prohibiting interpreting
> the string as an optional prefix with an empty body.

Well:

    The functionality described on this reference page is aligned with the
    ISO C standard. Any conflict between the requirements described here and
    the ISO C standard is unintentional. This volume of POSIX.1-2008 defers
    to the ISO C standard.

This is Sparta^WISO C we're talking about. The wording of POSIX is actually
irrelevant.

ISO C 99 is waaays clearer.

7.20.1.4 (3)
    If the value of base is zero, the expected form of the subject sequence
    is that of an integer constant *as described in 6.4.4.1*, optionally
    preceded by a plus or minus sign but not including an integer suffix [...]

7.20.1.4 (4)
    The subject sequence is defined as the longest initial subsequence of
    the input string [...] *that is of the expected form*.

6.4.4.1 is a grammar

    integer-constant:
        [...]
        hexadecimal-constant integer-suffix_opt

    hexadecimal-constant:
        hexadecimal-prefix hexadecimal-digit
        hexadecimal-constant hexadecimal-digit

There is no wiggle room there. That grammar is explicit that there must be
at least one hexadecimal digit after the prefix.

By that reading, "0x" in "0xquux" cannot be taken as a prefix: the subject
sequence is just "0", so l=0 and rest should point at "xquux".
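
For code that does not want to depend on which libc it runs against, here is
a minimal sketch of a parser that follows the grammar above by hand instead of
calling strtol(3). The name parse_hex_iso() and its interface are hypothetical,
made up for this illustration. It is deliberately narrower than strtol: no
leading whitespace, no sign, non-negative values only.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <string.h>

/*
 * parse_hex_iso() is a hypothetical helper for illustration only; it is
 * not part of any libc. It consumes an optional "0x"/"0X" prefix only
 * when at least one hexadecimal digit follows it, then reads the digits.
 *
 * Returns 0 on success, storing the value in *out and a pointer to the
 * first unconsumed character in *rest. Returns -1 if the string does not
 * start with a hexadecimal digit or the value does not fit in a long.
 */
int
parse_hex_iso(const char *s, long *out, const char **rest)
{
    static const char digits[] = "0123456789abcdef";
    const char *p = s;
    long v = 0;

    assert(s != NULL && out != NULL && rest != NULL);

    /* Take the prefix only if a hex digit comes right after it. */
    if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X') &&
        isxdigit((unsigned char)p[2]))
        p += 2;

    /* No digit at all: nothing of the expected form. */
    if (!isxdigit((unsigned char)*p))
        return -1;

    while (isxdigit((unsigned char)*p)) {
        /* Digit value is its index in the lowercase digit table. */
        int d = (int)(strchr(digits, tolower((unsigned char)*p)) - digits);

        /* Refuse to overflow rather than clamp. */
        if (v > (LONG_MAX - d) / 16)
            return -1;
        v = v * 16 + d;
        p++;
    }

    *out = v;
    *rest = p;
    return 0;
}
```

With this sketch, "0xquux" yields 0 with rest pointing at "xquux", and
"0xaquux" yields 10 with rest pointing at "quux", whatever the platform's
strtol does. Rejecting a bare "0x" outright would be an equally defensible
choice for callers who prefer Ted's advice of never accepting ambiguous input.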