PR lib/57828 (https://gnats.NetBSD.org/57828) proposes changing an
edge case of the strtoi(3) function so that if ERANGE and ENOTSUP are
both applicable then it should fail with ERANGE rather than ENOTSUP.

This is the case when the number is out of range, and there's trailing
garbage -- e.g., s="42z", min=3, max=7.


The rationale is that the caller can trivially detect the ENOTSUP
condition (test end[0] != '\0'), but not the ERANGE condition (short
of parsing the number again, which would defeat the purpose).  And
applications may find the out-of-range case more significant than the
trailing garbage case.

For example, libutil's pidfile_read(3) tries to parse a pid on a
single line, so any file whose content matches /[0-9]+\n/ with the
resulting number in the interval [1, INT_MAX] is accepted, and one
might hope that anything else should be rejected:

https://nxr.netbsd.org/xref/src/lib/libutil/pidfile.c?r=1.16#132

But if the pidfile contains, e.g., `999999999999999999999999999\n',
pidfile_read(3) as implemented will accept this -- and interpret it as
pid 2147483647.  If it's `-999999999999999999999999999\n',
pidfile_read(3) will accept it as pid 1.


Counterarguments:
- We've had this semantics since NetBSD 7, so changing it is risky --
  we'd have to audit all the existing code that uses it.
- There are copies of the code in various places and they might
  diverge.
- I searched github for /[^a-zA-Z0-9_]strtoi[^a-zA-Z0-9_]/ and came up
  with almost 16k results, so auditing that myself isn't going to
  happen unless we can narrow it down a lot.
- If you need to act on the trailing stuff, maybe it's better to just
  use a real lexer instead of relying on edge cases of error cases.
- If you rely on this change, your code won't work in any of the
  deployed implementations of strtoi anyway.


Thoughts?

The attached patch implements the proposed change in libc and the
automatic tests.
>From a64349a5ac209d9967ba3c5d6a62aa243f44f603 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell <riastr...@netbsd.org>
Date: Sat, 20 Jan 2024 18:17:27 +0000
Subject: [PATCH 1/2] strtoi(3): Test for ERANGE before ENOTSUP.

Expect failure because that's not what we implemented.

PR lib/57828
---
 tests/lib/libc/stdlib/t_strtoi.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/tests/lib/libc/stdlib/t_strtoi.c b/tests/lib/libc/stdlib/t_strtoi.c
index 3301058a5be1..093b71abe04a 100644
--- a/tests/lib/libc/stdlib/t_strtoi.c
+++ b/tests/lib/libc/stdlib/t_strtoi.c
@@ -313,14 +313,9 @@ ATF_TC_BODY(strtoi_erroredges, tc)
                { "+5z",         5, 0, "z", 3, 7, ENOTSUP },
                { "+6z",         6, 0, "z", 3, 7, ENOTSUP },
                { "+7z",         7, 0, "z", 3, 7, ENOTSUP },
-               /*
-                * PR lib/57828 suggests these should yield ERANGE, not
-                * ENOTSUP, because ENOTSUP can be discovered
-                * separately by the caller anyway.
-                */
-               { "-42z",        3, 0, "z", 3, 7, ENOTSUP }, /* XXX ERANGE */
-               { "2z",          3, 0, "z", 3, 7, ENOTSUP }, /* XXX ERANGE */
-               { "42z",         7, 0, "z", 3, 7, ENOTSUP }, /* XXX ERANGE */
+               { "-42z",        3, 0, "z", 3, 7, ERANGE },
+               { "2z",          3, 0, "z", 3, 7, ERANGE },
+               { "42z",         7, 0, "z", 3, 7, ERANGE },
        };
 
        intmax_t rv;
@@ -330,6 +325,9 @@ ATF_TC_BODY(strtoi_erroredges, tc)
 
        for (i = 0; i < __arraycount(t); i++) {
 
+               if (t[i].rstatus == ERANGE)
+                       atf_tc_expect_fail("PR lib/57828");
+
                errno = 0;
                rv = strtoi(t[i].str, &end, t[i].base, t[i].lo, t[i].hi, &e);
 

>From ee1b6083ea09f569e95ae5137b5fe870beb9d017 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell <riastr...@netbsd.org>
Date: Sat, 20 Jan 2024 18:19:34 +0000
Subject: [PATCH 2/2] strtoi(3): Return ERANGE before ENOTSUP if both are
 applicable.

Caller of strtoi can trivially detect the ENOTSUP case themselves by
checking whether end[0] == '\0', but they can't trivially detect the
ERANGE case (short of doing their own parsing which would defeat the
purpose of using strtoi(3)).

PR lib/57828
---
 common/lib/libc/stdlib/_strtoi.h | 9 ++++++---
 tests/lib/libc/stdlib/t_strtoi.c | 3 ---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/common/lib/libc/stdlib/_strtoi.h b/common/lib/libc/stdlib/_strtoi.h
index e6b0ce778b60..d48d58350050 100644
--- a/common/lib/libc/stdlib/_strtoi.h
+++ b/common/lib/libc/stdlib/_strtoi.h
@@ -104,9 +104,6 @@ INT_FUNCNAME(_int_, _FUNCNAME, _l)(const char * __restrict 
nptr,
                /* No digits were found */
                if (nptr == *endptr)
                        *rstatus = ECANCELED;
-               /* There are further characters after number */
-               else if (**endptr != '\0')
-                       *rstatus = ENOTSUP;
        }
 
        if (im < lo) {
@@ -120,6 +117,12 @@ INT_FUNCNAME(_int_, _FUNCNAME, _l)(const char * __restrict 
nptr,
                return hi;
        }
 
+       if (*rstatus == 0) {
+               /* There are further characters after number */
+               if (**endptr != '\0')
+                       *rstatus = ENOTSUP;
+       }
+
        return im;
 }
 
diff --git a/tests/lib/libc/stdlib/t_strtoi.c b/tests/lib/libc/stdlib/t_strtoi.c
index 093b71abe04a..ee9b5a3e899f 100644
--- a/tests/lib/libc/stdlib/t_strtoi.c
+++ b/tests/lib/libc/stdlib/t_strtoi.c
@@ -325,9 +325,6 @@ ATF_TC_BODY(strtoi_erroredges, tc)
 
        for (i = 0; i < __arraycount(t); i++) {
 
-               if (t[i].rstatus == ERANGE)
-                       atf_tc_expect_fail("PR lib/57828");
-
                errno = 0;
                rv = strtoi(t[i].str, &end, t[i].base, t[i].lo, t[i].hi, &e);
 

Reply via email to