Hi Branden,

On 2026-02-22T15:32:07-0600, G. Branden Robinson wrote:
[...]
> > BTW, off-topic, but I do have an nprintf() function that is quite
> > different from that.  It calculates the length of the formatted string,
> 
> Please name it something else.  There's already too much confusion
> around the presence of "n" in string-handling functions (not just in
> libc, but in curses, too), and it _usually_ indicates the imposition of
> a limit, not a request for measurement.

I agree with string-handling functions, and thus with snything called
sn*() or strn*().  However, an 'n' by itself is not associated to
strings or nonstrings.

The reason I chose the name with an n is the usual -n flag for no-op
runs of commands, and also because n happens to be a usual name for a
length.

If you propose a better name for it, I might take it.  But I really like
nprintf().

Another thing to keep in mind is that strn*() functions have absolutely
no relation to sn*() functions (of which only snprintf(3) exists).
strn*() functions work with nonstrings, while sn*() functions work with
strings.

You may be interested in a function I'll try to get into gnulib
eventually.  It's stprintf(), where the t stands for truncation.
Here's the implementation:

        int
        stprintf(ssize_t size;
            char s[restrict size], ssize_t size, const char *restrict fmt, ...)
        {
                int      len;
                va_list  ap;

                va_start(ap, fmt);
                len = vstprintf(s, size, fmt, ap);
                va_end(ap);

                return len;
        }

        int
        vstprintf(ssize_t size;
            char s[restrict size], ssize_t size, const char *restrict fmt, 
va_list ap)
        {
                int  len;

                if (size == 0)
                        abort();

                len = vsnprintf(s, size, fmt, ap);
                if (len == -1)
                        return -1;
                if (len >= size) {
                        errno = E2BIG;
                        return -1;
                }

                return len;
        }

It is essentially the same as snprintf(3), but returns -1 on truncation
or error, which is much less error-prone, and easier to test.  It
differentiates truncation from error with E2BIG.

> [much snippage]
> > You personally.  By adding explicit casts, you've stated "I'm aware
> > that [this function] has a return value, and that I don't need it".
> > However, now you claim to not have written that code, and thus not
> > really be in a position to do that claim.  It's contradicting
> > information.  That's why I think it would be more prudent to not make
> > the claim.
> 
> Now that you cite the specific commit, I see what you're talking about.
> 
> [much snippage]
> > While the title mentions printf(3), I really meant printf(3) family,
> > which is what your commit says; it was an unfortunate and incorrect
> > abbreviation of the title.
> 
> Agreed.  You copied my bad example from a commit message.
> 
> > Your commit *does* cover snprintf(3):
> > 
> >     $ git show 4b7a0fe5ab5d | grep snprintf
> >     -  snprintf(encoding, sz, "%s-%s", parsed.CharSetRegistry,
> >     +  (void) snprintf(encoding, sz, "%s-%s", parsed.CharSetRegistry,
> 
> Agreed, and that was a mistake.
> 
> _Every_ other instance in that commit...
> 
> $ COLUMNS=72 git show --stat 4b7a0fe5ab5d
> commit 4b7a0fe5ab5d5155bd499cf9506a91a1f4bc0125
> Author: G. Branden Robinson <[email protected]>
> Date:   Sat Dec 6 18:26:02 2025 -0600
> 
>     src/utils/xtotroff/xtotroff.c: Fix code style nit.
> 
>     * src/utils/xtotroff/xtotroff.c:
>       (CanonicalizeFontName, FontNamesAmbiguous, MapFont, main): Explicitly
>       cast unused return values of printf(3)-family functions to `void`.
> 
>  ChangeLog                     |   7 +++
>  src/utils/xtotroff/xtotroff.c | 102 ++++++++++++++++++----------------
>  2 files changed, 61 insertions(+), 48 deletions(-)
> 
> ...was a change to a stdio `FILE`-oriented *printf.

Yup.

> 
> I wonder how susceptible LLM-based code generators are to similar
> mistakes.
> 
> I don't remember making this decision consciously.  Possibly I went into
> robotic replacement mode.  I see so much that I want to improve in groff
> that I sometimes lull myself into a coal-shoveling stupor.  It's not
> good.  When I feel the drudge setting in, I should stop and (a)
> decompose the problem and/or (b) write a sed script to do the work for
> me.  (b) might actually take longer when tuned to get things _just
> right_, but can be more challenging and fun.
> 
> Take away: I need to revisit that change when groff is unfrozen.  I
> don't see a _regression_ here because the return value of `snprintf()`
> already wasn't being checked.  The bug here is that I wrongly annotated
> it as being _deliberately_ discarded, and that conclusion is not
> warranted on the available evidence.
> 
> So let me amend the rule(s) we're arguing about.

I still disagree with your updated rules.  I'll reorder them for
replying.

> * Discard of s*printf() functions' return values should, by default, be
>   avoided because they _can_ lead to undesired string truncation and
>   even security problems as a result.[1]

Agree.

>   _If_ discard is done, it
>   _should_ be with an explicit cast to void _and_ a comment explaining
>   why string truncation is not a hazard or not of consequence.

The cast is still dangerous, and both the cast and the comment are prone
to become obsolete at some point in the future.  When digging into old
bugs, I often read the commit message that changed/introduced the code,
as that's the thing that's most likely to match the code at that point
in time.  I tend to ignore comments, unless they were introduced in the
same commit.

Consider the case where in the future truncation stops being okay, but
there's a cast and a comment that claims so.  And the person who wrote
those is not there to answer questions anymore.  What then?  I'd not
cast nor comment, ever.  Just write a note in the commit message, and
trust that people will use git-blame(1), or they will actually blame
you (I will).  :)

> * Another possibility is to not _ever_ discard an s*printf() function's
>   return value, but store it and then immediately use it in an
>   `assert()`, which is better even than a documentary comment.

If you want to assert that it never truncates, I think you should use
sprintf(3) with _FORTIFY_SOURCE at its highest level.  That's simpler
and just as robust.  _FORTIFY_SOURCE has run-time costs, but as we
agree, correctness is more important.

If you'll insist on not using sprintf(3), then you may find my
stprintf() to be nicer than snprintf(3), as the -1 is easier to assert:

        assert(-1 != stprintf(...));

> * I'm not sure what's best if some smarmy jerk #defines `NDEBUG`.

You could preclude that, by undefining NDEBUG immediately before
including <assert.h>, *and* including <assert.h> as the last header
always.

        #undef NDEBUG
        #include <assert.h>

> * I still think it's fine to explicitly cast {f,}printf() return values
>   to void, to document that one doesn't care if output to the stream got
>   truncated.  Often, an application is helpless in that scenario anyway.
>   (An analogous situation is: "what can I do if fclose(3) fails?")

I strongly disagree with this.  As we've seen, it's easy to accidentally
put the cast in a function that shouldn't have it.  And since casts mean
"trust me, bro, I know I don't need this", how is a reader in a position
to question a specific statement?  How do we know if you made an
accident (like in this case) vs a case where you really knew what you
were doing?

Casts turn off most compiler diagnostics, and that's bad.  My rule is:
a good program shall have zero casts.

I do have exceptions for that rule:

-  It's okay to cast a literal 0 to void, to create a void expression,
   useful in some macros.  For example:

        #define typeas(T)  typeof((T){})

        // qsort_T - sort type-safe
        #define qsort_T(T, ...)         qsort_T_(typeas(T), __VA_ARGS__)
        #define qsort_T_(T, a, n, cmp)  do                            \
        {                                                             \
                _Generic(a, T *: (void)0);                            \
                qsort(a, n, sizeof(T), cmp);                          \
        } while (0)

-  It's not okay, but may be necessary, to add casts to interface with
   APIs that are not const-correct, such as strtol(3).

        #define a2i(T, n, s, endp, base, min, max)                    \
        ({                                                            \
                T            *n_ = n;                                 \
                QChar_of(s)  **endp_ = endp;                          \
                T            min_ = min;                              \
                T            max_ = max;                              \
                                                                      \
                int  status;                                          \
                                                                      \
                *n_ = _Generic((T){},                                 \
                        short:              strtoi_,                  \
                        int:                strtoi_,                  \
                        long:               strtoi_,                  \
                        long long:          strtoi_,                  \
                        unsigned short:     strtou_noneg,             \
                        unsigned int:       strtou_noneg,             \
                        unsigned long:      strtou_noneg,             \
                        unsigned long long: strtou_noneg              \
                )(s, const_cast(char **, endp_), base, min_, max_, &status);\
                                                                      \
                if (status != 0)                                      \
                        errno = status;                               \
                -!!status;                                            \
        })

   I wrap these with a const_cast() macro, which provides some safety
   guarantees, and also allows me to grep(1) them all easily.

        #define const_cast(T, p)  _Generic(p, const T: (T) (p), default: (p))

No other exceptions are acceptable, IMO.


Cheers,
Alex

> Regards,
> Branden
> 
> [1] https://cwe.mitre.org/data/definitions/222.html



-- 
<https://www.alejandro-colomar.es>

Attachment: signature.asc
Description: PGP signature

Reply via email to