Hi Bruno,

On 2026-02-25T08:09:46+0100, Bruno Haible wrote:
> Hi Alejandro,
> 
> > > Oh wait. Should we have called the new function stpnul() instead of 
> > > strnul()?
> > > Because it returns a pointer.
> > 
> > ... We have other
> > existing APIs that only have a variant that return an offset pointer,
> > and their name is str*(), strchr(3) being the canonical example.
> > stp*() would only be necessary where there's the two variants with the
> > same name.
> 
> OK for keeping strnul then.
> 
> Among the str* functions that return a pointer (strchr, strrchr, strdup,
> strpbrk, strstr), there is in particular strpbrk, which can be defined as
> 
>   char *
>   strpbrk (const char *s, const char *char_set)
>   {
>     size_t n = strcspn (s, char_set);
>     return s[n] ? (char *) &s[n] : NULL;
>   }

I was wondering if I should propose adding an alias for strpbrk(3).
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3670.txt>

Plan9 has a synonym called strchrs(), and that is indeed a better name
for strpbrk(3), since it's just a plural version of strchr(3).

> 
> So, as I understand it,
>   - stpcspn would be a variant of strpbrk,

Actually, stpcspn() is a plural variant of strchrnul(3).  We could call
it strchrsnul(), if we were deriving it from that family.

The following calls are equivalent:

        strchrnul(s, 'x');
        stpcspn(s, "x");

>   - but stpspn is not such a variant.

stpspn() would not be a variant.  Unless we'd call it strcchrsnul(), but
the name becomes quite unreadable.
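
For comparison, a possible stpspn() could be sketched like this.  It is
not an existing libc function; the my_ prefix marks it as a hypothetical
helper built on strspn(3).

```c
#include <string.h>

/* Hypothetical: strspn(3), but returning a pointer instead of a length.
   It skips the leading run of bytes that ARE in accept. */
static char *
my_stpspn(const char *s, const char *accept)
{
        return (char *) s + strspn(s, accept);
}
```

A typical use would be skipping leading whitespace, as in
my_stpspn(line, " \t").  There is no strchr-like function it could be
named after, since it stops at the first byte that is not in the set.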

> 
> * If we add only the stpspn function, we'll have tricky-to-remember
>   differences between stpspn and strpbrk.

Maybe by having the synonym strchrs() for strpbrk(3) we could help
remember what strpbrk(3) does, which is essentially the same as
strchr(3), but with several characters.

Something like this:

        SYNOPSIS
             #include <string.h>

             char *strchrs(const char *s, const char *accept);
             char *strpbrk(const char *s, const char *accept);

        DESCRIPTION
             The strchrs() function locates the first occurrence in the
             string s of any of the characters in the string accept.
             It's similar to strchr(3), but can search for several
             characters.

        RETURN VALUE
             The strchrs() function returns a pointer to the character
             in s that matches one of the bytes in accept, or NULL if no
             such character is found.

        ...

        HISTORY
             strpbrk() is an old name for the same function.  It was
             renamed, and the old name is kept for compatibility.

> * Whereas if we add both the stpspn and stpcspn functions, we'll have
>   two functions stpcspn, strpbrk that are merely variants of each other.

It would be like having strchr(3) and strchrnul(3).  It's not a problem,
as long as the names make it obvious what they do.  I think adding
strchrs() and updating the manual page for strpbrk(3) would help with
that.
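
The relationship can be sketched as follows.  Both helpers are
hypothetical (neither strchrs() nor stpcspn() is in ISO C or POSIX), and
the my_ prefix is only there to avoid clashing with any system that does
provide them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alias for strpbrk(3), named after Plan9's strchrs(). */
static char *
my_strchrs(const char *s, const char *accept)
{
        return strpbrk(s, accept);
}

/* Hypothetical: same search, but returns a pointer to the terminating
   null byte instead of NULL when nothing matches. */
static char *
my_stpcspn(const char *s, const char *reject)
{
        return (char *) s + strcspn(s, reject);
}
```

The two agree whenever a match exists, and differ only in the not-found
case, which is the same split as between strchr(3) and strchrnul(3).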

> Neither of which is super desirable.
> 
> I'm thinking we should therefore leave the current asymmetric situation
> (strspn, strcspn, strpbrk) alone. Especially since either case can also
> be open-coded with a 'for' loop.

Open-coding these is quite dangerous.  It forces you to type more, which
can lead to accidents.  In fact, I've seen such an accident this week,
from an OpenBSD contributor, which is what reminded me to propose it here.

OpenBSD doesn't have strchrnul(3), and so when they see a project using
it, they replace it by a similar function they do have.  They replace

        foo = strchrnul(bar, 'x');
by
        foo = bar + strcspn(bar, "x");

But the patch had a typo, and read:
        foo = foo + strcspn(bar, "x");

That mistake:

-  Is actually quite easy to make.
-  Is quite hard to notice, unless you know what you're looking for.
-  Can't be diagnosed by the compiler, because foo and bar have the same
   type.

I think that alone justifies adding these APIs, and it also suggests
that OpenBSD might be taking risks that I wouldn't want to take.
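
A minimal sketch of why a pointer-returning helper avoids this class of
typo: the string is named only once per call.  Both my_stpcspn() and
strip_comment() are hypothetical names introduced for illustration, not
existing APIs.

```c
#include <string.h>

/* Hypothetical: strcspn(3), but returning a pointer instead of a length. */
static char *
my_stpcspn(const char *s, const char *reject)
{
        return (char *) s + strcspn(s, reject);
}

/* Truncate line at the first '#'; return a pointer to the new end. */
static char *
strip_comment(char *line)
{
        char *end;

        /*
         * The open-coded form names the string twice, so one of the two
         * can be mistyped without the compiler noticing:
         *         end = line + strcspn(line, "#");
         */
        end = my_stpcspn(line, "#");
        *end = '\0';
        return end;
}
```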

> And just 5 opportunities for stpspn()
> in gnulib/lib/ — that does not seem worth the trouble (for users) of
> learning and remembering this function.

In shadow utils we have (measured some time ago):

        $ grep -rn 'strspn *(' | grep -v -e lib/string/strspn/ | wc -l
        11
        $ grep -rn 'stpspn *(' | grep -v -e lib/string/strspn/ | wc -l
        10
        $ grep -rn 'strcspn *(' | grep -v -e lib/string/strspn/ | wc -l
        0
        $ grep -rn 'stpcspn *(' | grep -v -e lib/string/strspn/ | wc -l
        0
        $ grep -rn 'strrspn_ *(' | grep -v -e lib/string/strspn/ | wc -l
        0
        $ grep -rn 'stprspn *(' | grep -v -e lib/string/strspn/ | wc -l
        4
        $ grep -rn 'strrcspn *(' | grep -v -e lib/string/strspn/ | wc -l
        0
        $ grep -rn 'stprcspn *(' | grep -v -e lib/string/strspn/ | wc -l
        1


Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es>
