Hi Bruno,

On Fri, Sep 19, 2025 at 03:12:52PM +0200, Bruno Haible wrote:
> Hi Alejandro,
>
> > > [CCing sc22wg14]
> >
> > [I've removed WG14, as I think planning outside of their sight can be
> > more productive.]
>
> That's OK. I just wanted to make the committee aware that good naming is
> important and that the names in the current proposal are not good naming,
> IMO.

They didn't receive your mail. The list is members-only. But I'll
reflect this in the proposal when I present it.

> > Since gnulib would be the first implementor, and my proposal won't be
> > voted until February or March, gnulib has the chance to influence the
> > name.
> >
> > If you implement them as XXXprefix() and XXXsuffix(), I'll change my
> > proposal, due to prior art.
> >
> > The committee, a priori, would refuse such common names, due to fears of
> > breaking much existing code. However, if major existing libc
> > implementations use them, it might turn the votes.
> >
> > If you start with strprefix()/suffix(), and then glibc may follow, we
> > have two of the major libc implementations. musl and Bionic would
> > probably follow for compatibility.
> >
> > But this must come from implementations. The committee will not help.
>
> Thanks for the insight. (I don't have much experience with WG14.)

Think of the committee as a group of people that live in an Ivory Tower
and prefer not fixing mistakes, for fear of the image that admitting
mistakes would give to users. It's actually not a bad model.

> The major reasons that speak for the name 'str_startswith', as used
> in Gnulib, are:
>   - The term starts-with is already used for this purpose in
>     10 out of 15 programming languages. Find attached a summary from
>     the prior knowledge summarization engine.
>   - The prefix 'str' shows that it's for 'char *' strings and allows
>     a similar function with prefix 'wcs' for 'wchar_t *' wide strings.

The second point isn't exclusive to str_startswith. stRprefix() has the
same property.

The etymology for the name stPprefix() comes from the stp* (POSIX; e.g.,
stpcpy(3)) and memp* (GNU, BSD; e.g., mempcpy(3)) families of functions,
which are string functions that return an offset pointer. Since these
return an offset pointer, the stp* prefix is quite appropriate for
stpprefix() and stpsuffix().

About names from other languages, I don't necessarily like that reasoning.
We almost had _Countof() named _Lengthof() because committee members
wanted that name from other languages which have it as length. That
would have been harmful, as it would have mixed sizes and lengths in
string-handling code, promoting off-by-one bugs. In general, I'd take
other languages with a pinch of salt.

Consistency with libc is important too. If I see str_startswith(), I
would guess it's a project-specific API. If I see strprefix(), I guess
it's a libc API. That's because the reserved prefix is str* followed by
lowercase. And even without the reservation, we have no such names in
string.h, so instinct also plays a role.

> Therefore, I think we should stay with that for Gnulib. No 'strprefix'
> or such.
>
> Then, let's look for a readable and pronounceable name for the variant
> that returns a pointer.
>
>   - I tried thinking in terms of parsing, i.e. "str_parse_prefix",
>     but what about the one with suffix then? "str_backwardparse_suffix"?
>     Backward parsing is rarely seen in code.

That induces confusion. Does the function reverse the string as if by
rev(1) before searching?

>   - How about the names 'str_prefix_end' and 'str_suffix_start'?
>     It's descriptive.
>     It's pronounceable.
>     The names imply that they return a pointer.
>     The names are not very long, compared to what we already find in ISO C:
>
>       str_prefix_end
>       str_suffix_start
>       fegetexceptflag
>       fe_dec_getround
>       fmaximum_mag_num
>       decodebind128
>       atomic_compare_exchange
>       atomic_flag_test_and_set
>       atomic_flag_test_and_set_explicit
>       stdc_first_leading_zero_ull
>       memset_explicit

However, we also need to compare them to <string.h>. They're part of
the strchr(3) and strstr(3) family of functions, as they have similar
semantics (although they also are related to streq() in some sense).
Having consistency with their family is important, IMO.

> They are not used at all [1][2] in existing code, so not a hindrance
> for WG14.
>
> Bruno
>
> [1] https://codesearch.debian.net/search?q=str_prefix_end&literal=1
> [2] https://codesearch.debian.net/search?q=str_suffix_start&literal=1

On Fri, Sep 19, 2025 at 03:14:52PM +0200, Bruno Haible wrote:
> Paul Eggert wrote:
> > Not sure I like having bool variants that are trivial wrappers for the
> > pointer variants, though. Life is already complicated enough.
>
> Unlike Paul, I don't mind having two different functions
>   - str_startswith, that returns 'bool',
>   - str_prefix_end, that returns 'char [const] *'.
> They have different purposes. That justifies the different names.
>
> Having only the function that returns the pointer and telling the
> programmers to use this function when in fact they want a 'bool'
>
>   * either leads to code like
>        if (str_prefix_end (str, p) != NULL)
>     which is longer and less expressive than
>        if (str_startswith (str, p))

In shadow utils, I ended up using implicit conversions for if()
conditionals:

    if (strchr(...))
    if (!strchr(...))

and for while() loop conditions we do this:

    while (NULL != (p = strchr(...)))

It's quite readable and relatively consistent.

>   * or introduces implicit conversions from pointer to bool
>        if (str_prefix_end (str, p))
>     which some people abhor.

On the one hand, yeah, I can agree. On the other hand, we have
strchr(3), strpbrk(3), strstr(3), and all that family of functions. We
don't have bool-returning variants of those. There's just a
boolean-like pointer function, and users need implicit conversions.
Since strprefix() and strsuffix() are essentially part of that same
family, consistency is an important factor.

One benefit of these dual-purpose functions is that you learn one
function only, but have both functionalities.

If there were two variants, and the one returning an offset pointer
were stp*, then the similar names would let programmers just remember
that stp* is the one with an offset pointer, and the other one is the
basic one.
But entirely different names are cognitively problematic.

Here I agree with Paul, and prefer a single API. I am not sure if I
would have preferred libc to have separated the strchr(3) and strstr(3)
family of functions into bool-returning str* functions and
offset-pointer-returning stp* variants. But given the status quo, I'd
slightly prefer following it. Projects that don't like <string.h>
conventions can write their own wrappers that perform bool conversions.

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
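
For reference, a minimal sketch of the single-API idea plus a project-local
bool wrapper, as discussed in this thread. The bodies below are an
illustrative assumption about the semantics (return a pointer past the
prefix, or to the start of the suffix, else NULL); they are not the
proposal's wording nor Gnulib's implementation, and stp_demo() is a
hypothetical helper that only shows usage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed semantics: if S starts with PREFIX, return a pointer to the
   first character after the prefix within S; otherwise return NULL.  */
const char *
stpprefix(const char *s, const char *prefix)
{
	size_t n = strlen(prefix);

	return strncmp(s, prefix, n) == 0 ? s + n : NULL;
}

/* Assumed semantics: if S ends with SUFFIX, return a pointer to where
   the suffix begins within S; otherwise return NULL.  */
const char *
stpsuffix(const char *s, const char *suffix)
{
	size_t slen = strlen(s);
	size_t n = strlen(suffix);

	if (n > slen)
		return NULL;
	return strcmp(s + slen - n, suffix) == 0 ? s + slen - n : NULL;
}

/* Project-local wrapper for code bases that prefer an explicit bool
   over an implicit pointer-to-bool conversion.  */
bool
str_startswith(const char *s, const char *prefix)
{
	return stpprefix(s, prefix) != NULL;
}

/* Hypothetical usage: the pointer serves both as a truth value and as
   the start of the remaining text.  */
void
stp_demo(void)
{
	const char *val = stpprefix("--color=auto", "--color=");

	assert(val != NULL && strcmp(val, "auto") == 0);
	assert(stpsuffix("archive.tar.gz", ".zip") == NULL);
	assert(str_startswith("foobar", "foo"));
}
```

With a single pointer-returning function, a caller that only wants the
test writes if (stpprefix(s, p)), in the same way as with strchr(3); the
wrapper is only needed where a project's style forbids that conversion.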