On October 9, 2025 3:37:19 PM PDT, Alejandro Colomar <[email protected]> wrote:
>Hi!
>
>The Austin Group (POSIX) seems to be in favour of this proposal.
>They're waiting for the C Committee to also accept it, so that they can
>copy the specific wording, but they seem in favour of the proposal
>regardless, and as Eric said some time ago, POSIX could follow through
>even if wg14 keeps it as undefined behavior.
>
><https://www.austingroupbugs.net/view.php?id=1949#c7286>
>
>       The Austin Group discussed this on 9 Oct 2025, and is in general
>       in favor of tightening the requirements on allocations of size 0
>       for Issue 9, to eliminate EINVAL for an unsupported size 0.
>       However, as Issue 9 will likely depend on C2Y, we would prefer
>       to delay wordsmithing and determination of which portions of the
>       text may still need <CX> shading until after C2Y has settled on
>       their parallel project of improving the specifications of
>       allocation behavior on a size of 0.
>
>       With glibc, malloc() and calloc() would already be in
>       compliance, realloc() would need to change behavior to match the
>       suggested wording.
>
>       Most BSD implementations would already be in compliance.
>
>So, please fix glibc already.  We seem to have POSIX support (and at
>least partial wg14 support).
>
>
>Have a lovely night!
>Alex
>
>---
>Name
>       alx-0029r8 - Restore the traditional realloc(3) specification
>
>Principles
>       -  Uphold the character of the language
>       -  Keep the language small and simple
>       -  Facilitate portability
>       -  Avoid ambiguities
>       -  Pay attention to performance
>       -  Codify existing practice to address evident deficiencies.
>       -  Do not prefer any implementation over others
>       -  Ease migration to newer language editions
>       -  Avoid quiet changes
>       -  Enable secure programming
>
>Category
>       Remove UB.
>
>Author
>       Alejandro Colomar <[email protected]>
>
>       Cc: <[email protected]>
>       Cc: <[email protected]>
>       Cc: <[email protected]>
>       Cc: наб <[email protected]>
>       Cc: Douglas McIlroy <[email protected]>
>       Cc: Paul Eggert <[email protected]>
>       Cc: Robert Seacord <[email protected]>
>       Cc: Elliott Hughes <[email protected]>
>       Cc: Bruno Haible <[email protected]>
>       Cc: JeanHeyd Meneide <[email protected]>
>       Cc: Rich Felker <[email protected]>
>       Cc: Adhemerval Zanella Netto <[email protected]>
>       Cc: Joseph Myers <[email protected]>
>       Cc: Florian Weimer <[email protected]>
>       Cc: Andreas Schwab <[email protected]>
>       Cc: Thorsten Glaser <[email protected]>
>       Cc: Eric Blake <[email protected]>
>       Cc: Vincent Lefevre <[email protected]>
>       Cc: Mark Harris <[email protected]>
>       Cc: Collin Funk <[email protected]>
>       Cc: Wilco Dijkstra <[email protected]>
>       Cc: DJ Delorie <[email protected]>
>       Cc: Cristian Rodríguez <[email protected]>
>       Cc: Siddhesh Poyarekar <[email protected]>
>       Cc: Sam James <[email protected]>
>       Cc: Mark Wielaard <[email protected]>
>       Cc: "Maciej W. Rozycki" <[email protected]>
>       Cc: Martin Uecker <[email protected]>
>       Cc: Christopher Bazley <[email protected]>
>       Cc: <[email protected]>
>       Cc: Daniel Krügler <[email protected]>
>       Cc: Kees Cook <[email protected]>
>       Cc: Valdis Klētnieks <[email protected]>
>
>History
>       <https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0029.git/>
>
>       r0 (2025-06-17):
>       -  Initial draft.
>
>       r1 (2025-06-20):
>       -  Full rewrite after the recent glibc discussion.
>
>       r2 (2025-06-21):
>       -  Remove CC.  Add CC.
>       -  wfix.
>       -  Drop quote.
>       -  Add a few more principles
>       -  Clarify why ENOMEM is used in this proposal, and make it
>          optional.
>       -  Mention exceptional leak in code checking (size != 0).
>       -  Clarify that part of the description of realloc can be
>          editorially removed after this change.
>
>       r3 (2025-06-23):
>       -  Fix diff missing line.
>       -  Remove ENOMEM from the proposal.
>       -  Clarify that ENOMEM should be retained by platforms already
>          using it.
>       -  Add mention that LLVM's address sanitizer will catch the leak
>          mentioned in r2.
>       -  Add links to real bugs (including an RCE bug).
>
>       r4 (2025-06-24):
>       -  Use a better link for the Whatsapp RCE.
>       -  s/Description/Rationale/
>       -  wfix
>       -  Mention that glibc <2.1.1 had the BSD behavior.
>       -  Add footnote that realloc(3) may fail while shrinking.
>
>       r5 (2025-06-26):
>       -  It was glibc 2.1.1 that broke it, not glibc 2.2.
>       -  wfix
>       -  Mention in the footnote that the pointer may change.
>       -  Document why not go the other way around.  It was explained
>          several times during discussion, but people keep suggesting
>          it.
>
>       r6 (2025-06-27; n3621):
>       -  Clarify that the paragraph about what happens when the size
>          is zero refers to when the total size is zero (for calloc(3)
>          that is nmemb*size).
>       -  s/Unix V7/V7 Unix/
>       -  tfix.
>       -  wfix.
>
>       Brno meeting (2025-08-27):
>       -  9/13/6
>       -  Along the lines: 21/1/5
>       -  People acknowledged at the dinner after the meeting, on the
>          reflector, and in corridor discussions, that they hadn't
>          understood the paper, and that it was better thought out
>          than they had initially believed.  They would change their
>          vote to be in favour of this proposal.
>
>       r7 (2025-09-21):
>       -  Add link.
>
>       r8 (2025-10-09):
>       -  POSIX wants this change.
>
>See also
>       <https://nabijaczleweli.xyz/content/blogn_t/017-malloc0.html>
>       <https://sourceware.org/pipermail/libc-alpha/1999-April/000956.html>
>       <https://inbox.sourceware.org/libc-alpha/nbyurzcgzgd5rdybbi4no2kw5grrc32k63svf7oq73nfcbus5r@77gry66kpqfr/>
>       <https://inbox.sourceware.org/libc-alpha/[email protected]/T/#u>
>       <https://inbox.sourceware.org/libc-alpha/qukfe5yxycbl5v7ooskvqdnm3au3orohbx4babfltegi47iyly@or6dgf7akeqv/T/#u>
>       <https://github.com/bminor/glibc/commit/7c2b945e1fd64e0a5a4dbd6ae6592a7314dcd4b5>
>       <https://github.com/llvm/llvm-project/issues/113065>
>       <https://www.austingroupbugs.net/view.php?id=400>
>       <https://www.austingroupbugs.net/view.php?id=526>
>       <https://www.austingroupbugs.net/view.php?id=688>
>       <https://sourceware.org/bugzilla/show_bug.cgi?id=12547>
>       <https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_400.htm>
>       <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n868.htm>
>       <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2438.htm>
>       <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf>
>       <https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/functions/realloc.html>
>       <https://pubs.opengroup.org/onlinepubs/9699919799.2013edition/functions/realloc.html>
>       <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120744>
>       <https://lore.kernel.org/lkml/[email protected]/>
>       <https://awakened1712.github.io/hacking/hacking-whatsapp-gif-rce/>
>       <https://gbhackers.com/whatsapp-double-free-vulnerability/>
>       <https://www.austingroupbugs.net/view.php?id=1949>
>
>Rationale
>       The specification of realloc(3) has been problematic since the
>       very first standards, even before ISO C.  The wording has
>       changed significantly, in forced attempts to permit
>       implementations to return a null pointer when the requested
>       size is zero.  This originated from the intent of banning
>       zero-sized objects from the language in C89, but in retrospect
>       that never worked well, as we can see from the fallout.
>
>       None of the specifications have been good, and C23 finally gave
>       up and made it undefined behavior.
>
>       The problem is not only theoretical.  Programmers don't know how
>       to use realloc(3) correctly, and have written weird code in
>       their attempts.  This has resulted in a lot of nonsensical code
>       in configure scripts[1], and even bugs in actual programs[2].
>
>       [1] <https://codesearch.debian.net/search?q=%5Cbrealloc%5B+%5Ct%5D*%5B%28%5D%5B%5E%2C%5D*%2C%5B+%5Ct%5D0%5B%29%5D&literal=0>
>       [2] <https://lore.kernel.org/lkml/[email protected]/>
>
>       In some cases, this nonsensical code has resulted in RCEs[3].
>
>       [3] <https://awakened1712.github.io/hacking/hacking-whatsapp-gif-rce/>
>
>       However, this doesn't need to be like that.  The traditional
>       implementation of realloc(3), present in V7 Unix, inherited by
>       the BSDs, and currently available in a range of systems,
>       including musl libc, doesn't have any issues regarding zero-size
>       allocations.  glibc --which uses an independent implementation
>       rather than a Unix derivative-- also had this behavior
>       originally; it changed to the current behavior in 1999
>       (glibc 2.1.1), only for compatibility with C89, even though
>       ironically C99 was released soon after and removed the text that
>       glibc was trying to comply with, and introduced some new text
>       that was very confusing, and one of its interpretations would
>       make the new glibc behavior non-conforming.
>
>       Code written for platforms returning a null pointer can be
>       migrated to platforms returning non-null, without significant
>       issues.
>
>       There are two kinds of code that call realloc(p,0).  One
>       hard-codes the 0, and is used as a replacement of free(p).  This
>       code ignores the return value, since it's unimportant.  This
>       code currently produces a leak of 0 bytes plus associated
>       metadata on platforms such as musl libc, where it returns a
>       non-null pointer.  However, assuming that there are programs
>       written with the knowledge that they won't ever be run on such
>       platforms, we should take care of that, and make sure they don't
>       leak.  A way of accomplishing this would be to recommend that
>       implementations issue a diagnostic when realloc(3) is called
>       with a hard-coded zero.  This is only an informal recommendation
>       made by this proposal, as this is a matter of QoI, and the
>       standard shouldn't say anything about it.  This would prevent
>       this class of minor leaks.
>
>       Moreover, in glibc, realloc(p,0) may return non-null, in the
>       case where p is NULL, so code must already take that into
>       account, and thus code that simply takes realloc(p,0) as a
>       synonym of free(p) is already leaky, as free(NULL) is a no-op,
>       but realloc(NULL,0) allocates 0 bytes.
>
>       The other kind of code is in algorithms that realloc(3) an
>       arbitrary size, which might eventually be zero.  This gets more
>       complex.
>
>       Here's the code that should be written for AIX or glibc:
>
>               errno = 0;
>               new = realloc(old, size);
>               if (new == NULL) {
>                       if (errno == ENOMEM)
>                               free(old);
>                       goto fail;
>               }
>               ...
>               free(new);
>
>       On these platforms, realloc(old,0) frees 'old' and returns a
>       null pointer without setting errno.  Failing to check for
>       ENOMEM before freeing the old pointer would thus result in a
>       double-free.  If the program decided to continue using the old
>       pointer instead of freeing it, that would be a use-after-free.
>
>       In the platforms where realloc(p,0) returns non-null, such as
>       the BSDs or musl libc, it is simpler to handle it:
>
>               new = realloc(old, size);
>               if (new == NULL) {  // errno is ENOMEM
>                       free(old);
>                       goto fail;
>               }
>               ...
>               free(new);
>
>       Whenever the result is a null pointer, these platforms are
>       reporting an ENOMEM error, and thus it is superfluous to check
>       errno there.
>
>       Most code is written in this way, even if run on platforms
>       returning a null pointer.  This is because most programmers are
>       just unaware of this problem.  Part of the reason is also that
>       returning a non-null pointer with zero bytes is the natural
>       extension of the behavior, which is what programmers intuitively
>       expect from libc; that is, if realloc(p,3) allocates 3 bytes,
>       r(p,2) allocates two bytes, and r(p,1) allocates one byte, it is
>       natural by induction to expect that r(p,0) will allocate zero
>       bytes.  Most algorithms naturally extend to 0 just fine, and
>       special casing 0 is artificial.
>
>       If the realloc(3) specification were changed to require that
>       realloc(p,0) returns non-null on success, and that realloc(p,0)
>       only fails when out-of-memory (and assuming the implementations
>       will continue setting errno to ENOMEM), then code written for
>       AIX or glibc would continue working just fine, since the errno
>       check would be redundant with the null check.  Simply, the
>       conditional (errno == ENOMEM) would always be true when
>       (new == NULL).
>
>       Then, there are non-POSIX platforms that don't set ENOMEM.  On
>       those platforms, code might do this:
>
>               new = realloc(old, size);
>               if (new == NULL) {
>                       if (size != 0)
>                               free(old);
>                       goto fail;
>               }
>               ...
>               free(new);
>
>       That code would continue working with this proposal, except for
>       a very rare corner case, in which it would leak.  In the normal
>       case, (size != 0) would never be true under (new == NULL),
>       because a reallocation of 0 bytes would almost always succeed,
>       and thus not return a null pointer under this proposal.
>       However, in some cases, the system might not find space even for
>       the small metadata needed for a 0-byte allocation.  In such a
>       case, the (size != 0) conditional would prevent deallocating
>       'old', and thus cause a memory leak.  This case is exceptional
>       enough that it shouldn't stop us from fixing realloc(3).
>       Anyway, on an out-of-memory case, the program is likely to
>       terminate rather soon, so the issue is even less likely to have
>       an impact on any existing programs.  Also, LLVM's address
>       sanitizer will soon be able to catch such a leak:
>       <https://github.com/llvm/llvm-project/issues/113065>
>
>       This proposal makes handling of realloc(3) as straightforward as
>       one would expect, with only two states: success or error.  There
>       are no in-between states.
>
>       The resulting wording in the standard is also much simpler, as
>       it doesn't need to define so many special cases.
>
>       For consistency, all the other allocation functions are updated
>       to both return a null pointer on error, and use consistent
>       wording.
>
>    Why not go the other way around?
>       Some people keep asking why not go the other way around: why not
>       force the BSDs and musl to return a null pointer if size is 0.
>       This would result in double-free and use-after-free bugs, which
>       can result in RCE vulnerabilities (remote code execution), which
>       is clearly unacceptable.
>
>       Consider this code, which is the usual code for calling
>       realloc(3) in such systems:
>
>               new = realloc(old, size);
>               if (new == NULL) {
>                       free(old);
>                       goto fail;
>               }
>               ...
>               free(new);
>
>       If realloc(p,0) returned a null pointer and freed the old
>       block, then the third line would be a double-free bug.
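
The double-free described above can be made visible without corrupting a real heap. The following is only a toy model under stated assumptions: struct toy_block, toy_free(), the two toy_realloc_*() functions, and usual_caller() are hypothetical names invented for this illustration, and the block's liveness is tracked by hand instead of by an allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not a real allocator: one block whose liveness is tracked
 * by hand, so that a double free is counted instead of corrupting the
 * heap.  All names here are hypothetical. */
struct toy_block {
	int live;          /* 1 while "allocated" */
	int double_frees;  /* times toy_free() was called on a dead block */
};

void
toy_free(struct toy_block *b)
{
	if (b == NULL)
		return;  /* like free(NULL), a no-op */
	if (!b->live)
		b->double_frees++;
	b->live = 0;
}

/* Models "size 0 frees the old block and returns a null pointer". */
struct toy_block *
toy_realloc_null_on_zero(struct toy_block *b, size_t size)
{
	if (size == 0) {
		toy_free(b);
		return NULL;
	}
	return b;
}

/* Models the proposed behavior: size 0 succeeds and returns non-null. */
struct toy_block *
toy_realloc_proposed(struct toy_block *b, size_t size)
{
	(void) size;
	return b;
}

/* The usual caller: a null result is treated as "old is still mine". */
int
usual_caller(struct toy_block *(*re)(struct toy_block *, size_t),
             struct toy_block *old, size_t size)
{
	struct toy_block *new;

	assert(re != NULL);

	new = re(old, size);
	if (new == NULL) {
		toy_free(old);  /* double free if 're' already freed it */
		return -1;
	}
	toy_free(new);
	return 0;
}
```

Running usual_caller() with toy_realloc_null_on_zero and a size of 0 leaves double_frees at 1, while the same caller with toy_realloc_proposed leaves it at 0.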
>
>    POSIX
>       POSIX is in favour of this proposal, and is waiting for the
>       C Committee to accept it, to copy the wording from C2y.
>       <https://www.austingroupbugs.net/view.php?id=1949#c7286>
>
>Prior art
>    gnulib
>       gnulib provides the realloc-posix module, which aims to wrap the
>       system realloc(3) and reallocarray(3) functions so that they
>       behave in a POSIX-complying manner.
>
>       It previously behaved like glibc.  After I reported that it was
>       non-conforming to POSIX, we discussed the best way forward,
>       which we agreed was the same direction that this paper is
>       proposing now for C2y.  The implementation was changed in
>
>               gnulib.git d884e6fc4a60 (2024-11-04; "realloc-posix:
>               realloc (..., 0) now returns nonnull")
>
>       There have been no regression reports since then, as we
>       expected.
>
>    V7 Unix, BSD
>       The proposed behavior is the one endorsed by Doug McIlroy, the
>       author of the original implementation of realloc(3) in V7 Unix,
>       and also present in the BSDs.
>
>    glibc <= 2.1
>       glibc was implemented originally to return non-null.  It was
>       only in 1999, and purely to comply with the standards --with no
>       requests by users to do so--, that the glibc maintainers decided
>       to switch to the current behavior.
>
>Design decisions
>       This proposal needs two changes, which can be applied all at
>       once, or in separate steps.
>
>       The first step would make realloc(p,s) be consistent with
>       free(p) and malloc(s), including when p is a null pointer, when
>       s is zero, and also when both corner cases happen at the same
>       time.  This change would already turn the implementations where
>       malloc(0) returns non-null into the end goal we have.  This
>       would require changes to (at least) the following
>       implementations: glibc, Bionic, Windows.
>
>       The second step would be to require that malloc(0) returns a
>       non-null pointer.  This would require changes to (at least) the
>       following implementations: AIX.
>
>       This proposal has merged both steps into a single proposal.
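
As a rough illustration of what "consistent with free(p) and malloc(s)" means, here is a minimal sketch of realloc(3) modeled as-if by those two calls. It is not how any real implementation works: realloc_model() and its old_size parameter are hypothetical (a real allocator knows the old size itself and usually resizes in place), and the n?n:1 expression merely stands in for a malloc(3) that already returns non-null for size 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative model only.  old_size is a hypothetical extra parameter,
 * needed because this model cannot ask the allocator for the old size. */
void *
realloc_model(void *old, size_t old_size, size_t new_size)
{
	void *new;

	assert(old != NULL || old_size == 0);

	/* "as if by a call to malloc"; n?n:1 emulates a malloc(3) that
	 * returns non-null for a size of 0. */
	new = malloc(new_size ? new_size : 1);
	if (new == NULL)
		return NULL;  /* the old object is not deallocated */

	/* Contents are preserved up to the lesser of the two sizes. */
	if (old != NULL)
		memcpy(new, old, old_size < new_size ? old_size : new_size);

	free(old);  /* "as if by a call to free"; free(NULL) is a no-op */
	return new;
}
```

In this model the two corner cases need no special handling: a null 'old' degenerates to malloc(s), and a size of 0 yields a non-null pointer that must not be used to access an object.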
>
>Future directions
>       This proposal, by specifying realloc(3) as-if by calling
>       free(3) and malloc(3), makes redundant several mentions of
>       realloc(3) next to either free(3) or malloc(3) in the standard.
>       We could remove them in this proposal, or clean that up in a
>       separate (mostly editorial) proposal.  Let's keep it for a
>       future proposal for now.
>
>Caveats
>    n?n:1
>       Code written today should be careful, in case it can run on
>       older systems that are not fixed to comply with this stricter
>       specification.  Thus, code written today should call realloc(3)
>       similarly to this:
>
>               realloc(p, n?n:1);
>
>       When all existing implementations are fixed to comply with this
>       stricter specification, that workaround can be removed.
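
One way to keep the n?n:1 workaround in a single place is a small wrapper. This is only a sketch: resize_buf() is a hypothetical helper name, not something proposed by the paper, and it assumes the caller owns a heap buffer through a char pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resize *bufp to n bytes without ever asking
 * realloc(3) for a total size of 0.  Returns 0 on success.  Returns -1
 * on failure, in which case *bufp is unchanged and still valid. */
int
resize_buf(char **bufp, size_t n)
{
	char *new;

	assert(bufp != NULL);

	new = realloc(*bufp, n ? n : 1);  /* the n?n:1 workaround */
	if (new == NULL)
		return -1;

	*bufp = new;
	return 0;
}
```

With such a wrapper, a null result from realloc(3) can only mean failure, so callers need neither an errno check nor a (size != 0) check, and the workaround can later be dropped by editing one line.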
>
>    ENOMEM
>       Existing implementations that set errno to ENOMEM must continue
>       doing so when the input pointer is not freed.  If they didn't,
>       code that is currently portable to all POSIX systems
>
>               errno = 0;
>               new = realloc(old, size);
>               if (new == NULL) {
>                       if (errno == ENOMEM)
>                               free(old);
>                       goto fail;
>               }
>               ...
>               free(new);
>
>       would leak on error.
>
>       Since it is currently impossible to write code that is
>       portable to arbitrary C17 systems, this is not an issue in
>       ISO C.
>
>               -  New code written for C2y will only need to check for
>                  NULL to detect errors.
>
>               -  Code written for specific C17 and older platforms
>                  that don't set errno will continue to work for those
>                  specific platforms.
>
>               -  Code written for POSIX.1-2024 and older platforms
>                  will continue working on POSIX C2y platforms,
>                  assuming that POSIX will continue mandating ENOMEM.
>
>               -  Code written for POSIX.1-2024 and older will not be
>                  able to be run on non-POSIX C2y platforms, but that
>                  could be expected.
>
>       The only important thing is that platforms that did set ENOMEM
>       should continue setting it, to avoid introducing leaks.
>
>Proposed wording
>       Based on N3550.
>
>    7.25.4.1  Memory management functions :: General
>       @@ p1
>       ...
>       -If the size of the space requested is zero,
>       +If the total size of the space requested is zero,
>       -the behavior is implementation-defined:
>       -either
>       -a null pointer is returned to indicate the error,
>       -or
>        the behavior is as if the size were some nonzero value,
>        except that the returned pointer shall not be used
>        to access an object.
>
>    7.25.4.2  The aligned_alloc function
>       @@ Returns, p3
>        The <b>aligned_alloc</b> function returns
>       -either
>       -a null pointer
>       -or
>       -a pointer to the allocated space.
>       +a pointer to the allocated space
>       +on success.
>       +If
>       +the space cannot be allocated,
>       +a null pointer is returned.
>
>    7.25.4.3  The calloc function
>       @@ Returns, p3
>        The <b>calloc</b> function returns
>       -either
>        a pointer to the allocated space
>       +on success.
>       -or a null pointer
>       -if
>       +If
>        the space cannot be allocated
>        or if the product <tt>nmemb * size</tt>
>       -would wraparound <b>size_t</b>.
>       +would wraparound <b>size_t</b>,
>       +a null pointer is returned.
>
>    7.25.4.7  The malloc function
>       @@ Returns, p3
>        The <b>malloc</b> function returns
>       -either
>       -a null pointer
>       -or
>       -a pointer to the allocated space.
>       +a pointer to the allocated space
>       +on success.
>       +If
>       +the space cannot be allocated,
>       +a null pointer is returned.
>
>    7.25.4.8  The realloc function
>       @@ Description, p2
>        The <b>realloc</b> function
>        deallocates the old object pointed to by <tt>ptr</tt>
>       +as if by a call to <b>free</b>,
>        and returns a pointer to a new object
>       -that has the size specified by <tt>size</tt>.
>       +that has the size specified by <tt>size</tt>
>       +as if by a call to <b>malloc</b>.
>        The contents of the new object
>        shall be the same as that of the old object prior to deallocation,
>        up to the lesser of the new and old sizes.
>        Any bytes in the new object
>        beyond the size of the old object
>        have unspecified values.
>
>       @@ p3
>        If <tt>ptr</tt> is a null pointer,
>        the <b>realloc</b> function behaves
>        like the <b>malloc</b> function for the specified size.
>        Otherwise,
>        if <tt>ptr</tt> does not match a pointer
>        earlier returned by a memory management function,
>        or
>        if the space has been deallocated
>        by a call to the <b>free</b> or <b>realloc</b> function,
>       ## We can probably remove all of the above, because of the
>       ## behavior now being defined as-if by calls to malloc(3) and
>       ## free(3).  But let's do that editorially in a separate change.
>       -or
>       -if the size is zero,
>       ## We're defining the behavior.
>        the behavior is undefined.
>        If
>       -memory for the new object is not allocated,
>       +the space cannot be allocated,
>       ## Editorial; for consistency with the wording of the other functions.
>        the old object is not deallocated
>        and its value is unchanged.
>       +XXX)
>
>       @@ New footnote XXX
>       +XXX)
>       +While atypical,
>       +<b>realloc</b> may fail
>       +or return a different pointer
>       +for a call that shrinks the block of memory.
>
>       @@ Returns, p4
>        The <b>realloc</b> function returns
>        a pointer to the new object
>        (which can have the same value
>       -as a pointer to the old object),
>       +as a pointer to the old object)
>       +on success.
>       -or
>       +If
>       +space cannot be allocated,
>        a null pointer
>       -if the new object has not been allocated.
>       +is returned.
>

Makes sense to me.
