Enclosed few thoughts to the subject:

- since it is very rare situation that one needs only a memcpy without to know whether previous alloc may fail (e. g. some of pointers were NULL), me too thinks that the caller should be responsible for the check.
  So I would not extend ngx_memcpy or ngx_cpymem in that way.

- rewrite of `ngx_memcpy` define like here:
  ```
+ #define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) : memcpy(dst, src, n))
  ```
may introduce a regression or compat issues, e. g. fully functioning codes like that may become broken hereafter:
  ```
ngx_memcpy(dst, src, ++len); // because n would be incremented twice in the macro now
  ```
Sure, `ngx_cpymem` has also the same issue, but it had that already before the "fix"... Anyway, I'm always against of such macros (no matter with or without check it would be better as an inline function instead).

My conclusion:
a fix of affected places invoking `ngx_memcpy` and `ngx_cpymem`, and possibly an assert in `ngx_memcpy`
  and `ngx_cpymem` would be fully enough, in my opinion.

Regards,
Sergey.

On 15.12.2023 03:41, Maxim Dounin wrote:

Hello!

On Wed, Dec 13, 2023 at 11:09:28AM -0500, Ben Kallus wrote:

Nginx executes numerous `memcpy`s from NULL during normal execution. `memcpy`ing to or from NULL is undefined behavior. Accordingly, some compilers (gcc -O2) make optimizations that assume `memcpy` arguments
    are not NULL. Nginx with UBSan crashes during startup due to this
    issue.

    Consider the following function:
    ```C
    #include <string.h>

    int f(int i) {
        char a[] = {'a'};
        void *src = i ? a : NULL;
        char dst[1];
        memcpy(dst, src, 0);
        return src == NULL;
    }
    ```
    Here's what gcc13.2 -O2 -fno-builtin will do to it:
    ```asm
    f:
            sub     rsp, 24
            xor     eax, eax
            test    edi, edi
            lea     rsi, [rsp+14]
            lea     rdi, [rsp+15]
            mov     BYTE PTR [rsp+14], 97
            cmove   rsi, rax
            xor     edx, edx
            call    memcpy
            xor     eax, eax
            add     rsp, 24
            ret
    ```
    Note that `f` always returns 0, regardless of the value of `i`.

Feel free to try for yourself at https://gcc.godbolt.org/z/zfvnMMsds

The reasoning here is that since memcpy from NULL is UB, the optimizer is free to assume that `src` is non-null. You might consider this to be a problem with the compiler, or the C standard, and I might agree.
    Regardless, relying on UB is inherently un-portable, and requires
maintenance to ensure that new compiler releases don't break existing
    assumptions about the behavior of undefined operations.

The following patch adds a check to `ngx_memcpy` and `ngx_cpymem` that makes 0-length memcpy explicitly a noop. Since all memcpying from NULL
    in Nginx uses n==0, this should be sufficient to avoid UB.

    It would be more efficient to instead add a check to every call to
    ngx_memcpy and ngx_cpymem that might be used with src==NULL, but in
the discussion of a previous patch that proposed such a change, a more
    straightforward and tidy solution was desired.
    It may also be worth considering adding checks for NULL memset,
memmove, etc. I think this is not necessary unless it is demonstrated
    that Nginx actually executes such undefined calls.

    # HG changeset patch
    # User Ben Kallus <benjamin.p.kallus...@dartmouth.edu>
    # Date 1702406466 18000
    #      Tue Dec 12 13:41:06 2023 -0500
    # Node ID d270203d4ecf77cc14a2652c727e236afc659f4a
    # Parent  a6f79f044de58b594563ac03139cd5e2e6a81bdb
    Add NULL check to ngx_memcpy and ngx_cpymem to satisfy UBSan.

    diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.c
    --- a/src/core/ngx_string.c     Wed Nov 29 10:58:21 2023 +0400
    +++ b/src/core/ngx_string.c     Tue Dec 12 13:41:06 2023 -0500
    @@ -2098,6 +2098,10 @@
             ngx_debug_point();
         }

    +    if (n == 0) {
    +        return dst;
    +    }
    +
         return memcpy(dst, src, n);
     }

    diff -r a6f79f044de5 -r d270203d4ecf src/core/ngx_string.h
    --- a/src/core/ngx_string.h     Wed Nov 29 10:58:21 2023 +0400
    +++ b/src/core/ngx_string.h     Tue Dec 12 13:41:06 2023 -0500
    @@ -103,8 +103,9 @@
      * gcc3 compiles memcpy(d, s, 4) to the inline "mov"es.
* icc8 compile memcpy(d, s, 4) to the inline "mov"es or XMM moves.
      */
    -#define ngx_memcpy(dst, src, n)   (void) memcpy(dst, src, n)
-#define ngx_cpymem(dst, src, n) (((u_char *) memcpy(dst, src, n)) + (n)) +#define ngx_memcpy(dst, src, n) (void) ((n) == 0 ? (dst) : memcpy(dst, src, n)) +#define ngx_cpymem(dst, src, n) \
    +    ((u_char *) ((n) == 0 ? (dst) : memcpy(dst, src, n)) + (n))

     #endif

    diff -r a6f79f044de5 -r d270203d4ecf src/http/v2/ngx_http_v2.c
    --- a/src/http/v2/ngx_http_v2.c Wed Nov 29 10:58:21 2023 +0400
    +++ b/src/http/v2/ngx_http_v2.c Tue Dec 12 13:41:06 2023 -0500
    @@ -3998,9 +3998,7 @@
                     n = size;
                 }

    -            if (n > 0) {
    -                rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);
    -            }
    +            rb->buf->last = ngx_cpymem(rb->buf->last, pos, n);

                 ngx_log_debug1(NGX_LOG_DEBUG_HTTP, fc->log, 0,
                                "http2 request body recv %uz", n);


For the record, I've already provided some feedback to Ben in the
ticket here:

https://trac.nginx.org/nginx/ticket/2570

And pointed to the existing thread here:

https://mailman.nginx.org/pipermail/nginx-devel/2023-October/PX7VH5A273NLUGSYC7DR2AZRU75CIQ3Q.html
https://mailman.nginx.org/pipermail/nginx-devel/2023-December/DCGUEGEFS6TSVIWNEWUEZO3FZMR6ESYZ.html

Hope this helps.

--
Maxim Dounin
http://mdounin.ru/
_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
https://mailman.nginx.org/mailman/listinfo/nginx-devel

_______________________________________________
nginx-devel mailing list
nginx-devel@nginx.org
https://mailman.nginx.org/mailman/listinfo/nginx-devel

Reply via email to