Related git commits, from the first related change through to the required fix.
The intermediate commits provide context that is needed later to understand
the required dependencies.
---

The initial commit changed __get_nprocs() to use sched_getaffinity()
instead of sysfs and procfs.

It now returns the number of processors that the process can run on, which can
be restricted with CPU affinity masks (i.e., sched_setaffinity()). Previously,
it returned the number of processors available in the system (regardless of
CPU affinity masks).

        commit 903bc7dcc2acafc40be11639767e10a2de712649
        Author: Adhemerval Zanella <[email protected]>
        Date:   Thu Mar 25 09:30:07 2021 -0300

            linux: Use sched_getaffinity for __get_nprocs (BZ #27645)

            <...> The initial scratch buffer <...>

However, it used a scratch buffer that turned out to be problematic (it could
call malloc() indirectly), which was fixed in:
    
        commit eb68d7d23cc411acdf68a60f194343a6774d6194
        Author: Florian Weimer <[email protected]>
        Date:   Wed Jun 30 17:41:38 2021 +0200

            Linux: Avoid calling malloc indirectly from __get_nprocs
            <...>
            scratch buffers in __get_nprocs may result in infinite recursion.

The next commit is what changed malloc() to "formally" use that
implementation.

Note: that implementation was already in use, since the code called
__get_nprocs(), which was already using sched_getaffinity() per the first
commit above (as also implied by the infinite recursion in the second commit
above).

        commit 11a02b035b464ab6813676adfd19c4a59c36d907
        Author: Adhemerval Zanella <[email protected]>
        Date:   Mon Sep 6 12:22:54 2021 -0300

            misc: Add __get_nprocs_sched

            This is an internal function meant to return the number of avaliable
            processor where the process can scheduled, different than the
            __get_nprocs which returns a the system available online CPU.

            The Linux implementation currently only calls __get_nprocs(), which
            in tuns calls sched_getaffinity.
        <...>
        @@ -878,7 +878,7 @@ arena_get2 (size_t size, mstate avoid_arena)
        <...>
        -              int n = __get_nprocs ();
        +              int n = __get_nprocs_sched ();

The commit that shortly followed _restored_ the original behavior of
__get_nprocs() (back to sysfs and procfs) and _moved_ the new behavior
(based on sched_getaffinity()) to __get_nprocs_sched().

        commit 342298278eabc75baabcaced110a11a02c3d3580
        Author: Adhemerval Zanella <[email protected]>
        Date:   Mon Sep 6 14:19:51 2021 -0300

            linux: Revert the use of sched_getaffinity on get_nproc [sic: __get_nprocs] (BZ #28310)
            <...>
            
            The main issue using sched_getaffinity changed the symbols semantic
            from system-wide scope of online CPUs to per-process one (which can
            be changed with kernel cpusets or book [sic: boot] parameters in VM).

            This patch reverts mostly of the 903bc7dcc2acafc40, with the
        <...>
        -__get_nprocs (void)
        +__get_nprocs_sched (void)
        <...>
           int r = INTERNAL_SYSCALL_CALL (sched_getaffinity, 0, cpu_bits_size,
                                         cpu_bits);
        <...>
        -__get_nprocs_sched (void)
        +__get_nprocs (void)
        <...>
        +  int fd = __open_nocancel ("/sys/devices/system/cpu/online", flags);
        <...>
        +  fd = __open_nocancel ("/proc/stat", flags);
        <...>    

However, it did _not_ revert the change in arena_get2(); that was only done
later (the required fix), in:

        commit 472894d2cfee5751b44c0aaa71ed87df81c8e62e
        Author: Adhemerval Zanella <[email protected]>
        Date:   Wed Oct 11 13:43:56 2023 -0300

            malloc: Use __get_nprocs on arena_get2 (BZ 30945)

            This restore the 2.33 semantic for arena_get2.  It was changed by
            11a02b035b46 to avoid arena_get2 call malloc (back when __get_nproc
            was refactored to use an scratch_buffer - 903bc7dcc2acafc).  The
            __get_nproc was refactored over then and now it also avoid to call
            malloc.

https://bugs.launchpad.net/bugs/2089789

Title:
  malloc performance degradation with CPU affinity masks

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/2089789/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to