https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115367

--- Comment #4 from Niklas Hambüchen <mail+gcc at nh2 dot me> ---
I think this 15-minute load average behaviour is an extremely bad idea.

As PierU confirmed, it results in severe underutilisation of CPU.

It introduces random, hard-to-explain performance degradation in all sorts of programs.

I rediscover this every couple of years in some new program or library.

Two years ago I found it in OpenCV, which sometimes failed to use all the
cores:

https://github.com/opencv/opencv/issues/25717

Now, two years later, I wrote some Python `numba` code, which also uses
multithreaded OpenMP, and lost hours of debugging figuring out why it often
spawned 5 threads instead of 6 on my 6-core machine; eventually I traced it
down to happening only when I `import cv` (which again triggers
`omp_set_dynamic()` and thus this behaviour).
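For anyone hitting the same wall: a quick way to check from Python whether some import has silently enabled dynamic teams is to ask libgomp directly. This is a sketch, assuming a Linux system with GCC's libgomp on the library path; `omp_get_dynamic` is a standard OpenMP runtime entry point:

```python
import ctypes
import ctypes.util

def omp_dynamic_enabled():
    """Return libgomp's dyn-var setting, or None if libgomp isn't found."""
    name = ctypes.util.find_library("gomp")
    if name is None:
        return None  # no libgomp on this system; nothing to check
    libgomp = ctypes.CDLL(name)
    # omp_get_dynamic() returns nonzero when dynamic adjustment is enabled
    return bool(libgomp.omp_get_dynamic())
```

Calling this before and after the suspect `import` shows whether that import flipped the setting.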

Consulting the 15-minute load average makes programs behave completely
unpredictably and incomprehensibly (and with worse performance than simply NOT
doing any of this!), and each time it consumes hours of investigation just to
end up here again.

The fact that these 15 minutes are involved makes debugging extremely lengthy
and unpleasant.

I don't have a good suggestion given how the OpenMP spec was written ("setting
the initial value of the dyn-var ICV" seems to suggest that OpenMP
implementations must use this value for the entire loop and cannot adjust it
e.g. according to the current load-average _during_ the loop, which would make
more sense), but it's definitely very frustrating.
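For illustration, the kind of heuristic at issue looks roughly like this. This is a sketch of my understanding, not libgomp's actual code; the exact formula, rounding, and which load-average sample it consults are assumptions here:

```python
import os

def dynamic_team_size(max_threads):
    # Sketch: shrink the team when the machine's recent load average is
    # high. This is the behaviour being criticised: the 15-minute sample
    # reacts far too slowly to be a useful signal for a loop starting *now*,
    # so the team is routinely smaller than the number of idle cores.
    _load1, _load5, load15 = os.getloadavg()
    free_cores = max_threads - int(load15 + 0.5)
    return max(1, min(max_threads, free_cores))
```

Because the decision is made once, before the loop, a burst of load 10 minutes ago still costs you threads now.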

Maybe the best way is just to advise heavily against using this feature in the
docs, as it makes users' and developers' lives a pain.
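Until then, the workaround worth documenting is to pin the relevant ICVs in the environment before any OpenMP-using library loads. The variable names are standard OpenMP; whether a given library still overrides them by calling `omp_set_dynamic()` itself afterwards is another question:

```python
import os

# Must run before importing numba, cv2, or anything else that initialises
# an OpenMP runtime: disable dynamic team sizing and fix the thread count.
os.environ["OMP_DYNAMIC"] = "false"
os.environ.setdefault("OMP_NUM_THREADS", str(os.cpu_count() or 1))
```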

Luckily OpenCV removed the use of this feature for the next release.

Maybe the documented "With such a method, on average the application will use
only half of the cores" behaviour can still be improved in some way, but
overall the whole feature seems to be asking for trouble.
