Hi all, hi John & Thomas
John David Anglin wrote:
On 2024-02-29 6:02 p.m., Thomas Schwinge wrote:
I wonder: shouldn't that cap at 50 threads happen inside libgomp,
generally, instead of per test case and user code (!)?  Per my
understanding, OpenMP 'num_threads' specifies a *desired* number of
threads; the implementation may limit that value.
Sounds like a good suggestion.
I concur – if the hardware/OS doesn't support more.
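To illustrate the "desired, not guaranteed" semantics, here is a minimal
sketch. The helper name observed_team_size is made up for this mail, and
the serial fallback for builds without -fopenmp is my own assumption so
that the snippet compiles either way. It requests a team size and reports
what the runtime actually handed out:

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: request `requested` threads (must be positive) and
   return the team size the runtime actually provided. */
int observed_team_size(int requested)
{
    int actual = 1;  /* serial fallback when built without -fopenmp */
#ifdef _OPENMP
    #pragma omp parallel num_threads(requested)
    {
        /* One thread records the team size; 'single' ends with a barrier. */
        #pragma omp single
        actual = omp_get_num_threads();
    }
#endif
    return actual;
}
```

Without 'strict', a caller can only rely on the result being at least 1
and at most the requested value, so code that needs exactly N threads has
to check the returned number itself.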
* * *
However – for completeness and to correct a statement: While num_threads
specifies the desired number of threads, 'strict' will turn this into
error termination if the implementation cannot fulfill the request.
Namely, "if prescriptiveness is specified as 'strict' and Algorithm 11.1
would result in a number of threads other than the value of the first
item of the _nthreads_ list then runtime error termination is performed."
Note that 'strict' for num_threads is new in/since the OpenMP 6.0 draft
(TR11, I think) and not yet implemented in GCC.
However, I guess that the thread limit also affects 'teams' and nested
parallelization. For teams, 'num_teams(n)' sets both the lower and the
upper bound to 'n', which enforces exactly that number of teams. (By
contrast, 'num_teams(m:n)' sets the two limits separately, while
'omp_set_num_teams(n)' and OMP_NUM_TEAMS=n only set the upper bound.)
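A rough sketch of how one could observe this on the host, assuming a
compiler that accepts a host 'teams' construct (OpenMP 5.0 or later). The
helper name observed_num_teams is hypothetical, and the serial fallback
for builds without -fopenmp is again my own addition:

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: request `requested` teams (must be positive) and
   return the number of teams the runtime actually created. */
int observed_num_teams(int requested)
{
    int actual = 1;  /* serial fallback when built without -fopenmp */
#ifdef _OPENMP
    #pragma omp teams num_teams(requested)
    {
        /* Each team's initial thread runs this block; only team 0 writes. */
        if (omp_get_team_num() == 0)
            actual = omp_get_num_teams();
    }
#endif
    return actual;
}
```

Per the reading above, with 'num_teams(n)' the reported count should be
exactly 'n' when OpenMP is enabled, whereas with only an upper bound set
via omp_set_num_teams or OMP_NUM_TEAMS it may be smaller.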
[As far as I can see, OpenACC always permits an implementation to use
fewer gangs/workers/vectors if the hardware doesn't support the
requested number.]
Tobias