[Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213

Thanassis Tsiodras changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #4 from Thanassis Tsiodras --- Marking as resolved.
[Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213 --- Comment #3 from Jakub Jelinek --- Note, I think a significant part of the speedup comes from tail recursion optimization, which will be prevented even with a mergeable task. Computing Fibonacci this way is not efficient.
[Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213 --- Comment #2 from Thanassis Tsiodras --- I see. I was not aware of "mergeable", TBH - thanks for pointing it out (it led me to reading about "data environments"). Thanks, Jakub.
[Bug libgomp/97213] OpenMP "if" is dramatically slower than code-level "if" - why?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97213 --- Comment #1 from Jakub Jelinek --- Even with if(false), the implementation has to create a new data environment etc. if(false) just means the task will be included, i.e. the generating thread executes the task itself and the generating task continues only once the included task has finished. You'd also need to add the mergeable clause to let the implementation, for if(false), pretend the task directive wasn't there at all, but that is just an optimization option that GCC doesn't use right now (it would basically require copying the region once again). Also, there is the overhead of the taskwait that you perform unconditionally at all levels.
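
For readers without the original test case at hand, the two styles being compared probably look roughly like the sketch below. This is an assumed reconstruction, not the reporter's actual code: the function names (fib_if_clause, fib_code_if, fib_serial) and the cutoff parameter are hypothetical. The first variant pays for a data environment and a taskwait at every recursion level even when the if clause is false; the second drops to plain recursion below the cutoff and never touches the runtime there.

```c
/* Hypothetical reconstruction; names and cutoff are illustrative only. */

/* Plain serial recursion, no OpenMP constructs at all. */
long fib_serial(int n)
{
    return n < 2 ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

/* Variant A: cutoff expressed with the OpenMP "if" clause.
 * When the clause is false the task is still created as a task region
 * (new data environment), and the taskwait runs at every level. */
long fib_if_clause(int n, int cutoff)
{
    long x, y;
    if (n < 2)
        return n;
    #pragma omp task shared(x) if(n > cutoff)
    x = fib_if_clause(n - 1, cutoff);
    #pragma omp task shared(y) if(n > cutoff)
    y = fib_if_clause(n - 2, cutoff);
    #pragma omp taskwait
    return x + y;
}

/* Variant B: cutoff expressed as an ordinary code-level "if".
 * Below the cutoff there is no task directive and no taskwait. */
long fib_code_if(int n, int cutoff)
{
    long x, y;
    if (n <= cutoff)
        return fib_serial(n);
    #pragma omp task shared(x)
    x = fib_code_if(n - 1, cutoff);
    #pragma omp task shared(y)
    y = fib_code_if(n - 2, cutoff);
    #pragma omp taskwait
    return x + y;
}
```

Both variants return the same values and are meant to be called from inside a parallel region by a single thread (for example under "#pragma omp parallel" followed by "#pragma omp single"); compiled without -fopenmp the pragmas are ignored and both degrade to serial recursion.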