Hi! I've noticed that occassionally on larger box under high load cancel-taskgroup-{2.c,2.C,3.C} tests would time out. The problem is that sometimes gomp_team_barrier_clear_task_pending wasn't called, when GOMP_taskgroup_end (or could be GOMP_taskwait) ate the last few threads. We have task_count counter (number of GOMP_TASK_{WAITING,TIED} tasks that haven't finished yet) and then task_running_count, which is only incremented in gomp_barrier_handle_tasks and decremented there too, and we use those for two purposes: 1) to find out if it makes sense to wake some threads possibly sleeping on a barrier 2) to find out if it is ok to call gomp_team_barrier_clear_task_pending Pre-gomp-4_0-branch, we were actually incrementing task_running_count in both gomp_barrier_handle_tasks and GOMP_taskwait (GOMP_taskgroup_end didn't exist yet), and thus task_running_count was actually usable for 2), but could be way off for 1) - task_running_count could be much bigger than nthreads, even when say all the running tasks were on one or two threads, all caused by GOMP_taskwait inside of task spawned from GOMP_taskwait, etc. I've changed it to only increment task_running_count in gomp_barrier_handle_tasks, so it is actually guaranteed to be <= nthreads and can be reliably used for 1), but that apparently broke the spots using it for 2). So, this patch introduces another counter, number of GOMP_TASK_WAITING tasks ready to be scheduled, but not scheduled yet, and we simply clear pending tasks bits when that counter decrements to zero.
Tested on x86_64-linux and i686-linux. 2013-10-08 Jakub Jelinek <ja...@redhat.com> * libgomp.h (struct gomp_team): Add task_queued_count field. Add comments about task_{,queued_,running_}count. * team.c (gomp_new_team): Clear task_queued_count. * task.c (GOMP_task): Increment task_queued_count. (gomp_task_run_pre): Decrement task_queued_count. If it is decremented to zero, call gomp_team_barrier_clear_task_pending. (gomp_task_run_post_handle_dependers): Increment task_queued_count. (gomp_barrier_handle_tasks): Don't call gomp_team_barrier_clear_task_pending here. --- libgomp/libgomp.h.jj 2013-10-04 13:48:39.000000000 +0200 +++ libgomp/libgomp.h 2013-10-08 19:17:12.303460999 +0200 @@ -385,8 +385,18 @@ struct gomp_team gomp_mutex_t task_lock; struct gomp_task *task_queue; - int task_count; - int task_running_count; + /* Number of all GOMP_TASK_{WAITING,TIED} tasks in the team. */ + unsigned int task_count; + /* Number of GOMP_TASK_WAITING tasks currently waiting to be scheduled. */ + unsigned int task_queued_count; + /* Number of GOMP_TASK_{WAITING,TIED} tasks currently running + directly in gomp_barrier_handle_tasks; tasks spawned + from e.g. GOMP_taskwait or GOMP_taskgroup_end don't count, even when + that is called from a task run from gomp_barrier_handle_tasks. + task_running_count should be always <= team->nthreads, + and if current task isn't in_tied_task, then it will be + even < team->nthreads. */ + unsigned int task_running_count; int work_share_cancelled; int team_cancelled; --- libgomp/team.c.jj 2013-10-04 18:44:31.000000000 +0200 +++ libgomp/team.c 2013-10-08 19:03:25.067640468 +0200 @@ -172,6 +172,7 @@ gomp_new_team (unsigned nthreads) gomp_mutex_init (&team->task_lock); team->task_queue = NULL; team->task_count = 0; + team->task_queued_count = 0; team->task_running_count = 0; team->work_share_cancelled = 0; team->team_cancelled = 0; --- libgomp/task.c.jj 2013-09-27 12:04:12.000000000 +0200 +++ libgomp/task.c 2013-10-08 19:50:46.057284830 +0200 @@ -407,6 +407,7 @@ GOMP_task (void (*fn) (void *), void *da team->task_queue = task; } ++team->task_count; + ++team->task_queued_count; gomp_team_barrier_set_task_pending (&team->barrier); do_wake = team->task_running_count + !parent->in_tied_task < team->nthreads; @@ -434,6 +435,8 @@ gomp_task_run_pre (struct gomp_task *chi team->task_queue = NULL; } child_task->kind = GOMP_TASK_TIED; + if (--team->task_queued_count == 0) + gomp_team_barrier_clear_task_pending (&team->barrier); if ((gomp_team_barrier_cancelled (&team->barrier) || (taskgroup && taskgroup->cancelled)) && !child_task->copy_ctors_done) @@ -538,6 +541,7 @@ gomp_task_run_post_handle_dependers (str team->task_queue = task; } ++team->task_count; + ++team->task_queued_count; ++ret; } free (child_task->dependers); @@ -670,8 +674,6 @@ gomp_barrier_handle_tasks (gomp_barrier_ } team->task_running_count++; child_task->in_tied_task = true; - if (team->task_count == team->task_running_count) - gomp_team_barrier_clear_task_pending (&team->barrier); } gomp_mutex_unlock (&team->task_lock); if (do_wake) Jakub