On Tue, Jan 22, 2013 at 02:20:23PM +0100, Torvald Riegel wrote: > > @@ -116,11 +116,10 @@ GOMP_task (void (*fn) (void *), void *data, void ( > > } > > else > > fn (data); > > - if (team != NULL) > > + if (task.children != NULL) > > Why does this not need an acquire barrier?
I explained that in my patch submission. Briefly, the only way this particular task.children can be set is if this thread's task work function creates children. So since the setter is *this* thread, we need no barriers. We can have task.children set by the current thread then changed by a child thread, but seeing a stale non-NULL value is not a problem here. Once past the task_lock acquisition, this thread will see the real value of task.children. > > { > > gomp_mutex_lock (&team->task_lock); > > - if (task.children != NULL) > > Are you sure that you can remove this check to task.children here, > especially if you don't use an acquire barrier previously? > > Can there be several threads trying to call gomp_clear_parent? Yes and yes. gomp_clear_parent handles a NULL arg, and as you can see, runs inside a task_lock mutex region here. > > - gomp_clear_parent (task.children); > > + gomp_clear_parent (task.children); > > gomp_mutex_unlock (&team->task_lock); > > Please add comments or reference the comments in the other pieces of > synchronizing code related to this. Please note that all of the above is a reversion! Adding a comment has the risk that the comment is *wrong*. That said, I appreciate the need to document threaded code well.. > > - if (task == NULL || team == NULL) > > + if (task == NULL > > + || __atomic_load_n (&task->children, MEMMODEL_ACQUIRE) == NULL) > > Please mention with which stores this acquire load synchronizes. How does this look? * task.c (GOMP_task, GOMP_taskwait): Comment. Index: libgomp/task.c =================================================================== --- libgomp/task.c (revision 195370) +++ libgomp/task.c (working copy) @@ -116,6 +116,15 @@ } else fn (data); + /* Access to "children" is normally done inside a task_lock + mutex region, but the only way this particular task.children + can be set is if this thread's task work function (fn) + creates children. So since the setter is *this* thread, we + need no barriers here when testing for non-NULL. We can have + task.children set by the current thread then changed by a + child thread, but seeing a stale non-NULL value is not a + problem. Once past the task_lock acquisition, this thread + will see the real value of task.children. */ if (task.children != NULL) { gomp_mutex_lock (&team->task_lock); @@ -296,6 +305,12 @@ struct gomp_task *child_task = NULL; struct gomp_task *to_free = NULL; + /* The acquire barrier on load of task->children here synchronizes + with the write of a NULL in gomp_barrier_handle_tasks. It is + not necessary that we synchronize with other non-NULL writes at + this point, but we must ensure that all writes to memory by a + child thread task work function are seen before we exit from + GOMP_taskwait. */ if (task == NULL || __atomic_load_n (&task->children, MEMMODEL_ACQUIRE) == NULL) return; -- Alan Modra Australia Development Lab, IBM