On Tue, Jan 22, 2013 at 02:20:23PM +0100, Torvald Riegel wrote:
> > @@ -116,11 +116,10 @@ GOMP_task (void (*fn) (void *), void *data, void (
> >     }
> >        else
> >     fn (data);
> > -      if (team != NULL)
> > +      if (task.children != NULL)
> 
> Why does this not need an acquire barrier?

I explained that in my patch submission.  Briefly, the only way this
particular task.children can be set is if this thread's task work
function creates children.  So since the setter is *this* thread, we
need no barriers.  We can have task.children set by the current
thread then changed by a child thread, but seeing a stale non-NULL
value is not a problem here.  Once past the task_lock acquisition,
this thread will see the real value of task.children.

> >     {
> >       gomp_mutex_lock (&team->task_lock);
> > -     if (task.children != NULL)
> 
> Are you sure that you can remove this check to task.children here,
> especially if you don't use an acquire barrier previously?
> 
> Can there be several threads trying to call gomp_clear_parent?

Yes and yes.  gomp_clear_parent handles a NULL arg, and as you can
see, runs inside a task_lock mutex region here.

> > -       gomp_clear_parent (task.children);
> > +     gomp_clear_parent (task.children);
> >       gomp_mutex_unlock (&team->task_lock);
> 
> Please add comments or reference the comments in the other pieces of
> synchronizing code related to this.

Please note that all of the above is a reversion!  Adding a comment
has the risk that the comment is *wrong*.  That said, I appreciate the
need to document threaded code well..

> > -  if (task == NULL || team == NULL)
> > +  if (task == NULL
> > +      || __atomic_load_n (&task->children, MEMMODEL_ACQUIRE) == NULL)
> 
> Please mention with which stores this acquire load synchronizes.

How does this look?

        * task.c (GOMP_task, GOMP_taskwait): Comment.

Index: libgomp/task.c
===================================================================
--- libgomp/task.c      (revision 195370)
+++ libgomp/task.c      (working copy)
@@ -116,6 +116,15 @@
        }
       else
        fn (data);
+      /* Access to "children" is normally done inside a task_lock
+        mutex region, but the only way this particular task.children
+        can be set is if this thread's task work function (fn)
+        creates children.  So since the setter is *this* thread, we
+        need no barriers here when testing for non-NULL.  We can have
+        task.children set by the current thread then changed by a
+        child thread, but seeing a stale non-NULL value is not a
+        problem.  Once past the task_lock acquisition, this thread
+        will see the real value of task.children.  */
       if (task.children != NULL)
        {
          gomp_mutex_lock (&team->task_lock);
@@ -296,6 +305,12 @@
   struct gomp_task *child_task = NULL;
   struct gomp_task *to_free = NULL;
 
+  /* The acquire barrier on load of task->children here synchronizes
+     with the write of a NULL in gomp_barrier_handle_tasks.  It is
+     not necessary that we synchronize with other non-NULL writes at
+     this point, but we must ensure that all writes to memory by a
+     child thread task work function are seen before we exit from
+     GOMP_taskwait.  */
   if (task == NULL
       || __atomic_load_n (&task->children, MEMMODEL_ACQUIRE) == NULL)
     return;

-- 
Alan Modra
Australia Development Lab, IBM

Reply via email to