George, I believe the 64-bit atomics detection issue that you suspect is already covered by Nathan's PR1659 on master, and the fix will be PRed/CMRed for v2.x once that is merged.
Regardless of that, however, since the field is declared as "volatile int32_t" the opal_list code does need to be fixed to use 32-bit operations unconditionally, right?

-Paul

On Tue, May 10, 2016 at 12:36 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Paul,
>
> I think the ref_count should always be manipulated with atomic operations,
> otherwise we can't use them for internal, thread-safe, purposes. That being
> said the issue at hand seems a little different. The difference in the
> generated code between the opal_atomic_add and the OPAL_THREAD_ADD32, is
> that the macro is explicitly calling opal_atomic_add32, while the generic
> atomic_add has a switch inside (to select between atomic operations on
> different types). For the error you mention to happen our configure script
> must have detected that there is support for 8-byte atomic operations, thus
> setting OPAL_HAVE_ATOMIC_ADD_64 to 1.
>
> Can you take a look at the 64-bit atomic detection in the config.log and
> post here the corresponding output?
>
> Thanks,
> George.
>
>
> On Tue, May 10, 2016 at 1:38 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> I am currently working with the v2.x branch, rather than tarballs.
>>
>> While attempting to build on AIX (which is ILP32 by default), I
>> encountered an unexpected undefined reference to __sync_add_and_fetch_8()
>> from opal/class/opal_list.h.
>>
>> I found that when debugging is enabled (as in almost every build I try)
>> there is the following code:
>> #if OPAL_ENABLE_DEBUG
>>     /* Spot check: ensure this item is only on the list that we
>>        just insertted it into */
>>
>>     (void)opal_atomic_add( &(item->opal_list_item_refcount), 1 );
>>     assert(1 == item->opal_list_item_refcount);
>>     item->opal_list_item_belong_to = list;
>> #endif
>>
>> I am not sure why (and it may be an AIX-specific issue), but that
>> "opal_atomic_add()" is attempting a 64-bit add.
>> That is a problem, given that 'opal_list_item_refcount' is 32 bits!
>>
>> Noting that all other accesses to this field are OPAL_THREAD_ADD32(), I
>> suggest the following (with a bonus spell-check at no additional charge):
>>
>> --- opal/class/opal_list.c~ 2016-05-10 10:20:19.000000000 -0700
>> +++ opal/class/opal_list.c 2016-05-10 10:29:14.000000000 -0700
>> @@ -142,9 +142,9 @@
>>
>>  #if OPAL_ENABLE_DEBUG
>>      /* Spot check: ensure this item is only on the list that we
>> -       just insertted it into */
>> +       just inserted it into */
>>
>> -    (void)opal_atomic_add( &(item->opal_list_item_refcount), 1 );
>> +    (void)OPAL_THREAD_ADD32( &(item->opal_list_item_refcount), 1 );
>>      assert(1 == item->opal_list_item_refcount);
>>      item->opal_list_item_belong_to = list;
>>  #endif
>>
>>
>> Source inspection shows the same mixing of opal_atomic_add() vs
>> OPAL_THREAD_ADD32() in master.
>>
>> -Paul
>>
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Computer Languages & Systems Software (CLaSS) Group
>> Computer Science Department               Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/05/18952.php
>>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/05/18953.php
>

--
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
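
For anyone following along, here is a rough sketch of the difference George describes between a size-dispatched add and an explicit 32-bit add. It is not the Open MPI source: every sketch_* name is made up for illustration, SKETCH_HAVE_ATOMIC_ADD_64 merely stands in for the configure result OPAL_HAVE_ATOMIC_ADD_64, and the GCC-style __sync_add_and_fetch builtin is assumed to be available. The point is only that the generic form carries an 8-byte case whenever the feature macro is 1, so an unoptimized build may keep a reference to the 8-byte builtin even though the field being updated is 4 bytes wide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for what configure detected; not the real OPAL macro. */
#ifndef SKETCH_HAVE_ATOMIC_ADD_64
#define SKETCH_HAVE_ATOMIC_ADD_64 1
#endif

/* Fixed-width add: only ever references the 4-byte builtin. */
static inline int32_t sketch_add32(volatile int32_t *addr, int32_t delta)
{
    return __sync_add_and_fetch(addr, delta);
}

#if SKETCH_HAVE_ATOMIC_ADD_64
/* 8-byte add: present only when the feature macro says it is supported. */
static inline int64_t sketch_add64(volatile int64_t *addr, int64_t delta)
{
    return __sync_add_and_fetch(addr, delta);
}
#endif

/* Size-dispatched add: the 8-byte case is part of this function whenever
 * the feature macro is 1, even if every caller passes a 4-byte object. */
static inline void sketch_add_xx(volatile void *addr, int32_t delta, size_t len)
{
    switch (len) {
    case 4:
        (void)sketch_add32((volatile int32_t *)addr, delta);
        break;
#if SKETCH_HAVE_ATOMIC_ADD_64
    case 8:
        (void)sketch_add64((volatile int64_t *)addr, (int64_t)delta);
        break;
#endif
    default:
        abort();
    }
}

/* Generic front end: picks the case from the size of the pointed-to object. */
#define sketch_atomic_add(ADDR, DELTA) \
    sketch_add_xx((volatile void *)(ADDR), (int32_t)(DELTA), sizeof(*(ADDR)))

/* Both forms update a 32-bit counter identically at run time. */
static int32_t sketch_demo(void)
{
    volatile int32_t refcount = 0;
    sketch_atomic_add(&refcount, 1);   /* generic path, selects case 4 */
    (void)sketch_add32(&refcount, 1);  /* explicit 32-bit path */
    assert(2 == refcount);
    return refcount;
}
```

At run time the two calls in sketch_demo() behave the same; the difference is only in which builtins the translation unit can end up referencing, which is why calling the explicit 32-bit form on a 32-bit field sidesteps the question of whether 64-bit support was detected correctly.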