[OMPI devel] SM initialization race condition

2008-08-21 Thread Terry Dontje
I've been seeing an intermittent segv (once every 4 hours when looping on a quick initialization program) with the following stack trace: =>[1] mca_btl_sm_add_procs(btl = 0xfd7ffdb67ef0, nprocs = 2U, procs = 0x591560, peers = 0x591580, reachability = 0xfd7fffdff000), line 519 in "btl_sm.c"

Re: [OMPI devel] SM initialization race condition

2008-08-21 Thread George Bosilca
Terry, We use the feature defined by POSIX mmap where the area should be zero-filled when the file length is extended. What OS are you using when you see such problems? Just in case, here is a patch that sets the beginning of the mmaped region to zero, in case this is not done
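The patch itself is not included in this archive snippet. As a rough illustration only, the idea of extending a backing file, mapping it, and then zeroing the start of the region explicitly, rather than relying on the OS to zero-fill it, could be sketched as follows. The helper name `map_and_zero` and its parameters are hypothetical and are not taken from Open MPI's sm BTL code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not Open MPI code): open or create a backing file,
 * extend it to `len` bytes, map it shared, and explicitly zero the first
 * `header_len` bytes of the mapping.
 * Returns the mapped address, or NULL on any failure. */
static void *map_and_zero(const char *path, size_t len, size_t header_len)
{
    if (header_len > len) {
        return NULL;
    }

    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0) {
        return NULL;
    }

    /* Extending the file is the step after which POSIX says the new
     * bytes read as zero. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return NULL;
    }

    void *base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (base == MAP_FAILED) {
        return NULL;
    }

    /* Belt and braces: zero the header explicitly. */
    memset(base, 0, header_len);
    return base;
}
```

This sketch covers a single process only. When several processes map the same file, something must still guarantee that the zeroing happens before any peer starts using the region; that coordination is assumed here and not shown.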

Re: [OMPI devel] SM initialization race condition

2008-08-21 Thread Jeff Squyres
IIRC, bzero is a gnu-ism. We should probably use memset instead. On Aug 21, 2008, at 5:40 AM, George Bosilca wrote: Terry, We use the feature defined by POSIX mmap where the area should be zero-filled when the file length is extended. What OS are you using when you see such problems?

Re: [OMPI devel] SM initialization race condition

2008-08-21 Thread Brian W. Barrett
bzero is not a gnu-ism -- it's in POSIX.1. Either bzero or memset is correct and used throughout OMPI. Brian On Thu, 21 Aug 2008, Jeff Squyres wrote: IIRC, bzero is a gnu-ism. We should probably use memset instead. On Aug 21, 2008, at 5:40 AM, George Bosilca wrote: Terry, We use the

Re: [OMPI devel] SM initialization race condition

2008-08-21 Thread George Bosilca
The bzero() function conforms to IEEE Std 1003.1-2001 (``POSIX.1''); the memset() function conforms to ISO/IEC 9899:1990 (``ISO C90''). Both functions are in the libc, so it's definitely difficult to see which one is better. george. On Aug 21, 2008, at 3:32 PM, Jeff Squyres wrote: IIRC, bzero

Re: [OMPI devel] SM initialization race condition

2008-08-21 Thread Tim Mattox
Actually, bzero() is POSIX. Here is the history section of the bzero man page on Mac OS X 10.4: A bzero() function appeared in 4.3BSD. Its prototype existed previously in <string.h> before it was moved to <strings.h> for IEEE Std 1003.1-2001 (``POSIX.1'') compliance. Hmmm, but the Linux man page says it

Re: [OMPI devel] SM initialization race condition

2008-08-21 Thread Terry Dontje
George Bosilca wrote: Terry, We use the feature defined by POSIX mmap where the area should be zero-filled when the file length is extended. What OS are you using when you see such problems? So far I've only tested this on Solaris. We'll try out the bzero to see if this goes away. --td

Re: [OMPI devel] SM initialization race condition

2008-08-21 Thread Tim Mattox
After a little Google searching, the best I can find is that memset is part of the C89/C90 standard, while bzero isn't. Thus memset would/should be supported even on non-POSIX systems. Also, the Open Group marks bzero as LEGACY and says "This function may be withdrawn in a future version."
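For reference, the substitution under discussion is small: `bzero(p, n)` and `memset(p, 0, n)` both write `n` zero bytes starting at `p`. A minimal sketch using only ISO C follows; the wrapper name `zero_bytes` is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero `n` bytes at `p` using only ISO C.
 * Has the same effect as bzero(p, n), without needing <strings.h>. */
static void zero_bytes(void *p, size_t n)
{
    memset(p, 0, n);
}
```

Because `memset` is declared in `<string.h>` by the C standard, this form should also compile on systems that do not provide the POSIX `<strings.h>` header.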