Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-18 Thread Jeff Squyres
Thanks George.  I assume we need this in 1.4.2 and 1.5, right?

On Feb 17, 2010, at 6:15 PM, George Bosilca wrote:

> I usually prefer the expanded notation:
> 
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>  int32_t oldval, int32_t newval)
> {
> unsigned char ret;
> __asm__ __volatile__ (
>   "lock; cmpxchgl %3,%4   \n\t"
>   "  sete %0  \n\t"
>   : "=qm" (ret), "=a" (oldval), "=m" (*addr)
>   : "q"(newval), "m"(*addr), "1"(oldval)
>   : "memory", "cc");
> 
> return (int)ret;
> }
> 
> as it shows more clearly the input and output registers. But your version 
> does exactly the same thing. I'll commit shortly.
> 
> Thanks,
>   george.
> 
> On Feb 10, 2010, at 10:55 , Ake Sandgren wrote:
> 
> > On Wed, 2010-02-10 at 08:42 -0700, Barrett, Brian W wrote:
> >> Adding the memory and cc will certainly do no harm, and someone tried to 
> >> remove them as an optimization.  I wouldn't change the input and output 
> >> lines - the differences are mainly syntactic sugar.
> >
> > Gcc actually didn't like the example I sent earlier.
> > Another iteration gave this as working code (verified with
> > gcc/intel/pgi/pathscale):
> >
> > static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
> > int32_t oldval, int32_t newval)
> > {
> >unsigned char ret;
> >__asm__ __volatile__ (
> >SMPLOCK "cmpxchgl %3,%2   \n\t"
> >"sete %0  \n\t"
> >: "=qm" (ret), "+a" (oldval), "+m" (*addr)
> >: "q"(newval)
> >: "memory", "cc");
> >
> >return (int)ret;
> > }
> >
> > --
> > Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
> > Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
> > Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
> >
> 
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-17 Thread George Bosilca
I usually prefer the expanded notation:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
unsigned char ret;
__asm__ __volatile__ (
  "lock; cmpxchgl %3,%4   \n\t"
  "  sete %0  \n\t"
  : "=qm" (ret), "=a" (oldval), "=m" (*addr)
  : "q"(newval), "m"(*addr), "1"(oldval)
  : "memory", "cc");

return (int)ret;
}

as it shows more clearly the input and output registers. But your version does 
exactly the same thing. I'll commit shortly.
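For readers tracing the operand numbers in the expanded form, the mapping
implied by the constraint strings is as follows (an annotation based on
standard GCC inline-asm and x86 cmpxchg semantics, not part of the
original message):

/* %0 = ret     "=qm"  byte written by sete
 * %1 = oldval  "=a"   EAX; cmpxchg compares EAX against the memory operand
 * %2 = *addr   "=m"   memory output
 * %3 = newval  "q"    value stored to memory on success
 * %4 = *addr   "m"    memory input (same location as %2)
 * %5 = oldval  "1"    input tied to operand 1, i.e. preloaded into EAX
 *
 * lock; cmpxchgl %3,%4:
 *   if (EAX == %4) { %4 = %3; ZF = 1; } else { EAX = %4; ZF = 0; }
 * sete %0 then captures ZF as the function's return value. */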

Thanks,
  george.

On Feb 10, 2010, at 10:55 , Ake Sandgren wrote:

> On Wed, 2010-02-10 at 08:42 -0700, Barrett, Brian W wrote:
>> Adding the memory and cc will certainly do no harm, and someone tried to 
>> remove them as an optimization.  I wouldn't change the input and output 
>> lines - the differences are mainly syntactic sugar.
> 
> Gcc actually didn't like the example I sent earlier.
> Another iteration gave this as working code (verified with
> gcc/intel/pgi/pathscale):
> 
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
> int32_t oldval, int32_t newval)
> {
>unsigned char ret;
>__asm__ __volatile__ (
>SMPLOCK "cmpxchgl %3,%2   \n\t"
>"sete %0  \n\t"
>: "=qm" (ret), "+a" (oldval), "+m" (*addr)
>: "q"(newval)
>: "memory", "cc");
> 
>return (int)ret;
> }
> 
> -- 
> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
> Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
> 




Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Ake Sandgren
On Wed, 2010-02-10 at 08:42 -0700, Barrett, Brian W wrote:
> Adding the memory and cc will certainly do no harm, and someone tried to 
> remove them as an optimization.  I wouldn't change the input and output lines 
> - the differences are mainly syntactic sugar.

Gcc actually didn't like the example I sent earlier.
Another iteration gave this as working code (verified with
gcc/intel/pgi/pathscale):

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
unsigned char ret;
__asm__ __volatile__ (
SMPLOCK "cmpxchgl %3,%2   \n\t"
"sete %0  \n\t"
: "=qm" (ret), "+a" (oldval), "+m" (*addr)
: "q"(newval)
: "memory", "cc");

return (int)ret;
}
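
A standalone harness along these lines exercises the function (a sketch,
not from the thread; it assumes x86/x86-64 and defines SMPLOCK locally to
the "lock; " prefix discussed later in this thread):

#include <stdint.h>
#include <stdio.h>

#define SMPLOCK "lock; "

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
    SMPLOCK "cmpxchgl %3,%2   \n\t"
    "sete %0  \n\t"
    : "=qm" (ret), "+a" (oldval), "+m" (*addr)
    : "q"(newval)
    : "memory", "cc");

    return (int)ret;
}

int main(void)
{
    volatile int32_t x = 5;
    /* Matching compare: should print 1, and x becomes 7. */
    printf("%d (x=%d)\n", opal_atomic_cmpset_32(&x, 5, 7), (int)x);
    /* Stale compare: should print 0, and x stays 7. */
    printf("%d (x=%d)\n", opal_atomic_cmpset_32(&x, 5, 9), (int)x);
    return 0;
}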

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Barrett, Brian W
Adding the memory and cc will certainly do no harm, and someone tried to remove 
them as an optimization.  I wouldn't change the input and output lines - the 
differences are mainly syntactic sugar.

Brian

On Feb 10, 2010, at 7:04 AM, Ake Sandgren wrote:

> On Wed, 2010-02-10 at 08:21 -0500, Jeff Squyres wrote:
>> On Feb 10, 2010, at 7:47 AM, Ake Sandgren wrote:
>> 
>>> According to people who know asm statements fairly well (compiler
>>> developers), it should be
>> 
>>> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>>> int32_t oldval, int32_t newval)
>>> {
>>>unsigned char ret;
>>>__asm__ __volatile__ (
>>>SMPLOCK "cmpxchgl %3,%2   \n\t"
>>>"sete %0  \n\t"
>>>: "=qm" (ret), "=a" (oldval), "=m" (*addr)
>>>: "q"(newval), "2"(*addr), "1"(oldval)
>>>: "memory", "cc");
>>> 
>>>return (int)ret;
>>> }
>> 
>> Disclaimer: I know almost nothing about assembly.
>> 
>> I know that OMPI's asm is a carefully crafted set of assembly that works 
>> across a broad range of compilers.  So what might not be "quite right" for 
>> one compiler may actually be there because another compiler needs it.
>> 
>> That being said, if the changes above are for correctness, not 
>> neatness/style/etc., I can't speak for that...
> 
> The above should be correct for gcc style unless I misunderstood them.
> 
> Quoting from their reply:
> 'it should be "memory", "cc" since you also have to tell gcc you're
> clobbering the EFLAGS'
> 
> And I don't know asm either so...






Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Ake Sandgren
On Wed, 2010-02-10 at 08:21 -0500, Jeff Squyres wrote:
> On Feb 10, 2010, at 7:47 AM, Ake Sandgren wrote:
> 
> > According to people who know asm statements fairly well (compiler
> > developers), it should be
> 
> > static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
> >  int32_t oldval, int32_t newval)
> > {
> > unsigned char ret;
> > __asm__ __volatile__ (
> > SMPLOCK "cmpxchgl %3,%2   \n\t"
> > "sete %0  \n\t"
> > : "=qm" (ret), "=a" (oldval), "=m" (*addr)
> > : "q"(newval), "2"(*addr), "1"(oldval)
> > : "memory", "cc");
> > 
> > return (int)ret;
> > }
> 
> Disclaimer: I know almost nothing about assembly.
> 
> I know that OMPI's asm is a carefully crafted set of assembly that works 
> across a broad range of compilers.  So what might not be "quite right" for 
> one compiler may actually be there because another compiler needs it.
> 
> That being said, if the changes above are for correctness, not 
> neatness/style/etc., I can't speak for that...

The above should be correct for gcc style unless I misunderstood them.

Quoting from their reply:
'it should be "memory", "cc" since you also have to tell gcc you're
clobbering the EFLAGS'

And I don't know asm either so...
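
To see the EFLAGS point in isolation, here is a minimal fragment written
for illustration (hypothetical code, not from OMPI): cmpl writes EFLAGS
and sete reads them, so "cc" tells the compiler that any condition codes
it was tracking across the asm are gone:

static inline int ints_equal(int a, int b)
{
    unsigned char ret;
    __asm__ __volatile__ (
        "cmpl %2, %1 \n\t"   /* sets EFLAGS from a - b */
        "sete %0     \n\t"   /* ret = 1 iff ZF set, i.e. a == b */
        : "=q" (ret)
        : "r" (a), "r" (b)
        : "cc");             /* EFLAGS clobbered */
    return (int)ret;
}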

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Jeff Squyres
On Feb 10, 2010, at 7:47 AM, Ake Sandgren wrote:

> According to people who know asm statements fairly well (compiler
> developers), it should be

> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>  int32_t oldval, int32_t newval)
> {
> unsigned char ret;
> __asm__ __volatile__ (
> SMPLOCK "cmpxchgl %3,%2   \n\t"
> "sete %0  \n\t"
> : "=qm" (ret), "=a" (oldval), "=m" (*addr)
> : "q"(newval), "2"(*addr), "1"(oldval)
> : "memory", "cc");
> 
> return (int)ret;
> }

Disclaimer: I know almost nothing about assembly.

I know that OMPI's asm is a carefully crafted set of assembly that works across 
a broad range of compilers.  So what might not be "quite right" for one 
compiler may actually be there because another compiler needs it.

That being said, if the changes above are for correctness, not 
neatness/style/etc., I can't speak for that...

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Ake Sandgren
On Tue, 2010-02-09 at 14:44 -0800, Mostyn Lewis wrote:
> The old opal_atomic_cmpset_32 worked:
> 
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>  int32_t oldval, int32_t newval)
> {
> unsigned char ret;
> __asm__ __volatile__ (
> SMPLOCK "cmpxchgl %1,%2   \n\t"
> "sete %0  \n\t"
> : "=qm" (ret)
> : "q"(newval), "m"(*addr), "a"(oldval)
> : "memory");
> 
> return (int)ret; 
> }
> 
> The new opal_atomic_cmpset_32 fails:
> 
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>  int32_t oldval, int32_t newval)
> {
> unsigned char ret;
> __asm__ __volatile__ (
> SMPLOCK "cmpxchgl %3,%4   \n\t"
> "sete %0  \n\t"
> : "=qm" (ret), "=a" (oldval), "=m" (*addr)
> : "q"(newval), "m"(*addr), "1"(oldval));
> return (int)ret;
> }
> 
> **However** if you put back the "clobber" for memory line (3rd :), it works:
> 
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>  int32_t oldval, int32_t newval)
> {
> unsigned char ret;
> __asm__ __volatile__ (
> SMPLOCK "cmpxchgl %3,%4   \n\t"
> "sete %0  \n\t"
> : "=qm" (ret), "=a" (oldval), "=m" (*addr)
> : "q"(newval), "m"(*addr), "1"(oldval)
> : "memory");
> 
> return (int)ret;
> }
> 
> This works in a test case for pathcc, gcc, icc, pgcc, SUN studio cc and
> open64 (pathscale lineage - which also fails with 1.4.1).
> Also the SMPLOCK above is defined as "lock; " - the ";" is a GNU as
> statement delimiter - is that right? Seems to work with/without the ";".
> 
> 
> Also, a question - I see you generate via perl another "lock" asm file
> which you put into opal/asm/generated/ and stick into libasm - what you
> generate there for whatever usage hasn't changed 1.4->1.4.1->svn trunk?

According to people who know asm statements fairly well (compiler
developers), it should be
static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
unsigned char ret;
__asm__ __volatile__ (
SMPLOCK "cmpxchgl %3,%2   \n\t"
"sete %0  \n\t"
: "=qm" (ret), "=a" (oldval), "=m" (*addr)
: "q"(newval), "2"(*addr), "1"(oldval)
: "memory", "cc");

return (int)ret;
}
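
For reference, the digit constraints in that version follow standard GCC
matching-constraint semantics (an annotation, not part of the original
message):

/* "1"(oldval): input forced into the same location as operand 1
 *              ("=a" (oldval)), i.e. preloaded into EAX.
 * "2"(*addr):  input at the same location as operand 2 ("=m" (*addr)),
 *              telling the compiler the memory is read as well as
 *              written. */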

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Terry Dontje

Jeff Squyres wrote:

Iain did the genius for the new assembly.  Iain -- can you respond?

Iain is on vacation right now so he probably won't be able to respond
until next week.


--td

On Feb 9, 2010, at 5:44 PM, Mostyn Lewis wrote:

  

The old opal_atomic_cmpset_32 worked:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
unsigned char ret;
__asm__ __volatile__ (
SMPLOCK "cmpxchgl %1,%2   \n\t"
"sete %0  \n\t"
: "=qm" (ret)
: "q"(newval), "m"(*addr), "a"(oldval)
: "memory");

return (int)ret;
}

The new opal_atomic_cmpset_32 fails:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
unsigned char ret;
__asm__ __volatile__ (
SMPLOCK "cmpxchgl %3,%4   \n\t"
"sete %0  \n\t"
: "=qm" (ret), "=a" (oldval), "=m" (*addr)
: "q"(newval), "m"(*addr), "1"(oldval));
return (int)ret;
}

**However** if you put back the "clobber" for memory line (3rd :), it works:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
unsigned char ret;
__asm__ __volatile__ (
SMPLOCK "cmpxchgl %3,%4   \n\t"
"sete %0  \n\t"
: "=qm" (ret), "=a" (oldval), "=m" (*addr)
: "q"(newval), "m"(*addr), "1"(oldval)
: "memory");

return (int)ret;
}

This works in a test case for pathcc, gcc, icc, pgcc, SUN studio cc and
open64 (pathscale lineage - which also fails with 1.4.1).
Also the SMPLOCK above is defined as "lock; " - the ";" is a GNU as
statement delimiter - is that right? Seems to work with/without the ";".


Also, a question - I see you generate via perl another "lock" asm file
which you put into opal/asm/generated/ and stick into libasm - what you
generate there for whatever usage hasn't changed 1.4->1.4.1->svn trunk?

DM

On Tue, 9 Feb 2010, Jeff Squyres wrote:



Perhaps someone with a pathscale compiler support contract can investigate this 
with them.

Have them contact us if they want/need help understanding our atomics; we're 
happy to explain, etc. (the atomics are fairly localized to a small part of 
OMPI).



On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:

  

All,

FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn trunk) - actually 
looping -

from gdb:

opal_progress_event_users_decrement () at 
../.././opal/include/opal/sys/atomic_impl.h:61
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
Current language:  auto; currently asm
(gdb) where
#0  opal_progress_event_users_decrement () at 
../.././opal/include/opal/sys/atomic_impl.h:61
#1  0x0001 in ?? ()
#2  0x2aec4cf6a5e0 in ?? ()
#3  0x00eb in ?? ()
#4  0x2aec4cfb57e0 in ompi_mpi_init () at 
../.././ompi/runtime/ompi_mpi_init.c:818
#5  0x7fff5db3bd58 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) list
56  {
57 int32_t oldval;
58
59 do {
60    oldval = *addr;
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
62 return (oldval - delta);
63  }
64  #endif  /* OPAL_HAVE_ATOMIC_SUB_32 */
65
(gdb)

DM

On Tue, 9 Feb 2010, Jeff Squyres wrote:



FWIW, I have had terrible luck with the pathscale compiler over the years.  
Repeated attempts to get support from them -- even when I was a paying customer 
-- resulted in no help (e.g., a pathCC bug with the OMPI C++ bindings that I 
filed years ago was never resolved).

Is this compiler even supported anymore?  I.e., is there a support department 
somewhere that you have a hope of getting any help from?

I can't say for sure, of course, but if MPI hello world hangs, it smells like a compiler 
bug.  You might want to attach to "hello world" in a debugger and see where 
it's hung.  You might need to compile OMPI with debugging symbols to get any meaningful 
information.

** NOTE: My personal feelings about the pathscale compiler suite do not reflect 
anyone else's feelings in the Open MPI community.  Perhaps someone could change 
my mind someday, but *I* have personally given up on this compiler.  :-(


On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:

  

Hello,

It does work with version 1.4. This is the hello world that hangs with
1.4.1:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  int node, size;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &node);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  printf("Hello World from Node %d of %d.\n", node, size);

  MPI_Finalize();
  return 0;
}

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Iain Bason
Well, I am by no means an expert on the GNU-style asm directives.  I  
believe someone else (George Bosilca?) tweaked what I had suggested.


That being said, I think the memory "clobber" is harmless.

Iain

On Feb 9, 2010, at 5:51 PM, Jeff Squyres wrote:


Iain did the genius for the new assembly.  Iain -- can you respond?


On Feb 9, 2010, at 5:44 PM, Mostyn Lewis wrote:


The old opal_atomic_cmpset_32 worked:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
    SMPLOCK "cmpxchgl %1,%2   \n\t"
    "sete %0  \n\t"
    : "=qm" (ret)
    : "q"(newval), "m"(*addr), "a"(oldval)
    : "memory");

    return (int)ret;
}

The new opal_atomic_cmpset_32 fails:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
    SMPLOCK "cmpxchgl %3,%4   \n\t"
    "sete %0  \n\t"
    : "=qm" (ret), "=a" (oldval), "=m" (*addr)
    : "q"(newval), "m"(*addr), "1"(oldval));

    return (int)ret;
}

**However** if you put back the "clobber" for memory line (3rd :), it works:


static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
    SMPLOCK "cmpxchgl %3,%4   \n\t"
    "sete %0  \n\t"
    : "=qm" (ret), "=a" (oldval), "=m" (*addr)
    : "q"(newval), "m"(*addr), "1"(oldval)
    : "memory");

    return (int)ret;
}

This works in a test case for pathcc, gcc, icc, pgcc, SUN studio cc and
open64 (pathscale lineage - which also fails with 1.4.1).
Also the SMPLOCK above is defined as "lock; " - the ";" is a GNU as
statement delimiter - is that right? Seems to work with/without the ";".


Also, a question - I see you generate via perl another "lock" asm file
which you put into opal/asm/generated/ and stick into libasm - what you
generate there for whatever usage hasn't changed 1.4->1.4.1->svn trunk?


DM

On Tue, 9 Feb 2010, Jeff Squyres wrote:

Perhaps someone with a pathscale compiler support contract can  
investigate this with them.


Have them contact us if they want/need help understanding our  
atomics; we're happy to explain, etc. (the atomics are fairly  
localized to a small part of OMPI).




On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:


All,

FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn  
trunk) - actually looping -


from gdb:

opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));

Current language:  auto; currently asm
(gdb) where
#0  opal_progress_event_users_decrement () at ../.././opal/include/opal/sys/atomic_impl.h:61
#1  0x0001 in ?? ()
#2  0x2aec4cf6a5e0 in ?? ()
#3  0x00eb in ?? ()
#4  0x2aec4cfb57e0 in ompi_mpi_init () at ../.././ompi/runtime/ompi_mpi_init.c:818
#5  0x7fff5db3bd58 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) list
56  {
57 int32_t oldval;
58
59 do {
60    oldval = *addr;
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
62 return (oldval - delta);
63  }
64  #endif  /* OPAL_HAVE_ATOMIC_SUB_32 */
65
(gdb)

DM

On Tue, 9 Feb 2010, Jeff Squyres wrote:

FWIW, I have had terrible luck with the pathscale compiler over the
years.  Repeated attempts to get support from them -- even when I was a
paying customer -- resulted in no help (e.g., a pathCC bug with the OMPI
C++ bindings that I filed years ago was never resolved).


Is this compiler even supported anymore?  I.e., is there a  
support department somewhere that you have a hope of getting any  
help from?


I can't say for sure, of course, but if MPI hello world hangs,  
it smells like a compiler bug.  You might want to attach to  
"hello world" in a debugger and see where it's hung.  You might  
need to compile OMPI with debugging symbols to get any  
meaningful information.


** NOTE: My personal feelings about the pathscale compiler suite  
do not reflect anyone else's feelings in the Open MPI  
community.  Perhaps someone could change my mind someday, but  
*I* have personally given up on this compiler.  :-(



On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:


Hello,

It does work with version 1.4. This is the hello world that hangs with
1.4.1:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
 int node, size;

 MPI_Init(&argc, &argv);

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Jeff Squyres
Iain did the genius for the new assembly.  Iain -- can you respond?


On Feb 9, 2010, at 5:44 PM, Mostyn Lewis wrote:

> The old opal_atomic_cmpset_32 worked:
> 
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>  int32_t oldval, int32_t newval)
> {
> unsigned char ret;
> __asm__ __volatile__ (
> SMPLOCK "cmpxchgl %1,%2   \n\t"
> "sete %0  \n\t"
> : "=qm" (ret)
> : "q"(newval), "m"(*addr), "a"(oldval)
> : "memory");
> 
> return (int)ret;
> }
> 
> The new opal_atomic_cmpset_32 fails:
> 
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>  int32_t oldval, int32_t newval)
> {
> unsigned char ret;
> __asm__ __volatile__ (
> SMPLOCK "cmpxchgl %3,%4   \n\t"
> "sete %0  \n\t"
> : "=qm" (ret), "=a" (oldval), "=m" (*addr)
> : "q"(newval), "m"(*addr), "1"(oldval));
> return (int)ret;
> }
> 
> **However** if you put back the "clobber" for memory line (3rd :), it works:
> 
> static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
>  int32_t oldval, int32_t newval)
> {
> unsigned char ret;
> __asm__ __volatile__ (
> SMPLOCK "cmpxchgl %3,%4   \n\t"
> "sete %0  \n\t"
> : "=qm" (ret), "=a" (oldval), "=m" (*addr)
> : "q"(newval), "m"(*addr), "1"(oldval)
> : "memory");
> 
> return (int)ret;
> }
> 
> This works in a test case for pathcc, gcc, icc, pgcc, SUN studio cc and
> open64 (pathscale lineage - which also fails with 1.4.1).
> Also the SMPLOCK above is defined as "lock; " - the ";" is a GNU as
> statement delimiter - is that right? Seems to work with/without the ";".
> 
> 
> Also, a question - I see you generate via perl another "lock" asm file
> which you put into opal/asm/generated/ and stick into libasm - what you
> generate there for whatever usage hasn't changed 1.4->1.4.1->svn trunk?
> 
> DM
> 
> On Tue, 9 Feb 2010, Jeff Squyres wrote:
> 
> > Perhaps someone with a pathscale compiler support contract can investigate 
> > this with them.
> >
> > Have them contact us if they want/need help understanding our atomics; 
> > we're happy to explain, etc. (the atomics are fairly localized to a small 
> > part of OMPI).
> >
> >
> >
> > On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:
> >
> >> All,
> >>
> >> FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn trunk) - 
> >> actually looping -
> >>
> >> from gdb:
> >>
> >> opal_progress_event_users_decrement () at 
> >> ../.././opal/include/opal/sys/atomic_impl.h:61
> >> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
> >> Current language:  auto; currently asm
> >> (gdb) where
> >> #0  opal_progress_event_users_decrement () at 
> >> ../.././opal/include/opal/sys/atomic_impl.h:61
> >> #1  0x0001 in ?? ()
> >> #2  0x2aec4cf6a5e0 in ?? ()
> >> #3  0x00eb in ?? ()
> >> #4  0x2aec4cfb57e0 in ompi_mpi_init () at 
> >> ../.././ompi/runtime/ompi_mpi_init.c:818
> >> #5  0x7fff5db3bd58 in ?? ()
> >> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> >> (gdb) list
> >> 56  {
> >> 57 int32_t oldval;
> >> 58
> >> 59 do {
> >> 60    oldval = *addr;
> >> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
> >> 62 return (oldval - delta);
> >> 63  }
> >> 64  #endif  /* OPAL_HAVE_ATOMIC_SUB_32 */
> >> 65
> >> (gdb)
> >>
> >> DM
> >>
> >> On Tue, 9 Feb 2010, Jeff Squyres wrote:
> >>
> >>> FWIW, I have had terrible luck with the pathscale compiler over the 
> >>> years.  Repeated attempts to get support from them -- even when I was a 
> >>> paying customer -- resulted in no help (e.g., a pathCC bug with the OMPI 
> >>> C++ bindings that I filed years ago was never resolved).
> >>>
> >>> Is this compiler even supported anymore?  I.e., is there a support 
> >>> department somewhere that you have a hope of getting any help from?
> >>>
> >>> I can't say for sure, of course, but if MPI hello world hangs, it smells 
> >>> like a compiler bug.  You might want to attach to "hello world" in a 
> >>> debugger and see where it's hung.  You might need to compile OMPI with 
> >>> debugging symbols to get any meaningful information.
> >>>
> >>> ** NOTE: My personal feelings about the pathscale compiler suite do not 
> >>> reflect anyone else's feelings in the Open MPI community.  Perhaps 
> >>> someone could change my mind someday, but *I* have personally given up on 
> >>> this compiler.  :-(
> >>>
> >>>
> >>> On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:
> >>>
>  Hello,

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Mostyn Lewis

The old opal_atomic_cmpset_32 worked:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
    SMPLOCK "cmpxchgl %1,%2   \n\t"
    "sete %0  \n\t"
    : "=qm" (ret)
    : "q"(newval), "m"(*addr), "a"(oldval)
    : "memory");

    return (int)ret;
}


The new opal_atomic_cmpset_32 fails:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
    SMPLOCK "cmpxchgl %3,%4   \n\t"
    "sete %0  \n\t"
    : "=qm" (ret), "=a" (oldval), "=m" (*addr)
    : "q"(newval), "m"(*addr), "1"(oldval));

    return (int)ret;
}

**However** if you put back the "clobber" for memory line (3rd :), it works:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
 int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
    SMPLOCK "cmpxchgl %3,%4   \n\t"
    "sete %0  \n\t"
    : "=qm" (ret), "=a" (oldval), "=m" (*addr)
    : "q"(newval), "m"(*addr), "1"(oldval)
    : "memory");

    return (int)ret;
}

This works in a test case for pathcc, gcc, icc, pgcc, SUN studio cc and
open64 (pathscale lineage - which also fails with 1.4.1).
Also the SMPLOCK above is defined as "lock; " - the ";" is a GNU as
statement delimiter - is that right? Seems to work with/without the ";".
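
On the ";" question, a note for reference (standard GNU as behavior, not
an answer given in the thread):

/* ';' is a statement separator in GNU as, and a 'lock' statement on its
 * own simply emits the 0xF0 prefix byte in front of the next
 * instruction, so both spellings assemble to the same bytes:
 *
 *     lock; cmpxchgl %ecx,(%rdx)
 *     lock cmpxchgl %ecx,(%rdx)
 */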


Also, a question - I see you generate via perl another "lock" asm file
which you put into opal/asm/generated/ and stick into libasm - what you
generate there for whatever usage hasn't changed 1.4->1.4.1->svn trunk?

DM

On Tue, 9 Feb 2010, Jeff Squyres wrote:


Perhaps someone with a pathscale compiler support contract can investigate this 
with them.

Have them contact us if they want/need help understanding our atomics; we're 
happy to explain, etc. (the atomics are fairly localized to a small part of 
OMPI).



On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:


All,

FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn trunk) - actually 
looping -

from gdb:

opal_progress_event_users_decrement () at 
../.././opal/include/opal/sys/atomic_impl.h:61
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
Current language:  auto; currently asm
(gdb) where
#0  opal_progress_event_users_decrement () at 
../.././opal/include/opal/sys/atomic_impl.h:61
#1  0x0001 in ?? ()
#2  0x2aec4cf6a5e0 in ?? ()
#3  0x00eb in ?? ()
#4  0x2aec4cfb57e0 in ompi_mpi_init () at 
../.././ompi/runtime/ompi_mpi_init.c:818
#5  0x7fff5db3bd58 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) list
56  {
57 int32_t oldval;
58
59 do {
60    oldval = *addr;
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
62 return (oldval - delta);
63  }
64  #endif  /* OPAL_HAVE_ATOMIC_SUB_32 */
65
(gdb)

DM

On Tue, 9 Feb 2010, Jeff Squyres wrote:


FWIW, I have had terrible luck with the pathscale compiler over the years.  
Repeated attempts to get support from them -- even when I was a paying customer 
-- resulted in no help (e.g., a pathCC bug with the OMPI C++ bindings that I 
filed years ago was never resolved).

Is this compiler even supported anymore?  I.e., is there a support department 
somewhere that you have a hope of getting any help from?

I can't say for sure, of course, but if MPI hello world hangs, it smells like a compiler 
bug.  You might want to attach to "hello world" in a debugger and see where 
it's hung.  You might need to compile OMPI with debugging symbols to get any meaningful 
information.

** NOTE: My personal feelings about the pathscale compiler suite do not reflect 
anyone else's feelings in the Open MPI community.  Perhaps someone could change 
my mind someday, but *I* have personally given up on this compiler.  :-(


On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:


Hello,

It does work with version 1.4. This is the hello world that hangs with
1.4.1:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  int node, size;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &node);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  printf("Hello World from Node %d of %d.\n", node, size);

  MPI_Finalize();
  return 0;
}

On Tue, 26-01-2010 at 03:57 -0500, Åke Sandgren wrote:

1 - Do you have problems with openmpi 1.4 too? (I don't, haven't built
1.4.1 yet)
2 - There is a bug in the pathscale compiler with -fPIC and -g that
generates incorrect dwarf2 data so debuggers get really confused and
will have BIG problems debugging the code. I'm chasing them to get a
fix...

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Åke Sandgren
On Tue, 2010-02-09 at 13:42 -0500, Jeff Squyres wrote:
> Perhaps someone with a pathscale compiler support contract can investigate 
> this with them.
> 
> Have them contact us if they want/need help understanding our atomics; we're 
> happy to explain, etc. (the atomics are fairly localized to a small part of 
> OMPI).

I will surely do that.
It will take a few days though due to lots of other work.

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Jeff Squyres
Perhaps someone with a pathscale compiler support contract can investigate this 
with them.

Have them contact us if they want/need help understanding our atomics; we're 
happy to explain, etc. (the atomics are fairly localized to a small part of 
OMPI).



On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:

> All,
> 
> FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn trunk) - 
> actually looping -
> 
> from gdb:
> 
> opal_progress_event_users_decrement () at 
> ../.././opal/include/opal/sys/atomic_impl.h:61
> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
> Current language:  auto; currently asm
> (gdb) where
> #0  opal_progress_event_users_decrement () at 
> ../.././opal/include/opal/sys/atomic_impl.h:61
> #1  0x0001 in ?? ()
> #2  0x2aec4cf6a5e0 in ?? ()
> #3  0x00eb in ?? ()
> #4  0x2aec4cfb57e0 in ompi_mpi_init () at 
> ../.././ompi/runtime/ompi_mpi_init.c:818
> #5  0x7fff5db3bd58 in ?? ()
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> (gdb) list
> 56  {
> 57 int32_t oldval;
> 58
> 59 do {
> 60    oldval = *addr;
> 61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
> 62 return (oldval - delta);
> 63  }
> 64  #endif  /* OPAL_HAVE_ATOMIC_SUB_32 */
> 65
> (gdb)
> 
> DM
> 
> On Tue, 9 Feb 2010, Jeff Squyres wrote:
> 
> > FWIW, I have had terrible luck with the pathscale compiler over the years.  
> > Repeated attempts to get support from them -- even when I was a paying 
> > customer -- resulted in no help (e.g., a pathCC bug with the OMPI C++ 
> > bindings that I filed years ago was never resolved).
> >
> > Is this compiler even supported anymore?  I.e., is there a support 
> > department somewhere that you have a hope of getting any help from?
> >
> > I can't say for sure, of course, but if MPI hello world hangs, it smells 
> > like a compiler bug.  You might want to attach to "hello world" in a 
> > debugger and see where it's hung.  You might need to compile OMPI with 
> > debugging symbols to get any meaningful information.
> >
> > ** NOTE: My personal feelings about the pathscale compiler suite do not 
> > reflect anyone else's feelings in the Open MPI community.  Perhaps someone 
> > could change my mind someday, but *I* have personally given up on this 
> > compiler.  :-(
> >
> >
> > On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:
> >
> >> Hello,
> >>
> >> It does work with version 1.4. This is the hello world that hangs with
> >> 1.4.1:
> >>
> >> #include <stdio.h>
> >> #include <mpi.h>
> >>
> >> int main(int argc, char **argv)
> >> {
> >>   int node, size;
> >>
> >>   MPI_Init(&argc, &argv);
> >>   MPI_Comm_rank(MPI_COMM_WORLD, &node);
> >>   MPI_Comm_size(MPI_COMM_WORLD, &size);
> >>
> >>   printf("Hello World from Node %d of %d.\n", node, size);
> >>
> >>   MPI_Finalize();
> >>   return 0;
> >> }
> >>
> >> On Tue, 26-01-2010 at 03:57 -0500, Åke Sandgren wrote:
> >>> 1 - Do you have problems with openmpi 1.4 too? (I don't, haven't built
> >>> 1.4.1 yet)
> >>> 2 - There is a bug in the pathscale compiler with -fPIC and -g that
> >>> generates incorrect dwarf2 data so debuggers get really confused and
> >>> will have BIG problems debugging the code. I'm chasing them to get a
> >>> fix...
> >>> 3 - Do you have an example code that has problems?
> >>
> >> --
> >> Rafael Arco Arredondo
> >> Centro de Servicios de Informática y Redes de Comunicaciones
> >> Universidad de Granada
> >>
> >>
> >
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> >
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> >
> >
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Mostyn Lewis

All,

FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn trunk) - actually 
looping -

from gdb:

opal_progress_event_users_decrement () at 
../.././opal/include/opal/sys/atomic_impl.h:61
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
Current language:  auto; currently asm
(gdb) where
#0  opal_progress_event_users_decrement () at 
../.././opal/include/opal/sys/atomic_impl.h:61
#1  0x0001 in ?? ()
#2  0x2aec4cf6a5e0 in ?? ()
#3  0x00eb in ?? ()
#4  0x2aec4cfb57e0 in ompi_mpi_init () at 
../.././ompi/runtime/ompi_mpi_init.c:818
#5  0x7fff5db3bd58 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) list
56  {
57 int32_t oldval;
58
59 do {
60    oldval = *addr;
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
62 return (oldval - delta);
63  }
64  #endif  /* OPAL_HAVE_ATOMIC_SUB_32 */
65
(gdb)
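
The function in that listing is the CAS retry loop OPAL builds its
atomic subtract from; reconstructed as a clean block (from the gdb
listing above; the name opal_atomic_sub_32 is inferred from the
OPAL_HAVE_ATOMIC_SUB_32 guard, so treat this as a sketch):

static inline int32_t opal_atomic_sub_32(volatile int32_t *addr, int delta)
{
    int32_t oldval;

    do {
        oldval = *addr;
    } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
    return (oldval - delta);
}

If the cmpset's asm understates its effects (e.g. no "memory" clobber),
the compiler gets extra freedom to cache and reorder memory accesses
around it, and a retry loop of exactly this shape is what then spins
forever - which matches the looping reported above.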

DM

On Tue, 9 Feb 2010, Jeff Squyres wrote:


FWIW, I have had terrible luck with the pathscale compiler over the years.  
Repeated attempts to get support from them -- even when I was a paying customer 
-- resulted in no help (e.g., a pathCC bug with the OMPI C++ bindings that I 
filed years ago was never resolved).

Is this compiler even supported anymore?  I.e., is there a support department 
somewhere that you have a hope of getting any help from?

I can't say for sure, of course, but if MPI hello world hangs, it smells like a compiler 
bug.  You might want to attach to "hello world" in a debugger and see where 
it's hung.  You might need to compile OMPI with debugging symbols to get any meaningful 
information.

** NOTE: My personal feelings about the pathscale compiler suite do not reflect 
anyone else's feelings in the Open MPI community.  Perhaps someone could change 
my mind someday, but *I* have personally given up on this compiler.  :-(


On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:


Hello,

It does work with version 1.4. This is the hello world that hangs with
1.4.1:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  int node, size;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &node);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  printf("Hello World from Node %d of %d.\n", node, size);

  MPI_Finalize();
  return 0;
}

On Tue, 26-01-2010 at 03:57 -0500, Åke Sandgren wrote:

1 - Do you have problems with openmpi 1.4 too? (I don't, haven't built
1.4.1 yet)
2 - There is a bug in the pathscale compiler with -fPIC and -g that
generates incorrect dwarf2 data so debuggers get really confused and
will have BIG problems debugging the code. I'm chasing them to get a
fix...
3 - Do you have an example code that has problems?


--
Rafael Arco Arredondo
Centro de Servicios de Informática y Redes de Comunicaciones
Universidad de Granada





--
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/





Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Ake Sandgren
On Tue, 2010-02-09 at 08:49 -0500, Jeff Squyres wrote:
> FWIW, I have had terrible luck with the pathscale compiler over the years.  
> Repeated attempts to get support from them -- even when I was a paying 
> customer -- resulted in no help (e.g., a pathCC bug with the OMPI C++ 
> bindings that I filed years ago was never resolved).
> 
> Is this compiler even supported anymore?  I.e., is there a support department 
> somewhere that you have a hope of getting any help from?
> 
> I can't say for sure, of course, but if MPI hello world hangs, it smells like 
> a compiler bug.  You might want to attach to "hello world" in a debugger and 
> see where it's hung.  You might need to compile OMPI with debugging symbols 
> to get any meaningful information.
> 
> ** NOTE: My personal feelings about the pathscale compiler suite do not 
> reflect anyone else's feelings in the Open MPI community.  Perhaps someone 
> could change my mind someday, but *I* have personally given up on this 
> compiler.  :-(

Pathscale is not dead; in fact I'm talking to them more or less daily at
the moment. They have been restructuring since the demise of SiCortex
last year. I hope they will be able to release a new version fairly
soon.

In my opinion (working mostly with Fortran codes, shudder) it is the
best compiler around, although they have had problems over the years in
coming out with fixes for bugs in a timely fashion.

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Jeff Squyres
FWIW, I have had terrible luck with the pathscale compiler over the years.  
Repeated attempts to get support from them -- even when I was a paying customer 
-- resulted in no help (e.g., a pathCC bug with the OMPI C++ bindings that I 
filed years ago was never resolved).

Is this compiler even supported anymore?  I.e., is there a support department 
somewhere that you have a hope of getting any help from?

I can't say for sure, of course, but if MPI hello world hangs, it smells like a 
compiler bug.  You might want to attach to "hello world" in a debugger and see 
where it's hung.  You might need to compile OMPI with debugging symbols to get 
any meaningful information.

** NOTE: My personal feelings about the pathscale compiler suite do not reflect 
anyone else's feelings in the Open MPI community.  Perhaps someone could change 
my mind someday, but *I* have personally given up on this compiler.  :-(


On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:

> Hello,
> 
> It does work with version 1.4. This is the hello world that hangs with
> 1.4.1:
> 
> #include <stdio.h>
> #include <mpi.h>
> 
> int main(int argc, char **argv)
> {
>   int node, size;
> 
>   MPI_Init(&argc, &argv);
>   MPI_Comm_rank(MPI_COMM_WORLD, &node);
>   MPI_Comm_size(MPI_COMM_WORLD, &size);
> 
>   printf("Hello World from Node %d of %d.\n", node, size);
> 
>   MPI_Finalize();
>   return 0;
> }
> 
> On Tue, 26-01-2010 at 03:57 -0500, Åke Sandgren wrote:
> > 1 - Do you have problems with openmpi 1.4 too? (I don't, haven't built
> > 1.4.1 yet)
> > 2 - There is a bug in the pathscale compiler with -fPIC and -g that
> > generates incorrect dwarf2 data so debuggers get really confused and
> > will have BIG problems debugging the code. I'm chasing them to get a
> > fix...
> > 3 - Do you have an example code that has problems?
> 
> --
> Rafael Arco Arredondo
> Centro de Servicios de Informática y Redes de Comunicaciones
> Universidad de Granada
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-08 Thread Rafael Arco Arredondo
Hello,

It does work with version 1.4. This is the hello world that hangs with
1.4.1:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  int node, size;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &node);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  printf("Hello World from Node %d of %d.\n", node, size);

  MPI_Finalize();
  return 0;
}

On Tue, 26-01-2010 at 03:57 -0500, Åke Sandgren wrote:
> 1 - Do you have problems with openmpi 1.4 too? (I don't, haven't built
> 1.4.1 yet)
> 2 - There is a bug in the pathscale compiler with -fPIC and -g that
> generates incorrect dwarf2 data so debuggers get really confused and
> will have BIG problems debugging the code. I'm chasing them to get a
> fix...
> 3 - Do you have an example code that has problems? 

-- 
Rafael Arco Arredondo
Centro de Servicios de Informática y Redes de Comunicaciones
Universidad de Granada



Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-01-25 Thread Åke Sandgren
1 - Do you have problems with openmpi 1.4 too? (I don't, haven't built
1.4.1 yet)
2 - There is a bug in the pathscale compiler with -fPIC and -g that
generates incorrect dwarf2 data so debuggers get really confused and
will have BIG problems debugging the code. I'm chasing them to get a
fix...
3 - Do you have an example code that has problems?

On Mon, 2010-01-25 at 15:01 -0500, Jeff Squyres wrote:
> I'm afraid I don't have any clues offhand.  We *have* had problems with the 
> Pathscale compiler in the past that were never resolved by their support 
> crew.  However, they were of the "variables weren't initialized and the 
> process generally aborts" kind of failure, not a "persistent hang" kind of 
> failure.
> 
> Can you tell where in MPI_Init the process is hanging?  E.g., can you build 
> Open MPI with debugging enabled (such as by passing CFLAGS=-g to OMPI's 
> configure line) and then attach a debugger to a hung process and see what 
> it's stuck on?
> 
> 
> On Jan 25, 2010, at 7:52 AM, Rafael Arco Arredondo wrote:
> 
> > Hello:
> > 
> > I'm having some issues with Open MPI 1.4.1 and Pathscale compiler
> > (version 3.2). Open MPI builds successfully with the following configure
> > arguments:
> > 
> > ./configure --with-openib=/usr --with-openib-libdir=/usr/lib64
> > --with-sge --enable-static CC=pathcc CXX=pathCC F77=pathf90 F90=pathf90
> > FC=pathf90
> > 
> > (we have OpenFabrics 1.2 Infiniband drivers, by the way)
> > 
> > However, applications hang on MPI_Init (or maybe MPI_Comm_rank or
> > MPI_Comm_size, a basic hello-world anyway doesn't print 'Hello World
> > from node...'). I tried running them with and without SGE. Same result.
> > 
> > This hello-world works flawlessly when I build Open MPI with gcc:
> > 
> > ./configure --with-openib=/usr --with-openib-libdir=/usr/lib64
> > --with-sge --enable-static
> > 
> > This successful execution runs in one machine only, so it shouldn't use
> > Infiniband, and it also works when several nodes are used.
> > 
> > I was able to build previous versions of Open MPI with Pathscale (1.2.6
> > and 1.3.2, particularly). I tried building version 1.4.1 both with
> > Pathscale 3.2 and Pathscale 3.1. No difference.
> > 
> > Any ideas?
> > 
> > Thank you in advance,
> > 
> > Rafa
> > 
> > --
> > Rafael Arco Arredondo
> > Centro de Servicios de Informática y Redes de Comunicaciones
> > Universidad de Granada
> > 
> > 
> 
> 



Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-01-25 Thread Jeff Squyres
I'm afraid I don't have any clues offhand.  We *have* had problems with the 
Pathscale compiler in the past that were never resolved by their support crew.  
However, they were of the "variables weren't initialized and the process 
generally aborts" kind of failure, not a "persistent hang" kind of failure.

Can you tell where in MPI_Init the process is hanging?  E.g., can you build 
Open MPI with debugging enabled (such as by passing CFLAGS=-g to OMPI's 
configure line) and then attach a debugger to a hung process and see what it's 
stuck on?


On Jan 25, 2010, at 7:52 AM, Rafael Arco Arredondo wrote:

> Hello:
> 
> I'm having some issues with Open MPI 1.4.1 and Pathscale compiler
> (version 3.2). Open MPI builds successfully with the following configure
> arguments:
> 
> ./configure --with-openib=/usr --with-openib-libdir=/usr/lib64
> --with-sge --enable-static CC=pathcc CXX=pathCC F77=pathf90 F90=pathf90
> FC=pathf90
> 
> (we have OpenFabrics 1.2 Infiniband drivers, by the way)
> 
> However, applications hang on MPI_Init (or maybe MPI_Comm_rank or
> MPI_Comm_size, a basic hello-world anyway doesn't print 'Hello World
> from node...'). I tried running them with and without SGE. Same result.
> 
> This hello-world works flawlessly when I build Open MPI with gcc:
> 
> ./configure --with-openib=/usr --with-openib-libdir=/usr/lib64
> --with-sge --enable-static
> 
> This successful execution runs in one machine only, so it shouldn't use
> Infiniband, and it also works when several nodes are used.
> 
> I was able to build previous versions of Open MPI with Pathscale (1.2.6
> and 1.3.2, particularly). I tried building version 1.4.1 both with
> Pathscale 3.2 and Pathscale 3.1. No difference.
> 
> Any ideas?
> 
> Thank you in advance,
> 
> Rafa
> 
> --
> Rafael Arco Arredondo
> Centro de Servicios de Informática y Redes de Comunicaciones
> Universidad de Granada
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com