Re: [OMPI users] Datatype construction, serious limitation (was: Signal: Segmentation fault (11) Problem)

2007-04-19 Thread Michael Gauckler (mailing lists)
Hi George, Thank you for the prompt reply. Indeed, we are constructing a data-type description with more than 32k entries. I attached a screenshot of the pData structure (displayed with the TotalView debugger); I hope this helps you. Unfortunately I was not able to use gdb to execute the call you

Re: [OMPI users] How to force OpenMPI to use specific interconnect

2007-04-19 Thread stephen mulcahy
Jeff Squyres wrote: That's truly odd -- I can't imagine why you wouldn't get the TCP transport with the above command line. But the latencies, as you mentioned, are far too low for TCP. To be absolutely certain that you're not getting the IB transport, go to the $prefix/lib/openmpi direct

Re: [OMPI users] How to force OpenMPI to use specific interconnect

2007-04-19 Thread Jeff Squyres
Yes, this is sounding more mysterious. Please send the output listed here: http://www.open-mpi.org/community/help/ On Apr 19, 2007, at 8:15 AM, stephen mulcahy wrote: Jeff Squyres wrote: That's truly odd -- I can't imagine why you wouldn't get the TCP transport with the above command

[OMPI users] new installation problem

2007-04-19 Thread Babu Bhai
Hi, I have migrated from LAM/MPI to OpenMPI. I am not able to execute a simple MPI program in which the master sends an integer to the slave. If I execute the code on a single machine, i.e. start 2 instances on the same machine (mpirun -np 2 hello), it works fine. If I execute it on the cluster using mpirun --pr

[OMPI users] peruse MSG_ARRIVED events lost

2007-04-19 Thread Harald Servat
Hello, I'm interested in gathering MSG_ARRIVED events through the PERUSE API offered by OpenMPI 1.2. I've written a small MPI C program that performs some communication, and although I receive some MSG_ARRIVED events, I'm losing some e

Re: [OMPI users] new installation problem

2007-04-19 Thread Jeff Squyres
I need to make that error string be google-able -- I'll add it to the faq. :-) The problem is likely that you have multiple IP addresses, some of which are not routable to each other (but fail OMPI's routability assumptions). Check out these FAQ entries: http://www.open-mpi.org/faq/?cat

Re: [OMPI users] peruse MSG_ARRIVED events lost

2007-04-19 Thread George Bosilca
Harald, I checked the PERUSE code which generates the MSG_ARRIVED event. There seems to be no way to miss one of these events if the following conditions are met: - the communicator where the message arrives has the MSG_ARRIVED event attached - this event is active. If you can provi

Re: [OMPI users] Datatype construction, serious limitation (was: Signal: Segmentation fault (11) Problem)

2007-04-19 Thread George Bosilca
Michael, Based on the image you sent, your data-type looks gigantic. There are 750K predefined type descriptions in your data-type, for a size of 12MB and an extent of 68MB. The data-type engine managed to optimize your description down to 540K predefined type descriptions. Which is still w

Re: [OMPI users] How to force OpenMPI to use specific interconnect

2007-04-19 Thread stephen mulcahy
Hi, I only have access to this test system for another 24 hours or so, so I'm not sure it's worth any more of your efforts. Coupled with the fact that I don't have root on the system in question, it could be more work to figure out what's going on than it's worth. Thanks for your help so far,

Re: [OMPI users] How to force OpenMPI to use specific interconnect

2007-04-19 Thread Jeff Squyres
Sorry we couldn't figure it out -- let us know if you resume your Open MPI testing. On Apr 19, 2007, at 6:24 PM, stephen mulcahy wrote: Hi, I only have access to this test system for another 24 hours or so, so I'm not sure it's worth any more of your efforts. Coupled with the fact that

Re: [OMPI users] new installation problem

2007-04-19 Thread Babu Bhai
Hi, I have already seen this FAQ. The nodes in the cluster do not have multiple IP addresses. One thing I forgot to mention is that the systems in the cluster do not have static IPs and get their IP addresses through DHCP. Also, if there is a print statement (printf("hello world\n"); ) in the slave, it is correctly