Re: [OMPI devel] application hangs with multiple dup

2009-09-23 Thread Chris Samuel
Hi Terry,
"Terry Dontje" wrote:
> It's actually in the 1.3 branch now and has been
> verified to solve the hanging issues of several members.
Great, I'll get them to try a snapshot build!
cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager

Re: [OMPI devel] application hangs with multiple dup

2009-09-23 Thread Terry Dontje
Chris Samuel wrote:
> Hi Edgar,
> "Edgar Gabriel" wrote:
>> it will be available in 1.3.4...
> That's great, thanks so much!
> cheers,
> Chris
It's actually in the 1.3 branch now and has been verified to solve the hanging issues of several members.
--td

Re: [OMPI devel] application hangs with multiple dup

2009-09-23 Thread Chris Samuel
Hi Edgar,
"Edgar Gabriel" wrote:
> it will be available in 1.3.4...
That's great, thanks so much!
cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053,

Re: [OMPI devel] application hangs with multiple dup

2009-09-22 Thread Edgar Gabriel
it will be available in 1.3.4...
Thanks
Edgar
Chris Samuel wrote:
> Hi Edgar,
> "Edgar Gabriel" wrote:
>> just wanted to give a heads-up that I *think* I know what the problem is. I should have a fix (with a description) either later today or tomorrow morning...
> I see

Re: [OMPI devel] application hangs with multiple dup

2009-09-22 Thread Chris Samuel
Hi Edgar,
"Edgar Gabriel" wrote:
> just wanted to give a heads-up that I *think* I know what the problem
> is. I should have a fix (with a description) either later today or
> tomorrow morning...
I see that changeset 21970 is on trunk to fix this issue, is that

Re: [OMPI devel] application hangs with multiple dup

2009-09-16 Thread Edgar Gabriel
just wanted to give a heads-up that I *think* I know what the problem is. I should have a fix (with a description) either later today or tomorrow morning...
Thanks
Edgar
Edgar Gabriel wrote:
> so I can confirm that I can reproduce the hang, and we (George, Rainer and I) have looked into that

Re: [OMPI devel] application hangs with multiple dup

2009-09-16 Thread Edgar Gabriel
there is a ticket on that topic already (#2009), and I just added some comments to that...
Jeff Squyres wrote:
> On Sep 10, 2009, at 7:12 PM, Edgar Gabriel wrote:
>> so I can confirm that I can reproduce the hang, and we (George, Rainer and I) have looked into that and are continuing to dig. I

Re: [OMPI devel] application hangs with multiple dup

2009-09-15 Thread Jeff Squyres
On Sep 10, 2009, at 7:12 PM, Edgar Gabriel wrote:
> so I can confirm that I can reproduce the hang, and we (George, Rainer and I) have looked into that and are continuing to dig. I hate to say that, but it looked to us as if messages were 'lost' (sender clearly called send but the data is

Re: [OMPI devel] application hangs with multiple dup

2009-09-15 Thread Thomas Ropars
Hi,
Any news about that bug?
Thomas
Edgar Gabriel wrote:
> so I can confirm that I can reproduce the hang, and we (George, Rainer and I) have looked into that and are continuing to dig. I hate to say that, but it looked to us as if messages were 'lost' (sender clearly called send but

Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Edgar Gabriel
so I can confirm that I can reproduce the hang, and we (George, Rainer and I) have looked into that and are continuing to dig. I hate to say that, but it looked to us as if messages were 'lost' (sender clearly called send but the data is not in any of the queues on the receiver side),

Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Thomas Ropars
Edgar Gabriel wrote:
> Two short questions: do you have any Open MPI MCA parameters set in a file or at runtime?
No
> And second, is there any difference if you disable the hierarch coll module (which does some additional communication as well)? E.g.
> mpirun --mca coll ^hierarch -np 4 ./mytest
No,

Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Edgar Gabriel
Two short questions: do you have any Open MPI MCA parameters set in a file or at runtime? And second, is there any difference if you disable the hierarch coll module (which does some additional communication as well)? E.g.
mpirun --mca coll ^hierarch -np 4 ./mytest
Thanks
Edgar
Thomas Ropars
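
[Editor's note on the first question above] Besides the mpirun command line, Open MPI can pick up MCA parameters from a per-user file, by default $HOME/.openmpi/mca-params.conf, and from environment variables named OMPI_MCA_<parameter>. A minimal sketch of what a file entry might look like, assuming the same coll exclusion as in the mpirun example; this file content is illustrative only and was not posted in the thread:

```
# $HOME/.openmpi/mca-params.conf (per-user MCA parameter file)
# Hypothetical entry: exclude the hierarch component from the coll
# framework for every run, equivalent to "--mca coll ^hierarch".
coll = ^hierarch
```

Setting OMPI_MCA_coll to ^hierarch in the environment before calling mpirun should have the same effect. Either form would count as a parameter "set in a file or at runtime" for the purposes of the question.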

Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Thomas Ropars
Ashley Pittman wrote:
> On Wed, 2009-09-09 at 17:44 +0200, Thomas Ropars wrote:
> Thank you. I think you missed the top three lines of the output but that doesn't matter.
>> main() at ?:?
>> PMPI_Comm_dup() at pcomm_dup.c:62
>> ompi_comm_dup() at communicator/comm.c:661
>> -

Re: [OMPI devel] application hangs with multiple dup

2009-09-09 Thread Ashley Pittman
On Wed, 2009-09-09 at 17:44 +0200, Thomas Ropars wrote:
Thank you. I think you missed the top three lines of the output but that doesn't matter.
> main() at ?:?
> PMPI_Comm_dup() at pcomm_dup.c:62
> ompi_comm_dup() at communicator/comm.c:661
> -
> [0,2] (2

Re: [OMPI devel] application hangs with multiple dup

2009-09-09 Thread Thomas Ropars
Ashley Pittman wrote:
> On Tue, 2009-09-08 at 15:00 +0200, Thomas Ropars wrote:
>> Hi,
>> I'm working on r21949 of the trunk. When I run this simple program, which calls MPI_Comm_dup twice, on a single node with 4 processes, the processes hang from time to time in the 2nd dup. I can't