Re: [OMPI devel] PML selection logic

2008-06-29 Thread Lenny Verkhovsky
We can also make a few different paramfiles for typical setups (large cluster / minimum latency / maximum bandwidth, etc.). The desired paramfile can be chosen by a configure flag and placed in $prefix/etc/openmpi-mca-params.conf

Re: [OMPI devel] PML selection logic

2008-06-28 Thread Jeff Squyres
Agreed. I have a few ideas in this direction as well (random thoughts that might as well be transcribed somewhere): - some kind of configure --enable-large-system (whatever) option is a Good Thing - it would be good if the configure option simply set [MCA parameter?] defaults wherever

Re: [OMPI devel] PML selection logic

2008-06-26 Thread Ralph H Castain
Just to complete this thread... Brian raised a very good point, so we identified it on the weekly telecon as a subject that really should be discussed at next week's technical meeting. I think we can find a reasonable answer, but there are several ways it can be done. So rather than doing our

Re: [OMPI devel] PML selection logic

2008-06-24 Thread Ralph H Castain
It is a good point. What I have prototyped would still handle it - basically, it checks to see if any data has been published and does a modex if so. So if one side does send modex data, the other side will faithfully decode it. I think the bigger issue will be if both sides don't, and they don't
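The "decode only if something was published" behaviour described in this message can be sketched as below. This is an illustrative Python model, not the prototype itself: the function name `maybe_decode_modex` and the dictionary standing in for published modex data are hypothetical.

```python
def maybe_decode_modex(published):
    """Decode peer data only when something was actually published.

    `published` is a hypothetical stand-in for modex data: a dict mapping
    a peer's rank to the raw bytes it published. An empty dict means that
    no process published anything.
    """
    if not published:
        # Nothing was published, so the modex step is skipped entirely.
        return None
    # At least one side published data: decode whatever is there.
    return {rank: raw.decode("ascii") for rank, raw in published.items()}


if __name__ == "__main__":
    print(maybe_decode_modex({}))
    print(maybe_decode_modex({1: b"cm"}))
```

The sketch covers only the case the message calls safe (one side publishes, the other decodes); it does not model the case where neither side publishes.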

Re: [OMPI devel] PML selection logic

2008-06-24 Thread George Bosilca
Brian hinted at a possible bug in one of his replies. How does this work in the case of dynamic processes? We can envision several scenarios, but let's take a simple one: 2 jobs that get connected with connect/accept. One might publish the PML name (simply because the -mca argument was on) and one

Re: [OMPI devel] PML selection logic

2008-06-24 Thread Jeff Squyres
Also sounds good to me. Note that the most difficult part of the forward-looking plan is that we usually can't tell the difference between "something failed to initialize" and "you don't have support for feature X". I like the general philosophy of: running out of the box always works

Re: [OMPI devel] PML selection logic

2008-06-23 Thread Ralph H Castain
Okay, so let's explore an alternative that preserves the support you are seeking for the "ignorant user", but doesn't penalize everyone else. What we could do is simply set things up so that: 1. if -mca pml xyz is provided, then no modex data is added 2. if it is not provided, then only rank=0

Re: [OMPI devel] PML selection logic

2008-06-23 Thread Brian W. Barrett
The problem is that we default to OB1, but that's not the right choice for some platforms (like Pathscale / PSM), where there's a huge performance hit for using OB1. So we run into a situation where a user installs Open MPI, starts running, gets horrible performance, bad-mouths Open MPI, and

Re: [OMPI devel] PML selection logic

2008-06-23 Thread Ralph H Castain
My fault - I should be more precise in my language. ;-/ #1 is not adequate, IMHO, as it forces us to -always- do a modex. It seems to me that a simpler solution to what you describe is for the user to specify -mca pml ob1, or -mca pml cm. If the latter, then you could deal with the

Re: [OMPI devel] PML selection logic

2008-06-23 Thread Brian W. Barrett
The selection code was added because high-speed interconnects frequently fail to initialize properly due to random stuff happening (yes, that's a horrible statement, but true). We ran into a situation with some really flaky machines where most of the processes would choose CM, but a couple

Re: [OMPI devel] PML selection logic

2008-06-23 Thread Aurélien Bouteiller
The first approach sounds fair enough to me. We should avoid 2 and 3 as the pml selection mechanism used to be more complex before we reduced it to accommodate a major design bug in the BTL selection process. When using the complete PML selection, BTL would be initialized several times,

[OMPI devel] PML selection logic

2008-06-23 Thread Ralph H Castain
Yo all, I've been doing further research into the modex and came across something I don't fully understand. It seems we have each process insert into the modex the name of the PML module that it selected. Once the modex has exchanged that info, it then loops across all procs in the job to check
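The publish-then-compare pattern described here can be sketched roughly as follows. This is an illustrative Python model, not Open MPI's code. It assumes the per-process check is that every peer selected the same PML, which the truncated text does not state outright; the function name `check_pml_agreement` and the dictionary standing in for exchanged modex data are hypothetical.

```python
def check_pml_agreement(local_pml, modex_pml_names):
    """Return the ranks whose published PML name differs from ours.

    `modex_pml_names` is a hypothetical stand-in for the exchanged modex
    data: a dict mapping each rank in the job to the PML name it published.
    Assumption: the check is a plain name comparison against the local choice.
    """
    mismatched = []
    # Loop across all procs in the job, in rank order.
    for rank, name in sorted(modex_pml_names.items()):
        if name != local_pml:
            mismatched.append(rank)
    return mismatched


if __name__ == "__main__":
    published = {0: "ob1", 1: "ob1", 2: "cm", 3: "ob1"}
    print(check_pml_agreement("ob1", published))
```

Note that every process performs the same loop over all peers, so the cost grows with job size; that is the overhead the rest of the thread debates.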