Re: [OMPI devel] Ssh tunnelling broken in trunk?

2008-04-02 Thread Ralph Castain
Hmmm...something isn't making sense. Can I see the command line you used to
generate this?

I'll tell you why I'm puzzled. If orte_debug_flag is set, then the
"--daemonize" should NOT be there, and you should see "--debug" on that
command line. What I see is the reverse, which implies to me that
orte_debug_flag is NOT being set to "true".

When I tested here and on odin, though, I found that the -d option correctly
set the flag and everything works just fine.

So there is something in your environment or setup that is messing up that
orte_debug_flag. I have no idea what it could be - the command line should
override anything in your environment, but you could check. Otherwise, if
this diagnostic output came from a command line that included -d or
--debug-devel, or had OMPI_MCA_orte_debug=1 in the environment, then I am at
a loss - everywhere I've tried it, it works fine.

Ralph



On 4/2/08 5:41 PM, "Jon Mason"  wrote:

> On Wednesday 02 April 2008 05:04:47 pm Ralph Castain wrote:
>> Here's a real simple diagnostic you can do: set -mca plm_base_verbose 1 and
>> look at the cmd line being executed (send it here). It will look like:
>> 
>> [[xxx,1],0] plm:rsh: executing: jjkljks;jldfsaj;
>> 
>> If the cmd line has --daemonize on it, then the ssh will close and xterm
>> won't work.
> 
> [vic20:01863] [[40388,0],0] plm:rsh: executing: (//usr/bin/ssh) [/usr/bin/ssh
> vic12 orted --daemonize -mca ess env -mca orte_ess_jobid 2646867968 -mca
> orte_ess_vpid 1 -mca orte_ess_num_procs
> 2 --hnp-uri 
> "2646867968.0;tcp://192.168.70.150:39057;tcp://10.10.0.150:39057;tcp://86.75.30.10:39057" --nodename
> vic12 -mca btl openib,self --mca btl_openib_receive_queues
> P,65536,256,128,128 -mca plm_base_verbose 1 -mca
> mca_base_param_file_path
> /usr/mpi/gcc/ompi-trunk/share/openmpi/amca-param-sets:/root -mca
> mca_base_param_file_path_force /root]
> 
> 
> It looks like what you say is happening.  Is this configured somewhere, so
> that I can remove it?
> 
> Thanks,
> Jon
> 
>> Ralph
>> 
>> On 4/2/08 3:14 PM, "Jeff Squyres"  wrote:
>>> Can you diagnose a little further:
>>> 
>>> 1. in the case where it works, can you verify that the ssh to launch
>>> the orteds is still running?
>>> 
>>> 2. in the case where it doesn't work, can you verify that the ssh to
>>> launch the orteds has actually died?
>>> 
>>> On Apr 2, 2008, at 4:58 PM, Jon Mason wrote:
 On Wednesday 02 April 2008 01:21:31 pm Jon Mason wrote:
> On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote:
>> I remember that someone had found a bug that caused
>> orte_debug_flag to not
>> get properly set (local var covering over a global one) - could be
>> that
>> your tmp-public branch doesn't have that patch in it.
>> 
>> You might try updating to the latest trunk
> 
> I updated my ompi-trunk tree, did a clean build, and I still see
> the same
> problem.  I regressed trunk to rev 17589 and everything works as I
> expect.
> So I think the problem is still there in the top of trunk.
 
 I stepped through the revs of trunk and found the first failing rev
 to be
 17632.  It's a big patch, so I'll defer to those more in the know to
 determine
 what is breaking in there.
 
> I don't discount user error, but I don't think I am doing anything
> different.
> Did some setting change that perhaps I did not modify?
> 
> Thanks,
> Jon
> 
>> On 4/2/08 10:41 AM, "George Bosilca"  wrote:
>>> I'm using this feature on the trunk with the version from
>>> yesterday.
>>> It works without problems ...
>>> 
>>>   george.
>>> 
>>> On Apr 2, 2008, at 12:14 PM, Jon Mason wrote:
 On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote:
> Are these r numbers relevant on the /tmp-public branch, or the
> trunk?
 
 I pulled it out of the command used to update the branch, which
 was:
 svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk .
 
 In the cpc tmp branch, it happened at r17920.
 
 Thanks,
 Jon
 
> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
>> I regressed my tree and it looks like it happened between
>> 17590:17917
>> 
>> On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
>>> I am noticing that ssh seems to be broken on trunk (and my cpc
>>> branch, as
>>> it is based on trunk).  When I try to use xterm and gdb to
>>> debug, I
>>> only
>>> successfully get 1 xterm.  I have tried this on 2 different
>>> setups.  I can
>>> successfully get the xterms on the 1.2 svn branch.
>>> 
>>> I am running the following command:
>>> mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
>>> gdb 

[OMPI devel] Mercurial demo OMPI repository

2008-04-02 Thread Jeff Squyres
Thanks to the sysadmins at IU, I put up a sample Mercurial OMPI  
repository here:


http://www.open-mpi.org/hg/hgwebdir.cgi/

I converted the entire SVN ompi repository history (/trunk, /tags,  
and /branches only) as of r17921.  Note that it shows some commits on  
the 0.9 branch as the most recent activity only because it converts  
the branches in reverse order -- the entire trunk is there as of r17921.


You can clone this repository with the following:

hg clone 
http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/hg/ompi-svn-conversion-r17921/

Enjoy.

--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Ssh tunnelling broken in trunk?

2008-04-02 Thread Jon Mason
On Wednesday 02 April 2008 01:21:31 pm Jon Mason wrote:
> On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote:
> > I remember that someone had found a bug that caused orte_debug_flag to not
> > get properly set (local var covering over a global one) - could be that
> > your tmp-public branch doesn't have that patch in it.
> >
> > You might try updating to the latest trunk
> 
> I updated my ompi-trunk tree, did a clean build, and I still see the same 
> problem.  I regressed trunk to rev 17589 and everything works as I expect.  
> So I think the problem is still there in the top of trunk.


I stepped through the revs of trunk and found the first failing rev to be
17632.  It's a big patch, so I'll defer to those more in the know to determine
what is breaking in there.


> I don't discount user error, but I don't think I am doing anything different.  
> Did some setting change that perhaps I did not modify?
> 
> Thanks,
> Jon
> 
> > On 4/2/08 10:41 AM, "George Bosilca"  wrote:
> > > I'm using this feature on the trunk with the version from yesterday.
> > > It works without problems ...
> > >
> > >george.
> > >
> > > On Apr 2, 2008, at 12:14 PM, Jon Mason wrote:
> > >> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote:
> > >>> Are these r numbers relevant on the /tmp-public branch, or the trunk?
> > >>
> > >> I pulled it out of the command used to update the branch, which was:
> > >> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk .
> > >>
> > >> In the cpc tmp branch, it happened at r17920.
> > >>
> > >> Thanks,
> > >> Jon
> > >>
> > >>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
> >  I regressed my tree and it looks like it happened between
> >  17590:17917
> > 
> >  On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
> > > I am noticing that ssh seems to be broken on trunk (and my cpc
> > > branch, as
> > > it is based on trunk).  When I try to use xterm and gdb to debug, I
> > > only
> > > successfully get 1 xterm.  I have tried this on 2 different
> > > setups.  I can
> > > successfully get the xterms on the 1.2 svn branch.
> > >
> > > I am running the following command:
> > > mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
> > > gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1
> > >
> > > Is anyone else seeing this problem?
> > >
> > > Thanks,
> > > Jon
> > > ___
> > > devel mailing list
> > > de...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] [PATCH] Fix compilation error without XRC

2008-04-02 Thread Jeff Squyres

Thanks; applied https://svn.open-mpi.org/trac/ompi/changeset/18076.


On Apr 2, 2008, at 8:21 AM, Bernhard Fischer wrote:

Hi,

* ompi/mca/btl/openib/btl_openib_component.c (init_one_hca):
mca_btl_openib_open_xrc_domain and
mca_btl_openib_close_xrc_domain depend on XRC

Fixes the compilation failure shown at the head of the attached patch.
TIA,
Bernhard
[attachment: 01.diff]



--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] RFC: changes to modex

2008-04-02 Thread Jeff Squyres

On Apr 2, 2008, at 1:58 PM, Gleb Natapov wrote:

No, I think it would be fine to only send the output after
btl_openib_if_in|exclude is applied.  Perhaps we need an MCA param to
say "always send everything" in the case that someone applies a non-
homogeneous if_in|exclude set of values...?

When is carto stuff applied?  Is that what you're really asking  
about?



There is no difference between carto and include/exclude.


You mean in terms of when they are applied?


I can specify
different openib_if_include values for different procs on the same  
host.



I know you *can*, but it is certainly uncommon.  The common case is  
that it's the same for all procs on all hosts.  I guess there are a few
cases:


1. homogeneous include/exclude, no carto: send all in node info; no  
proc info
2. homogeneous include/exclude, carto is used: send all ports in node  
info; send index in proc info for which node info port index it will use
3. heterogeneous include/exclude, no carto: need user to tell us that  
this situation exists (e.g., use another MCA param), but then is same  
as #2

4. heterogeneous include/exclude, carto is used, same as #3

Right?

--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [PATCH] Fix typo in configure helptext

2008-04-02 Thread Jeff Squyres
Thanks!  We have a general rule to not apply autogen-worthy changes  
during the US workday, so I'll commit this tonight.


On Apr 2, 2008, at 8:20 AM, Bernhard Fischer wrote:

Hi,

* config/ompi_configure_options.m4: Fix typo in helptext

Please apply.
TIA,
Bernhard
[attachment: connectx.diff]



--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Ssh tunnelling broken in trunk?

2008-04-02 Thread Jon Mason
On Wednesday 02 April 2008 11:54:50 am Ralph H Castain wrote:
> I remember that someone had found a bug that caused orte_debug_flag to not
> get properly set (local var covering over a global one) - could be that
> your tmp-public branch doesn't have that patch in it.
>
> You might try updating to the latest trunk

I updated my ompi-trunk tree, did a clean build, and I still see the same 
problem.  I regressed trunk to rev 17589 and everything works as I expect.  
So I think the problem is still there in the top of trunk.

I don't discount user error, but I don't think I am doing anything different.  
Did some setting change that perhaps I did not modify?

Thanks,
Jon

> On 4/2/08 10:41 AM, "George Bosilca"  wrote:
> > I'm using this feature on the trunk with the version from yesterday.
> > It works without problems ...
> >
> >george.
> >
> > On Apr 2, 2008, at 12:14 PM, Jon Mason wrote:
> >> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote:
> >>> Are these r numbers relevant on the /tmp-public branch, or the trunk?
> >>
> >> I pulled it out of the command used to update the branch, which was:
> >> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk .
> >>
> >> In the cpc tmp branch, it happened at r17920.
> >>
> >> Thanks,
> >> Jon
> >>
> >>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
>  I regressed my tree and it looks like it happened between
>  17590:17917
> 
>  On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
> > I am noticing that ssh seems to be broken on trunk (and my cpc
> > branch, as
> > it is based on trunk).  When I try to use xterm and gdb to debug, I
> > only
> > successfully get 1 xterm.  I have tried this on 2 different
> > setups.  I can
> > > successfully get the xterms on the 1.2 svn branch.
> >
> > I am running the following command:
> > mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
> > gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1
> >
> > Is anyone else seeing this problem?
> >
> > Thanks,
> > Jon




Re: [OMPI devel] RFC: changes to modex

2008-04-02 Thread Gleb Natapov
On Wed, Apr 02, 2008 at 12:08:47PM -0400, Jeff Squyres wrote:
> On Apr 2, 2008, at 11:13 AM, Gleb Natapov wrote:
> > On Wed, Apr 02, 2008 at 10:35:03AM -0400, Jeff Squyres wrote:
> >> If we use carto to limit which hcas/ports are used on a given host on a
> >> per-proc basis, then we can include some proc_send data to say "this proc
> >> only uses indexes X,Y,Z from the node data".  The indexes can be
> >> either uint8_ts, or maybe even a variable length bitmap.
> >>
> > So you propose that each proc will send info (using node_send())  
> > about every
> > hca/proc on a host even about those that are excluded from use by  
> > the proc
> > just in case? And then each proc will have to send additional info  
> > (using
> > proc_send() this time) to indicate what hcas/ports it is actually  
> > using?
> 
> 
> No, I think it would be fine to only send the output after  
> btl_openib_if_in|exclude is applied.  Perhaps we need an MCA param to  
> say "always send everything" in the case that someone applies a non- 
> homogeneous if_in|exclude set of values...?
> 
> When is carto stuff applied?  Is that what you're really asking about?
> 
There is no difference between carto and include/exclude. I can specify
different openib_if_include values for different procs on the same host.

--
Gleb.


Re: [OMPI devel] Ssh tunnelling broken in trunk?

2008-04-02 Thread Ralph H Castain
I remember that someone had found a bug that caused orte_debug_flag to not
get properly set (local var covering over a global one) - could be that your
tmp-public branch doesn't have that patch in it.

You might try updating to the latest trunk


On 4/2/08 10:41 AM, "George Bosilca"  wrote:

> I'm using this feature on the trunk with the version from yesterday.
> It works without problems ...
> 
>george.
> 
> On Apr 2, 2008, at 12:14 PM, Jon Mason wrote:
>> On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote:
>>> Are these r numbers relevant on the /tmp-public branch, or the trunk?
>> 
>> I pulled it out of the command used to update the branch, which was:
>> svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk .
>> 
>> In the cpc tmp branch, it happened at r17920.
>> 
>> Thanks,
>> Jon
>> 
>>> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
 I regressed my tree and it looks like it happened between
 17590:17917
 
 On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
> I am noticing that ssh seems to be broken on trunk (and my cpc
> branch, as
> it is based on trunk).  When I try to use xterm and gdb to debug, I
> only
> successfully get 1 xterm.  I have tried this on 2 different
> setups.  I can
> successfully get the xterms on the 1.2 svn branch.
> 
> I am running the following command:
> mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
> gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1
> 
> Is anyone else seeing this problem?
> 
> Thanks,
> Jon




Re: [OMPI devel] Ssh tunnelling broken in trunk?

2008-04-02 Thread George Bosilca
I'm using this feature on the trunk with the version from yesterday.  
It works without problems ...


  george.

On Apr 2, 2008, at 12:14 PM, Jon Mason wrote:

On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote:

Are these r numbers relevant on the /tmp-public branch, or the trunk?


I pulled it out of the command used to update the branch, which was:
svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk .

In the cpc tmp branch, it happened at r17920.

Thanks,
Jon


On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
I regressed my tree and it looks like it happened between  
17590:17917


On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:

I am noticing that ssh seems to be broken on trunk (and my cpc
branch, as
it is based on trunk).  When I try to use xterm and gdb to debug, I
only
successfully get 1 xterm.  I have tried this on 2 different
setups.  I can
successfully get the xterms on the 1.2 svn branch.

I am running the following command:
mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1

Is anyone else seeing this problem?

Thanks,
Jon






Re: [OMPI devel] Ssh tunnelling broken in trunk?

2008-04-02 Thread Jon Mason
On Wednesday 02 April 2008 11:07:18 am Jeff Squyres wrote:
> Are these r numbers relevant on the /tmp-public branch, or the trunk?

I pulled it out of the command used to update the branch, which was:
svn merge -r 17590:17917 https://svn.open-mpi.org/svn/ompi/trunk .

In the cpc tmp branch, it happened at r17920.

Thanks,
Jon

> On Apr 2, 2008, at 11:59 AM, Jon Mason wrote:
> > I regressed my tree and it looks like it happened between 17590:17917
> >
> > On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
> >> I am noticing that ssh seems to be broken on trunk (and my cpc
> >> branch, as
> >> it is based on trunk).  When I try to use xterm and gdb to debug, I
> >> only
> >> successfully get 1 xterm.  I have tried this on 2 different
> >> setups.  I can
> >> successfully get the xterms on the 1.2 svn branch.
> >>
> >> I am running the following command:
> >> mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
> >> gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1
> >>
> >> Is anyone else seeing this problem?
> >>
> >> Thanks,
> >> Jon




Re: [OMPI devel] RFC: changes to modex

2008-04-02 Thread Jeff Squyres

On Apr 2, 2008, at 11:13 AM, Gleb Natapov wrote:

On Wed, Apr 02, 2008 at 10:35:03AM -0400, Jeff Squyres wrote:
If we use carto to limit which hcas/ports are used on a given host on a
per-proc basis, then we can include some proc_send data to say "this proc
only uses indexes X,Y,Z from the node data".  The indexes can be
either uint8_ts, or maybe even a variable length bitmap.

So you propose that each proc will send info (using node_send())  
about every
hca/proc on a host even about those that are excluded from use by  
the proc
just in case? And then each proc will have to send additional info  
(using
proc_send() this time) to indicate what hcas/ports it is actually  
using?



No, I think it would be fine to only send the output after  
btl_openib_if_in|exclude is applied.  Perhaps we need an MCA param to  
say "always send everything" in the case that someone applies a non- 
homogeneous if_in|exclude set of values...?


When is carto stuff applied?  Is that what you're really asking about?

--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Ssh tunnelling broken in trunk?

2008-04-02 Thread Jon Mason
I regressed my tree and it looks like it happened between 17590:17917

On Wednesday 02 April 2008 10:22:52 am Jon Mason wrote:
> I am noticing that ssh seems to be broken on trunk (and my cpc branch, as
> it is based on trunk).  When I try to use xterm and gdb to debug, I only
> successfully get 1 xterm.  I have tried this on 2 different setups.  I can
> successfully get the xterms on the 1.2 svn branch.
>
> I am running the following command:
> mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e
> gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1
>
> Is anyone else seeing this problem?
>
> Thanks,
> Jon




[OMPI devel] Ssh tunnelling broken in trunk?

2008-04-02 Thread Jon Mason
I am noticing that ssh seems to be broken on trunk (and my cpc branch, as it 
is based on trunk).  When I try to use xterm and gdb to debug, I only 
successfully get 1 xterm.  I have tried this on 2 different setups.  I can 
successfully get the xterms on the 1.2 svn branch.  

I am running the following command:
mpirun --n 2 --host vic12,vic20 -mca btl tcp,self -d xterm -e 
gdb /usr/mpi/gcc/openmpi-1.2-svn/tests/IMB-3.0/IMB-MPI1

Is anyone else seeing this problem?

Thanks,
Jon


Re: [OMPI devel] RFC: changes to modex

2008-04-02 Thread Ralph H Castain



On 4/2/08 8:52 AM, "Terry Dontje"  wrote:

> Jeff Squyres wrote:
>> WHAT: Changes to MPI layer modex API
>> 
>> WHY: To be mo' betta scalable
>> 
>> WHERE: ompi/mpi/runtime/ompi_module_exchange.* and everywhere that
>> calls ompi_modex_send() and/or ompi_modex_recv()
>> 
>> TIMEOUT: COB Fri 4 Apr 2008
>> 
>> DESCRIPTION:
>> 
>>   
> [...snip...]
>>   * int ompi_modex_node_send(...): send modex data that is relevant
>> for all processes in this job on this node.  It is intended that only
>> one process in a job on a node will call this function.  If more than
>> one process in a job on a node calls _node_send(), then only one will
>> "win" (meaning that the data sent by the others will be overwritten).
>> 
>>   
>>   * int ompi_modex_node_recv(...): receive modex data that is relevant
>> for a whole peer node; receive the ["winning"] blob sent by
>> _node_send() from the source node.  We haven't yet decided what the
>> node index will be; it may be (ompi_proc_t*) (i.e., _node_recv() would
>> figure out what node the (ompi_proc_t*) resides on and then give you
>> the data).
>> 
>>   
> The above sounds like there could be race conditions if more than one
> process on a node is doing
> ompi_modex_node_send.  That is are you really going to be able to be
> assured when ompi_modex_node_recv
> is done that one of the processes is not in the middle of doing
> ompi_modex_node_send?  I assume
> there must be some sort of gate that allows you to make sure no one is
> in the middle of overwriting your data.

The nature of the modex actually precludes this. The modex is implemented as
a barrier, so the timing actually looks like this:

1. each proc registers its modex_node[proc]_send calls early in MPI_Init.
All this does is collect the data locally in a buffer

2. each proc hits the orte_grpcomm.modex call in MPI_Init. At this point,
the collected data is sent to the local daemon. The proc "barriers" at this
point and can go no further until the modex is completed.

3. when the daemon detects that all local procs have sent it a modex buffer,
it enters an "allgather" operation across all daemons. When that operation
completes, each daemon has a complete modex buffer spanning the job.

4. each daemon "drops" the collected buffer into each local proc

5. each proc, upon receiving the modex buffer, decodes it and sets up its
data structs to respond to future modex_recv calls. Once that is completed,
the proc returns from the orte_grpcomm.modex call and is released from the
"barrier".


So we resolve the race condition by including a "barrier" inside the modex.
This is the current behavior as well - so this represents no change, just a
different organization of the modex'd data.

> 
> --td
> 




Re: [OMPI devel] RFC: changes to modex

2008-04-02 Thread Gleb Natapov
On Wed, Apr 02, 2008 at 10:35:03AM -0400, Jeff Squyres wrote:
> If we use carto to limit which hcas/ports are used on a given host on a
> per-proc basis, then we can include some proc_send data to say "this proc
> only uses indexes X,Y,Z from the node data".  The indexes can be  
> either uint8_ts, or maybe even a variable length bitmap.
> 
So you propose that each proc will send info (using node_send()) about every
hca/proc on a host even about those that are excluded from use by the proc
just in case? And then each proc will have to send additional info (using
proc_send() this time) to indicate what hcas/ports it is actually using?

--
Gleb.


Re: [OMPI devel] RFC: changes to modex

2008-04-02 Thread Tim Prins
Is there a reason to rename ompi_modex_{send,recv} to 
ompi_modex_proc_{send,recv}? It seems simpler (and no more confusing and 
less work) to leave the names alone and add ompi_modex_node_{send,recv}.


Another question: Does the receiving process care that the information 
received applies to a whole node? I ask because maybe we could get the 
same effect by simply adding a parameter to ompi_modex_send which 
specifies if the data applies to just the proc or a whole node.


So, if we have ranks 1 & 2 on n1, and rank 3 on n2, then rank 1 would do:
ompi_modex_send("arch", arch, );
then rank 3 would do:
ompi_modex_recv(rank 1, "arch");
ompi_modex_recv(rank 2, "arch");

I don't really care either way, just wanted to throw out the idea.

Tim

Jeff Squyres wrote:

WHAT: Changes to MPI layer modex API

WHY: To be mo' betta scalable

WHERE: ompi/mpi/runtime/ompi_module_exchange.* and everywhere that  
calls ompi_modex_send() and/or ompi_modex_recv()


TIMEOUT: COB Fri 4 Apr 2008

DESCRIPTION:

Per some of the scalability discussions that have been occurring (some  
on-list and some off-list), and per the e-mail I sent out last week  
about ongoing work in the openib BTL, Ralph and I put together a loose  
proposal this morning to make the modex more scalable.  The timeout is  
fairly short because Ralph wanted to start implementing in the near  
future, and we didn't anticipate that this would be a contentious  
proposal.


The theme is to break the modex into two different kinds of data:

- Modex data that is specific to a given proc
- Modex data that is applicable to all procs on a given node

For example, in the openib BTL, the majority of modex data is  
applicable to all processes on the same node (GIDs and LIDs and  
whatnot).  It is much more efficient to send only one copy of such  
node-specific data to each process (vs. sending ppn copies to each  
process).  The spreadsheet I included in last week's e-mail clearly  
shows this.


1. Add new modex API functions.  The exact function signatures are  
TBD, but they will be generally of the form:


  * int ompi_modex_proc_send(...): send modex data that is specific to  
this process.  It is just about exactly the same as the current API  
call (ompi_modex_send).


  * int ompi_modex_proc_recv(...): receive modex data from a specified  
peer process (indexed on ompi_proc_t*).  It is just about exactly the  
same as the current API call (ompi_modex_recv).


  * int ompi_modex_node_send(...): send modex data that is relevant  
for all processes in this job on this node.  It is intended that only  
one process in a job on a node will call this function.  If more than  
one process in a job on a node calls _node_send(), then only one will  
"win" (meaning that the data sent by the others will be overwritten).


  * int ompi_modex_node_recv(...): receive modex data that is relevant  
for a whole peer node; receive the ["winning"] blob sent by  
_node_send() from the source node.  We haven't yet decided what the  
node index will be; it may be (ompi_proc_t*) (i.e., _node_recv() would  
figure out what node the (ompi_proc_t*) resides on and then give you  
the data).


2. Make the existing modex API calls (ompi_modex_send,  
ompi_modex_recv) be wrappers around the new "proc" send/receive  
calls.  This will provide exactly the same functionality as the  
current API (but be sub-optimal at scale).  It will give BTL authors  
(etc.) time to update to the new API, potentially taking advantage of  
common data across multiple processes on the same node.  We'll likely  
put in some opal_output()'s in the wrappers to help identify code that  
is still calling the old APIs.


3. Remove the old API calls (ompi_modex_send, ompi_modex_recv) before  
v1.3 is released.






Re: [OMPI devel] RFC: changes to modex

2008-04-02 Thread Jeff Squyres

On Apr 2, 2008, at 10:27 AM, Gleb Natapov wrote:
In the case of openib BTL what part of modex are you going to send  
using

proc_send() and what part using node_send()?



In the /tmp-public/openib-cpc2 branch, almost all of it will go to the  
node_send().  The CPC's will likely now get 2 buffers: one for  
node_send, and one for proc_send.


The ibcm CPC, for example, can do everything in node_send (the  
service_id that I use in the ibcm calls is the proc's PID; ORTE may  
supply peer PIDs directly -- haven't decided if that's a good idea yet  
or not -- if it doesn't, the PID can be sent in the proc_send data).   
The rdmacm CPC may need a proc_send for the listening TCP port number;  
still need to figure that one out.


If we use carto to limit which hcas/ports are used on a given host on a
per-proc basis, then we can include some proc_send data to say "this proc
only uses indexes X,Y,Z from the node data".  The indexes can be  
either uint8_ts, or maybe even a variable length bitmap.


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] RFC: changes to modex

2008-04-02 Thread Gleb Natapov
On Wed, Apr 02, 2008 at 10:21:12AM -0400, Jeff Squyres wrote:
>   * int ompi_modex_proc_send(...): send modex data that is specific to  
> this process.  It is just about exactly the same as the current API  
> call (ompi_modex_send).
> 
[skip]
> 
>   * int ompi_modex_node_send(...): send modex data that is relevant  
> for all processes in this job on this node.  It is intended that only  
> one process in a job on a node will call this function.  If more than  
> one process in a job on a node calls _node_send(), then only one will  
> "win" (meaning that the data sent by the others will be overwritten).
> 
In the case of openib BTL what part of modex are you going to send using
proc_send() and what part using node_send()?

--
Gleb.


[OMPI devel] [PATCH] Fix compilation error without XRC

2008-04-02 Thread Bernhard Fischer
Hi,

* ompi/mca/btl/openib/btl_openib_component.c (init_one_hca):
mca_btl_openib_open_xrc_domain and
mca_btl_openib_close_xrc_domain depend on XRC

Fixes the compilation failure shown at the head of the attached patch.
TIA,
Bernhard
CXX -g -finline-functions -o .libs/ompi_info components.o ompi_info.o output.o param.o version.o -Wl,--export-dynamic  ../../../ompi/.libs/libmpi.so -L/opt/infiniband/lib /opt/infiniband/lib/libibverbs.so -lpthread -lrt /home/bernhard/src/openmpi/ompi-trunk/orte/.libs/libopen-rte.so /home/bernhard/src/openmpi/ompi-trunk/opal/.libs/libopen-pal.so -ldl -lnuma -lnsl -lutil  -Wl,-rpath,/opt/libs//openmpi-1.3.0.a1.r18069-INTEL-10.1.013-64/lib -Wl,-rpath,/opt/infiniband/lib
../../../ompi/.libs/libmpi.so: undefined reference to `mca_btl_openib_close_xrc_domain'
../../../ompi/.libs/libmpi.so: undefined reference to `mca_btl_openib_open_xrc_domain'
make[2]: *** [ompi_info] Error 1

Index: ompi-trunk/ompi/mca/btl/openib/btl_openib_component.c
===
--- ompi-trunk/ompi/mca/btl/openib/btl_openib_component.c	(revision 18069)
+++ ompi-trunk/ompi/mca/btl/openib/btl_openib_component.c	(working copy)
@@ -1012,12 +1012,14 @@ static int init_one_hca(opal_list_t *btl
 goto error;
 }

+#if HAVE_XRC
 if (MCA_BTL_XRC_ENABLED) {
 if (OMPI_SUCCESS != mca_btl_openib_open_xrc_domain(hca)) {
 BTL_ERROR(("XRC Internal error. Failed to open xrc domain"));
 goto error;
 }
 }
+#endif

 mpool_resources.reg_data = (void*)hca;
 mpool_resources.sizeof_reg = sizeof(mca_btl_openib_reg_t);
@@ -1103,11 +1105,13 @@ error:
 #endif
 if(hca->mpool)
 mca_mpool_base_module_destroy(hca->mpool);
+#if HAVE_XRC
 if (MCA_BTL_XRC_ENABLED) {
 if(OMPI_SUCCESS != mca_btl_openib_close_xrc_domain(hca)) {
 BTL_ERROR(("XRC Internal error. Failed to close xrc domain"));
 }
 }
+#endif
 if(hca->ib_pd)
 ibv_dealloc_pd(hca->ib_pd);
 if(hca->ib_dev_context)


Re: [OMPI devel] --disable-ipv6 broken on trunk

2008-04-02 Thread Josh Hursey

Great. Thanks for the fix.

On Apr 2, 2008, at 6:54 AM, Adrian Knoth wrote:

On Wed, Apr 02, 2008 at 06:36:02AM -0400, Josh Hursey wrote:


It seems that builds configured with '--disable-ipv6' are broken on
the trunk. I suspect r18055 for this break since the tarball from two
---
oob_tcp.c: In function `mca_oob_tcp_fini':
oob_tcp.c:1364: error: structure has no member named `tcp6_listen_sd'
oob_tcp.c:1365: error: structure has no member named  
`tcp6_recv_event'

---
Can someone take a look at this?


Fixed in r18071. Thanks for the observation.


--
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de





Re: [OMPI devel] --disable-ipv6 broken on trunk

2008-04-02 Thread Adrian Knoth
On Wed, Apr 02, 2008 at 06:36:02AM -0400, Josh Hursey wrote:

> It seems that builds configured with '--disable-ipv6' are broken on  
> the trunk. I suspect r18055 for this break since the tarball from two  
> ---
> oob_tcp.c: In function `mca_oob_tcp_fini':
> oob_tcp.c:1364: error: structure has no member named `tcp6_listen_sd'
> oob_tcp.c:1365: error: structure has no member named `tcp6_recv_event'
> ---
> Can someone take a look at this?

Fixed in r18071. Thanks for the observation.


-- 
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de