Hi George,

Agreed -- I should have referenced the RFC that I sent out last year. Sorry 
about not reposting/explicitly mentioning the old RFC from about 5 months ago.

I'm willing to sit down with you and others so we can chat further about the 
change.

Ralph is correct -- the plan is to have only one rank per node send the 
information required for sm initialization and have the rest consume it.
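
To make that plan concrete, here is a rough sketch in Python. It is a toy model
only, not the sm BTL code: the dictionary standing in for the modex, the choice
of the lowest rank on each node as the publisher, and the names
sm_init_via_modex and make_segment_info are all assumptions made for
illustration.

```python
from collections import defaultdict


def sm_init_via_modex(rank_to_node, make_segment_info):
    """Toy model: one rank per node publishes sm init info, the rest read it.

    rank_to_node      -- dict mapping rank -> node name
    make_segment_info -- hypothetical callback returning a dict of the
                         entries a node's publisher would send
    """
    # Hypothetical stand-in for the modex: (publisher_rank, key) -> value
    modex = {}

    # Group ranks by node so one publisher per node can be chosen.
    ranks_on_node = defaultdict(list)
    for rank, node in sorted(rank_to_node.items()):
        ranks_on_node[node].append(rank)

    # Publish phase: only the lowest rank on each node stores entries
    # (assumption: any deterministic choice of publisher would do).
    publisher_of = {}
    for node, ranks in ranks_on_node.items():
        publisher = ranks[0]
        publisher_of[node] = publisher
        for key, value in make_segment_info(node).items():
            modex[(publisher, key)] = value

    # Consume phase: every rank reads the entries of its node's publisher.
    view = {}
    for rank, node in rank_to_node.items():
        publisher = publisher_of[node]
        view[rank] = {k: v for (r, k), v in modex.items() if r == publisher}
    return modex, view


def example_segment_info(node):
    # Made-up payload; the real entries differ.
    return {"segment_path": "/tmp/sm_segment." + node, "segment_size": 4096}


# Same 2x2 layout as the run quoted below: ranks 0 and 2 on dancer01,
# ranks 1 and 3 on dancer02.
layout = {0: "dancer01", 1: "dancer02", 2: "dancer01", 3: "dancer02"}
modex, view = sm_init_via_modex(layout, example_segment_info)
```

In this model the number of published entries grows with the number of nodes,
not the number of ranks, which is the property the change is aiming for.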

If required, I'm willing to back out the commit until a better way is formulated.

Thanks,

Sam

________________________________________
From: devel-boun...@open-mpi.org [devel-boun...@open-mpi.org] on behalf of 
George Bosilca [bosi...@icl.utk.edu]
Sent: Friday, January 04, 2013 1:57 PM
To: de...@open-mpi.org
Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27739 - in trunk: 
ompi/mca/btl/sm ompi/mca/common/sm ompi/mca/mpool/sm opal/mca/shmem 
opal/mca/shmem/mmap opal/mca/shmem/posix opal/mca/shmem/sysv 
opal/mca/shmem/windows

Sam,

This is a major change and would have deserved an RFC, as it imposes a 
drastic, non-scalable change (up to now the backend file creation was 
centralized; now, in addition, we exchange the data through the modex). A quick 
look highlights the fact that quite a lot of new modex entries have appeared 
after this patch. On a 4-proc run (2x2) we got more than 20 entries, each of 
them up to 32 bytes (the list is attached at the end of this email).

Clearly this new approach is significantly less scalable than the old one. In 
the past we had issues adding one single integer per process; I fail to 
understand how our standards changed so much that a few hundred bytes per 
process have now become acceptable. Moreover, what benefit does this change 
provide in exchange for this loss of scalability?

  George.

PS: The exhaustive list of new SM-related modex entries:
[dancer01:01049] [[50563,1],0] db:hash:store: storing key 
btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing key 
btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing key 
btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing key 
btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing key 
btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key 
btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key 
btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key 
btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key 
btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing key 
btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key 
btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key 
btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key 
btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key 
btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],0]
[dancer02:01720] [[50563,1],1] db:hash:store: storing pointer of key 
btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],0]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],0]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],0]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key 
btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key 
btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],1]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key 
btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key 
btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],1]
[dancer01:01049] [[50563,1],0] db:hash:store: storing pointer of key 
btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],1]
[dancer02:01721] [[50563,1],3] db:hash:store: storing pointer of key 
btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-0-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-0-1[OPAL_STRING] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-1-0[OPAL_BYTE_OBJECT] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-1-1[OPAL_STRING] for proc [[50563,1],1]
[dancer01:01050] [[50563,1],2] db:hash:store: storing pointer of key 
btl.sm.1.9-2[OPAL_BYTE_OBJECT] for proc [[50563,1],1]


On Jan 3, 2013, at 22:52 , svn-commit-mai...@open-mpi.org wrote:

> Author: samuel (Samuel K. Gutierrez)
> Date: 2013-01-03 16:52:20 EST (Thu, 03 Jan 2013)
> New Revision: 27739
> URL: https://svn.open-mpi.org/trac/ompi/changeset/27739
>
> Log:
> sm BTL initialization via modex, as discussed at last year's meeting.
>
> Text files modified:
>   trunk/ompi/mca/btl/sm/btl_sm.c                      |   337 
> +++++++++++++++++++++--------
>   trunk/ompi/mca/btl/sm/btl_sm.h                      |    60 +++++
>   trunk/ompi/mca/btl/sm/btl_sm_component.c            |   444 
> ++++++++++++++++++++++++++++++++++++++-
>   trunk/ompi/mca/btl/sm/help-mpi-btl-sm.txt           |     6
>   trunk/ompi/mca/common/sm/common_sm.c                |    92 +++++--
>   trunk/ompi/mca/common/sm/common_sm.h                |    45 +++
>   trunk/ompi/mca/mpool/sm/mpool_sm.h                  |    17
>   trunk/ompi/mca/mpool/sm/mpool_sm_component.c        |   111 ++++-----
>   trunk/opal/mca/shmem/mmap/shmem_mmap_module.c       |     7
>   trunk/opal/mca/shmem/posix/shmem_posix_module.c     |     9
>   trunk/opal/mca/shmem/shmem_types.h                  |    36 ++
>   trunk/opal/mca/shmem/sysv/shmem_sysv_module.c       |    11
>   trunk/opal/mca/shmem/windows/shmem_windows_module.c |     7
>   13 files changed, 933 insertions(+), 249 deletions(-)


_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
