Thanks for the fix. I can confirm that the trunk is working again with the unity component.

I agree that I should support the 'tree' component, but I probably won't be able to get to it for another couple of weeks.

Thanks again,
Josh

On Apr 9, 2008, at 10:51 PM, Ralph Castain wrote:

Okay, the irony here is truly humorous. This took several hours to chase
down.

As you may recall, we had an earlier problem with the unity routed module where I gave you a couple of options for repairing it. Well, it turned out that the latest changes obviated the need for that hack...and so the hack
caused the system to fail.

So, having now removed the prior hack required to keep the module alive, you
should find it happy again!

BTW: it isn't that the unity module is such a pain in itself. The problem
lies in our efforts to shift data movement to the daemon level for
scalability, versus the inherent "everything happens directly between the apps" approach of the unity module. As we move more and more things to the daemon level, we are achieving the scalability we want - it just makes it harder to find a way to blend the conflicting approach in unity so it can
keep running.

I believe we have now reached a point, though, where it may be easier to
keep that module alive. Everything we need to shift to the daemons has been
shifted, so I don't believe unity is going to present as much of a
problem going forward.

I still think it would be good for you to get C/R to work with non-unity
routed modules for scalability reasons - unity is still inherently
non-scalable. But hopefully it won't be as much of a roller-coaster for you
as we go forward.

Thanks for the patience
Ralph


On 4/9/08 5:15 PM, "Ralph Castain" <r...@lanl.gov> wrote:

Groan...yes, will look at it this evening and get it fixed as quickly as I
can.

Sorry...like I said, unity is getting harder and harder to keep alive. :-/

Ralph


On 4/9/08 5:01 PM, "Josh Hursey" <jjhur...@open-mpi.org> wrote:

Ralph,

It seems that the 'unity' component of the routed framework is broken
as a result of this commit. :(

Any chance you can take a look at this?

Thanks,
Josh

On Apr 9, 2008, at 6:10 PM, r...@osl.iu.edu wrote:
Author: rhc
Date: 2008-04-09 18:10:53 EDT (Wed, 09 Apr 2008)
New Revision: 18115
URL: https://svn.open-mpi.org/trac/ompi/changeset/18115

Log:
Fully implement the inbound binomial allgather for daemon-based
collectives. Supports both modex and barrier operations.

Comm_spawn still uses the rank=0 method - shifting that algo to the
daemons is under study.
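
For readers unfamiliar with the term, the inbound half of a binomial-tree collective can be sketched as below. This is a generic illustration of the technique and is not taken from the ORTE sources: the function names (parent, children, inbound_gather) are hypothetical, ranks are assumed to be numbered 0..N-1 with rank 0 as the root, and the outbound step that would return the combined data to every participant is left out.

```python
def parent(rank):
    """Parent of `rank` in a binomial tree rooted at 0 (clear the lowest set bit)."""
    return rank & (rank - 1)


def children(rank, size):
    """Children of `rank`: rank + 2**k for each 2**k below rank's lowest set bit."""
    kids = []
    bit = 1
    # For the root (rank 0) every power of two below `size` qualifies.
    while (rank == 0 or bit < (rank & -rank)) and rank + bit < size:
        kids.append(rank + bit)
        bit <<= 1
    return kids


def inbound_gather(contributions):
    """Simulate the leaves-to-root phase; return (data held by root, messages sent)."""
    size = len(contributions)
    # Each node starts out holding only its own contribution, keyed by rank.
    held = [{rank: value} for rank, value in enumerate(contributions)]
    messages = 0
    # A child's rank is always greater than its parent's, so walking ranks
    # from highest to lowest ensures a node has merged its whole subtree
    # before it forwards the result to its own parent.
    for rank in range(size - 1, 0, -1):
        held[parent(rank)].update(held[rank])
        messages += 1
    return held[0], messages


if __name__ == "__main__":
    data, sent = inbound_gather(["a", "b", "c", "d", "e", "f"])
    print(sorted(data.items()), sent)
```

In this sketch every non-root node sends exactly one message, so N participants need N-1 messages to reach the root, and the root receives from only about log2(N) direct children.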


Removed:
 trunk/orte/mca/grpcomm/base/grpcomm_base_barrier.c
 trunk/orte/mca/grpcomm/exp/
Text files modified:
 trunk/ompi/mca/pml/ob1/pml_ob1.c                             |   1
 trunk/orte/mca/ess/hnp/ess_hnp_module.c                      |   2
 trunk/orte/mca/grpcomm/base/Makefile.am                      |   1
 trunk/orte/mca/grpcomm/base/base.h                           |   3
 trunk/orte/mca/grpcomm/base/grpcomm_base_allgather.c         | 253 -----------
 trunk/orte/mca/grpcomm/basic/grpcomm_basic_component.c       |   4
 trunk/orte/mca/grpcomm/basic/grpcomm_basic_module.c          | 832 ++++++++++++++++++++++++++++++++++-----
 trunk/orte/mca/grpcomm/cnos/grpcomm_cnos_module.c            |   8
 trunk/orte/mca/grpcomm/grpcomm.h                             |  27 +
 trunk/orte/mca/grpcomm/grpcomm_types.h                       |   8
 trunk/orte/mca/odls/base/odls_base_close.c                   |   1
 trunk/orte/mca/odls/base/odls_base_default_fns.c             | 131 ++++-
 trunk/orte/mca/odls/base/odls_base_open.c                    |  24 +
 trunk/orte/mca/odls/base/odls_private.h                      |  16
 trunk/orte/mca/plm/base/plm_base_launch_support.c            |   7
 trunk/orte/mca/rmaps/base/rmaps_base_map_job.c               |   1
 trunk/orte/mca/rmaps/base/rmaps_base_open.c                  |   4
 trunk/orte/mca/rmaps/base/rmaps_base_support_fns.c           | 186 +-------
 trunk/orte/mca/rmaps/base/rmaps_private.h                    |   2
 trunk/orte/mca/rmaps/rank_file/rmaps_rank_file.c             |   2
 trunk/orte/mca/rmaps/rmaps_types.h                           |  28 +
 trunk/orte/mca/rmaps/round_robin/rmaps_rr.c                  |   8
 trunk/orte/mca/rmaps/seq/rmaps_seq.c                         |   2
 trunk/orte/mca/rml/rml_types.h                               |  36
 trunk/orte/orted/orted_comm.c                                |  43 +-
 trunk/orte/runtime/data_type_support/orte_dt_copy_fns.c      |   2
 trunk/orte/runtime/data_type_support/orte_dt_packing_fns.c   |   4
 trunk/orte/runtime/data_type_support/orte_dt_print_fns.c     |   4
 trunk/orte/runtime/data_type_support/orte_dt_unpacking_fns.c |   4
 trunk/orte/runtime/orte_globals.c                            |   3
 trunk/orte/runtime/orte_globals.h                            |   1
 trunk/orte/runtime/orte_globals_class_instances.h            |   2
 32 files changed, 1019 insertions(+), 631 deletions(-)


Diff not shown due to size (106446 bytes).
To see the diff, run the following command:

svn diff -r 18114:18115 --no-diff-deleted

_______________________________________________
svn mailing list
s...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/svn



_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

