add_procs is always called at least once. This is how we set up shared
memory communication. It will then be invoked on-demand for non-local
peers with the reachability argument set to NULL (because the bitmask
doesn't provide any benefit when adding only 1 peer).

-Nathan

On Tue, May 17, 2016 at 12:00:38AM +0900, Gilles Gouaillardet wrote:
>    Jeff,
>    this is not what I observed
>    (tcp btl, 2 to 4 nodes with one task per node, cutoff=0)
>    the add_procs of the tcp btl is invoked once with the 4 tasks.
>    I checked the sources and found cutoff only controls if the modex is
>    invoked once for all at init, or on demand.
>    Cheers,
>    Gilles
> 
>    On Monday, May 16, 2016, Jeff Squyres (jsquyres) <jsquy...@cisco.com>
>    wrote:
> 
>      We changed the way BTL add_procs is invoked on master and v2.x for
>      scalability reasons.
> 
>      In short: add_procs is only invoked the first time you talk to a given
>      peer.  The cutoff switch is an override to that -- if the size of
>      COMM_WORLD is less than the cutoff, we revert to the old behavior of
>      calling add_procs for all procs.
> 
>      As for why one BTL would be chosen over another, be sure to look at not
>      only the priority of the component/module, but also the exclusivity
>      level.  In short, only BTLs with the same exclusivity level will be
>      considered (e.g., this is how we exclude TCP when using HPC-class
>      networks), and then the BTL modules with the highest priority will be
>      used for a given peer.
> 
>      > On May 16, 2016, at 7:19 AM, Gilles Gouaillardet
>      <gilles.gouaillar...@gmail.com> wrote:
>      >
>      > it seems I misunderstood some things ...
>      >
>      > add_procs is always invoked, regardless of the cutoff value.
>      > The cutoff controls whether process info is retrieved via the modex
>      > "on demand" or at init time.
>      >
>      > Please someone correct me and/or elaborate if needed
>      >
>      > Cheers,
>      >
>      > Gilles
>      >
>      > On Monday, May 16, 2016, Gilles Gouaillardet <gil...@rist.or.jp>
>      wrote:
>      > i cannot reproduce this behavior.
>      >
>      > note mca_btl_tcp_add_procs is invoked once per tcp component (e.g.
>      once per physical NIC)
>      >
>      > so you might want to explicitly select one nic
>      >
>      > mpirun --mca btl_tcp_if_include xxx ...
>      >
>      > my printf output is the same regardless of the mpi_add_procs_cutoff
>      > value
>      >
>      >
>      > Cheers,
>      >
>      >
>      > Gilles
>      > On 5/16/2016 12:22 AM, dpchoudh . wrote:
>      >> Sorry, I accidentally pressed 'Send' before I was done writing the
>      last mail. What I wanted to ask was what is the parameter
>      mpi_add_procs_cutoff and why adding it seems to make a difference in the
>      code path but not in the end result of the program? How would it help me
>      debug my problem?
>      >>
>      >> Thank you
>      >> Durga
>      >>
>      >> The surgeon general advises you to eat right, exercise regularly and
>      quit ageing.
>      >>
>      >> On Sun, May 15, 2016 at 11:17 AM, dpchoudh . <dpcho...@gmail.com>
>      wrote:
>      >> Hello Gilles
>      >>
>      >> Setting -mca mpi_add_procs_cutoff 1024 indeed makes a difference to
>      the output, as follows:
>      >>
>      >> With -mca mpi_add_procs_cutoff 1024:
>      >> reachable =     0x1
>      >> (Note that add_procs was called once and the value of 'reachable'
>      >> is correct.)
>      >>
>      >> Without -mca mpi_add_procs_cutoff 1024
>      >> reachable =     0x0
>      >> reachable = NULL
>      >> reachable = NULL
>      >> (Note that add_procs() was called three times and the value of
>      >> 'reachable' seems wrong.)
>      >>
>      >> The program does run correctly in either case. The program listing is
>      as below (note that I have removed output from the program itself in the
>      above reporting.)
>      >>
>      >> The code that prints 'reachable' is as follows:
>      >>
>      >> if (reachable == NULL)
>      >>     printf("reachable = NULL\n");
>      >> else
>      >> {
>      >>     int i;
>      >>     printf("reachable = ");
>      >>     for (i = 0; i < reachable->array_size; i++)
>      >>         /* hex format to match the 0x prefix */
>      >>         printf("\t0x%llx", (unsigned long long) reachable->bitmap[i]);
>      >>     printf("\n\n");
>      >> }
>      >> return OPAL_SUCCESS;
>      >>
>      >> And the code for the test program is as follows:
>      >>
>      >> #include <mpi.h>
>      >> #include <stdio.h>
>      >> #include <string.h>
>      >> #include <stdlib.h>
>      >>
>      >> int main(int argc, char *argv[])
>      >> {
>      >>     int world_size, world_rank, name_len;
>      >>     char hostname[MPI_MAX_PROCESSOR_NAME], buf[8];
>      >>
>      >>     MPI_Init(&argc, &argv);
>      >>     MPI_Comm_size(MPI_COMM_WORLD, &world_size);
>      >>     MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
>      >>     MPI_Get_processor_name(hostname, &name_len);
>      >>     printf("Hello world from processor %s, rank %d out of %d processors\n",
>      >>            hostname, world_rank, world_size);
>      >>     if (world_rank == 1)
>      >>     {
>      >>         MPI_Recv(buf, 6, MPI_CHAR, 0, 99, MPI_COMM_WORLD,
>      >>                  MPI_STATUS_IGNORE);
>      >>         printf("%s received %s, rank %d\n", hostname, buf, world_rank);
>      >>     }
>      >>     else
>      >>     {
>      >>         strcpy(buf, "haha!");
>      >>         MPI_Send(buf, 6, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
>      >>         printf("%s sent %s, rank %d\n", hostname, buf, world_rank);
>      >>     }
>      >>     MPI_Barrier(MPI_COMM_WORLD);
>      >>     MPI_Finalize();
>      >>     return 0;
>      >> }
>      >>
>      >> On Sun, May 15, 2016 at 10:49 AM, Gilles Gouaillardet
>      <gilles.gouaillar...@gmail.com> wrote:
>      >> At first glance, that seems a bit odd...
>      >> are you sure you correctly print the reachable bitmap ?
>      >> I would suggest you add some instrumentation to understand what
>      happens
>      >> (e.g., printf before opal_bitmap_set_bit() and other places that
>      prevent this from happening)
>      >>
>      >> one more thing ...
>      >> now, master default behavior is
>      >> mpirun --mca mpi_add_procs_cutoff 0 ...
>      >> you might want to try
>      >> mpirun --mca mpi_add_procs_cutoff 1024 ...
>      >> and see if things make more sense.
>      >> if it helps, and iirc, there is a parameter so a btl can report it
>      does not support cutoff.
>      >>
>      >>
>      >> Cheers,
>      >>
>      >> Gilles
>      >>
>      >> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
>      >> Hello Gilles
>      >>
>      >> Thanks for jumping in to help again. Actually, I had already tried
>      some of your suggestions before asking for help.
>      >>
>      >> I have several interconnects that can run both openib and tcp BTL. To
>      simplify things, I explicitly mentioned TCP:
>      >>
>      >> mpirun -np 2 -hostfile ~/hostfile -mca pml ob1 -mca btl self,tcp ./mpitest
>      >>
>      >> where mpitest is a small program that does MPI_Send()/MPI_Recv() on a
>      small string, and then does an MPI_Barrier(). The program does work as
>      expected.
>      >>
>      >> I put a printf on the last line of mca_btl_tcp_add_procs() to print the
>      >> value of 'reachable'. What I saw was that the value was always 0 when it
>      >> was invoked for Send()/Recv(), and the pointer itself was NULL when it
>      >> was invoked for Barrier().
>      >>
>      >> Next I looked at pml_ob1_add_procs(), where the call chain starts,
>      and found that it initializes and passes an opal_bitmap_t reachable down
>      the call chain, but the resulting value is not used later in the code
>      (the memory is simply freed later).
>      >>
>      >> That, coupled with the fact that I am trying to imitate what the
>      >> other BTL implementations are doing, yet in
>      >> mca_bml_r2_endpoint_add_btl() my BTL is not being picked up, left me
>      >> puzzled. Please note that the interconnect that I am developing for is
>      >> on a different cluster (than the one where I ran the above test for
>      >> the TCP BTL).
>      >>
>      >> Thanks again
>      >> Durga
>      >>
>      >> On Sun, May 15, 2016 at 10:20 AM, Gilles Gouaillardet
>      <gilles.gouaillar...@gmail.com> wrote:
>      >> did you check the add_procs callbacks ?
>      >> (e.g. mca_btl_tcp_add_procs() for the tcp btl)
>      >> this is where the reachable bitmap is set, and I guess this is what
>      you are looking for.
>      >>
>      >> keep in mind that if several btl can be used, the one with the higher
>      exclusivity is used
>      >> (e.g. tcp is never used if openib is available)
>      >> you can simply force your btl and self, and the ob1 pml, so you do
>      not have to worry about other btl exclusivity.
>      >>
>      >> Cheers,
>      >>
>      >> Gilles
>      >>
>      >>
>      >> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
>      >> Hello all
>      >>
>      >> I have been struggling with this issue for a while and figured it
>      might be a good idea to ask for help.
>      >>
>      >> Where (in the code path) is the connectivity map created?
>      >>
>      >> I can see that it is *used* in mca_bml_r2_endpoint_add_btl(), but
>      obviously I am not setting it up right, because this routine is not
>      finding the BTL corresponding to my interconnect.
>      >>
>      >> Thanks in advance
>      >> Durga
>      >>
>      >> _______________________________________________
>      >> devel mailing list
>      >> de...@open-mpi.org
>      >> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>      >> Link to this post:
>      http://www.open-mpi.org/community/lists/devel/2016/05/18975.php
>      >
> 
>      --
>      Jeff Squyres
>      jsquy...@cisco.com
>      For corporate legal information go to:
>      http://www.cisco.com/web/about/doing_business/legal/cri/
> 

