Re: [OMPI users] Enforcing specific interface and subnet usage
Sorry for the late response. I just wanted to let you know that I found another workaround, unrelated to the method we discussed here.

On 19/06/18 15:26, r...@open-mpi.org wrote:
> The OMPI cmd line converts "--mca ptl_tcp_remote_connections 1" to
> OMPI_MCA_ptl_tcp_remote_connections, which is not recognized by PMIx. PMIx
> is looking for PMIX_MCA_ptl_tcp_remote_connections. The only way to set PMIx
> MCA params for the code embedded in OMPI is to put them in your environment.
>
>> On Jun 19, 2018, at 2:08 AM, Maksym Planeta wrote:
>>
>> But what about the remote connections parameter? Why is it not set?
>>
>> On 19/06/18 00:58, r...@open-mpi.org wrote:
>>> I’m not entirely sure I understand what you are trying to do. The
>>> PMIX_SERVER_URI2 envar tells local clients how to connect to their local
>>> PMIx server (i.e., the OMPI daemon on that node). This is always done over
>>> the loopback device since it is a purely local connection that is never
>>> used for MPI messages.
>>> I’m sure that the tcp/btl is using your indicated subnet as that would be
>>> used for internode messages.

--
Regards,
Maksym Planeta
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
Re: [OMPI users] Enforcing specific interface and subnet usage
The OMPI cmd line converts "--mca ptl_tcp_remote_connections 1" to OMPI_MCA_ptl_tcp_remote_connections, which is not recognized by PMIx. PMIx is looking for PMIX_MCA_ptl_tcp_remote_connections. The only way to set PMIx MCA params for the code embedded in OMPI is to put them in your environment.

> On Jun 19, 2018, at 2:08 AM, Maksym Planeta wrote:
>
> But what about the remote connections parameter? Why is it not set?
>
> On 19/06/18 00:58, r...@open-mpi.org wrote:
>> I’m not entirely sure I understand what you are trying to do. The
>> PMIX_SERVER_URI2 envar tells local clients how to connect to their local
>> PMIx server (i.e., the OMPI daemon on that node). This is always done over
>> the loopback device since it is a purely local connection that is never used
>> for MPI messages.
>> I’m sure that the tcp/btl is using your indicated subnet as that would be
>> used for internode messages.
>
> --
> Regards,
> Maksym Planeta
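The naming difference described above can be sketched in a few lines of shell. This is only an illustration of the explanation in this message: the helper functions `ompi_env_name` and `pmix_env_name` are hypothetical names made up for the sketch, and whether the exported variable also needs to be forwarded to daemons on remote nodes is not covered in this thread.

```shell
#!/bin/sh
# Hypothetical helpers: the environment variable name each layer looks for,
# given an MCA parameter name (prefixes taken from the explanation above).
ompi_env_name() { echo "OMPI_MCA_$1"; }
pmix_env_name() { echo "PMIX_MCA_$1"; }

param=ptl_tcp_remote_connections

# What "mpirun --mca ptl_tcp_remote_connections 1" ends up setting:
echo "mpirun --mca sets:   $(ompi_env_name "$param")=1"
# What the PMIx code embedded in OMPI actually reads:
echo "embedded PMIx reads: $(pmix_env_name "$param")"

# So the parameter has to be placed in the environment directly,
# before mpirun is started from this shell.
export PMIX_MCA_ptl_tcp_remote_connections=1
```

After the `export`, mpirun would be launched from the same shell as usual, without the `--mca ptl_tcp_remote_connections 1` argument.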
Re: [OMPI users] Enforcing specific interface and subnet usage
What exactly is the issue you are facing?

You also need to force the subnet used by oob/tcp:

mpirun --mca oob_tcp_if_include 10.233.0.0/19 ...

iirc, Open MPI might discard addresses from a bridge interface, but I do not remember exactly whether that affects btl/tcp, oob/tcp, both, or neither by default.

Cheers,

Gilles

On Tuesday, June 19, 2018, Maksym Planeta wrote:
> Hello,
>
> I want to force OpenMPI to use TCP and, in particular, a specific subnet.
> Unfortunately, I can't manage to do that.
>
> Here is what I tried:
>
> $BIN/mpirun --mca pml ob1 --mca btl tcp,self --mca ptl_tcp_remote_connections 1 --mca btl_tcp_if_include '10.233.0.0/19' -np 4 --oversubscribe -H ib1n,ib2n bash -c 'echo $PMIX_SERVER_URI2'
>
> The expected result would be a list of IP addresses in the 10.233.0.0/19
> subnet, but instead I get this:
>
> 2659516416.2;tcp4://127.0.0.1:46777
> 2659516416.2;tcp4://127.0.0.1:46777
> 2659516416.1;tcp4://127.0.0.1:45055
> 2659516416.1;tcp4://127.0.0.1:45055
>
> Could you help me to debug this problem somehow?
>
> The IP addresses are available in the desired subnet:
>
> $BIN/mpirun --mca pml ob1 --mca btl tcp,self --mca ptl_tcp_remote_connections 1 --mca btl_tcp_if_include '10.233.0.0/19' -np 4 --oversubscribe -H ib1n,ib2n ip addr show dev br0
>
> Returns a set of bridges looking like:
>
> 9: br0: mtu 1500 qdisc noqueue state UP group default qlen 1000
>     link/ether 94:de:80:ba:37:e4 brd ff:ff:ff:ff:ff:ff
>     inet 141.76.49.17/26 brd 141.76.49.63 scope global br0
>        valid_lft forever preferred_lft forever
>     inet 10.233.0.82/19 scope global br0
>        valid_lft forever preferred_lft forever
>     inet6 2002:8d4c:3001:48:40de:80ff:feba:37e4/64 scope global deprecated mngtmpaddr dynamic
>        valid_lft 59528sec preferred_lft 0sec
>     inet6 fe80::96de:80ff:feba:37e4/64 scope link tentative dadfailed
>        valid_lft forever preferred_lft forever
>
> What is more puzzling is that if I attach with a debugger at
> opal/mca/pmix/pmix3x/pmix/src/mca/ptl/tcp/ptl_tcp_components.c around
> line 500 I see that mca_ptl_tcp_component.remote_connections is false.
> This means that the way I set up component parameters is ignored.
>
> --
> Regards,
> Maksym Planeta
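Putting the oob/tcp suggestion together with the btl/tcp setting from the original command might look like the sketch below. It only assembles and prints the command line; the hosts `ib1n,ib2n` and the subnet come from the thread, while the `SUBNET` and `cmd` variables and the use of `hostname` as the launched program are placeholders chosen for illustration.

```shell
#!/bin/sh
# Restrict both the MPI traffic (btl/tcp) and the runtime's out-of-band
# channel (oob/tcp) to the same subnet.
SUBNET='10.233.0.0/19'

cmd="mpirun --mca pml ob1 --mca btl tcp,self \
--mca btl_tcp_if_include $SUBNET \
--mca oob_tcp_if_include $SUBNET \
-np 4 --oversubscribe -H ib1n,ib2n hostname"

# Dry run: show the command instead of launching it.
echo "$cmd"

# To launch for real on a cluster where the hosts exist: eval "$cmd"
```

Keeping the subnet in one variable avoids the two `*_if_include` values drifting apart.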
Re: [OMPI users] Enforcing specific interface and subnet usage
But what about the remote connections parameter? Why is it not set?

On 19/06/18 00:58, r...@open-mpi.org wrote:
> I’m not entirely sure I understand what you are trying to do. The
> PMIX_SERVER_URI2 envar tells local clients how to connect to their local
> PMIx server (i.e., the OMPI daemon on that node). This is always done over
> the loopback device since it is a purely local connection that is never used
> for MPI messages.
> I’m sure that the tcp/btl is using your indicated subnet as that would be
> used for internode messages.

--
Regards,
Maksym Planeta
Re: [OMPI users] Enforcing specific interface and subnet usage
I’m not entirely sure I understand what you are trying to do. The PMIX_SERVER_URI2 envar tells local clients how to connect to their local PMIx server (i.e., the OMPI daemon on that node). This is always done over the loopback device since it is a purely local connection that is never used for MPI messages.

I’m sure that the tcp/btl is using your indicated subnet as that would be used for internode messages.

> On Jun 18, 2018, at 3:52 PM, Maksym Planeta wrote:
>
> Hello,
>
> I want to force OpenMPI to use TCP and, in particular, a specific subnet.
> Unfortunately, I can't manage to do that.
>
> Here is what I tried:
>
> $BIN/mpirun --mca pml ob1 --mca btl tcp,self --mca ptl_tcp_remote_connections 1 --mca btl_tcp_if_include '10.233.0.0/19' -np 4 --oversubscribe -H ib1n,ib2n bash -c 'echo $PMIX_SERVER_URI2'
>
> The expected result would be a list of IP addresses in the 10.233.0.0/19
> subnet, but instead I get this:
>
> 2659516416.2;tcp4://127.0.0.1:46777
> 2659516416.2;tcp4://127.0.0.1:46777
> 2659516416.1;tcp4://127.0.0.1:45055
> 2659516416.1;tcp4://127.0.0.1:45055
>
> Could you help me to debug this problem somehow?
>
> The IP addresses are available in the desired subnet:
>
> $BIN/mpirun --mca pml ob1 --mca btl tcp,self --mca ptl_tcp_remote_connections 1 --mca btl_tcp_if_include '10.233.0.0/19' -np 4 --oversubscribe -H ib1n,ib2n ip addr show dev br0
>
> Returns a set of bridges looking like:
>
> 9: br0: mtu 1500 qdisc noqueue state UP group default qlen 1000
>     link/ether 94:de:80:ba:37:e4 brd ff:ff:ff:ff:ff:ff
>     inet 141.76.49.17/26 brd 141.76.49.63 scope global br0
>        valid_lft forever preferred_lft forever
>     inet 10.233.0.82/19 scope global br0
>        valid_lft forever preferred_lft forever
>     inet6 2002:8d4c:3001:48:40de:80ff:feba:37e4/64 scope global deprecated mngtmpaddr dynamic
>        valid_lft 59528sec preferred_lft 0sec
>     inet6 fe80::96de:80ff:feba:37e4/64 scope link tentative dadfailed
>        valid_lft forever preferred_lft forever
>
> What is more puzzling is that if I attach with a debugger at
> opal/mca/pmix/pmix3x/pmix/src/mca/ptl/tcp/ptl_tcp_components.c around line
> 500 I see that mca_ptl_tcp_component.remote_connections is false. This means
> that the way I set up component parameters is ignored.
>
> --
> Regards,
> Maksym Planeta
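To see at a glance which address a PMIX_SERVER_URI2 value points to, the string can be split with plain shell parameter expansion. This is a sketch that assumes the layout seen in the output quoted above (an identifier, a `;`, then `tcp4://host:port`); the identifier is treated as opaque, and the variable names are chosen for illustration only.

```shell
#!/bin/sh
# Sample value copied from the output quoted in this thread.
uri='2659516416.2;tcp4://127.0.0.1:46777'

id=${uri%%;*}              # everything before the first ';'
endpoint=${uri#*;}         # everything after the first ';'
hostport=${endpoint#*://}  # drop the "tcp4://" scheme prefix
host=${hostport%:*}        # address part
port=${hostport##*:}       # port part

# 127.0.0.0/8 is the loopback range, which is what a purely local
# client-to-server connection is expected to use.
case $host in
  127.*) kind=loopback ;;
  *)     kind=non-loopback ;;
esac

echo "server id: $id, host: $host, port: $port ($kind)"
```

A loopback address here is consistent with the explanation above and says nothing about which interface btl/tcp uses for internode MPI traffic.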
[OMPI users] Enforcing specific interface and subnet usage
Hello,

I want to force OpenMPI to use TCP and, in particular, a specific subnet. Unfortunately, I can't manage to do that.

Here is what I tried:

$BIN/mpirun --mca pml ob1 --mca btl tcp,self --mca ptl_tcp_remote_connections 1 --mca btl_tcp_if_include '10.233.0.0/19' -np 4 --oversubscribe -H ib1n,ib2n bash -c 'echo $PMIX_SERVER_URI2'

The expected result would be a list of IP addresses in the 10.233.0.0/19 subnet, but instead I get this:

2659516416.2;tcp4://127.0.0.1:46777
2659516416.2;tcp4://127.0.0.1:46777
2659516416.1;tcp4://127.0.0.1:45055
2659516416.1;tcp4://127.0.0.1:45055

Could you help me to debug this problem somehow?

The IP addresses are available in the desired subnet:

$BIN/mpirun --mca pml ob1 --mca btl tcp,self --mca ptl_tcp_remote_connections 1 --mca btl_tcp_if_include '10.233.0.0/19' -np 4 --oversubscribe -H ib1n,ib2n ip addr show dev br0

Returns a set of bridges looking like:

9: br0: mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 94:de:80:ba:37:e4 brd ff:ff:ff:ff:ff:ff
    inet 141.76.49.17/26 brd 141.76.49.63 scope global br0
       valid_lft forever preferred_lft forever
    inet 10.233.0.82/19 scope global br0
       valid_lft forever preferred_lft forever
    inet6 2002:8d4c:3001:48:40de:80ff:feba:37e4/64 scope global deprecated mngtmpaddr dynamic
       valid_lft 59528sec preferred_lft 0sec
    inet6 fe80::96de:80ff:feba:37e4/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever

What is more puzzling is that if I attach with a debugger at opal/mca/pmix/pmix3x/pmix/src/mca/ptl/tcp/ptl_tcp_components.c around line 500 I see that mca_ptl_tcp_component.remote_connections is false. This means that the way I set up component parameters is ignored.

--
Regards,
Maksym Planeta
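For reference, whether a given address falls inside 10.233.0.0/19 can be checked with integer arithmetic in the shell. The following is a minimal sketch; `ip_to_int` and `in_subnet` are hypothetical helper names, and the addresses tested are the ones that appear in the message above.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 lies inside CIDR block $2 (e.g. 10.233.0.0/19).
in_subnet() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Network mask with the top $bits bits set, limited to 32 bits.
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
}

for a in 10.233.0.82 141.76.49.17 127.0.0.1; do
  if in_subnet "$a" 10.233.0.0/19; then
    echo "$a is inside 10.233.0.0/19"
  else
    echo "$a is outside 10.233.0.0/19"
  fi
done
```

A /19 prefix leaves 13 host bits, so the block spans 10.233.0.0 through 10.233.31.255, which includes the 10.233.0.82 address assigned to br0.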