Gabriel,

Awesome, good luck.

I have no idea which are or are not necessary for a properly functioning
daemon. To me, all of the ones you have here seem critical. Ralph would be a
better source of information regarding the daemons' requirements.

Thanks,
  George.


On Tue, Mar 9, 2021 at 10:25 AM Gabriel Tanase <gabrieltan...@gmail.com>
wrote:

> George,
> I started to dig more into option 2 as you described it. I believe I can
> make that work.
> For example I created this fake ssh :
>
> $ cat ~/bin/ssh
> #!/bin/bash
> fname=env.$$
> echo ">>>>>>>>>>>>> ssh" >> $fname
> env >>$fname
> echo ">>>>>>>>>>>>>>>>>>>>>>>>>>>" >>$fname
> echo $@ >>$fname
>
> And it records all the args that the remote process will receive:
>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> -x 10.0.35.43 orted -mca ess "env" -mca ess_base_jobid "2752512000" -mca
> ess_base_vpid 1 -mca ess_base_num_procs "3" -mca orte_node_regex
> "ip-[2:10]-0-16-120,[2:10].0.35.43,[2:10].0.35.42@0(3)" -mca orte_hnp_uri
> "2752512000.0;tcp://10.0.16.120:44789" -mca plm "rsh" --tree-spawn -mca
> routed "radix" -mca orte_parent_uri "2752512000.0;tcp://10.0.16.120:44789"
> -mca rmaps_base_mapping_policy "node" -mca pmix "^s1,s2,cray,isolated"
>
> Now I am thinking that I probably don't even need to create all those
> Open MPI env variables, as I am hoping the orted that is started remotely
> will start the final executable with the right env set. Does this sound
> right?
>
> Thx,
> --Gabriel
>
>
> On Fri, Mar 5, 2021 at 3:15 PM George Bosilca <bosi...@icl.utk.edu> wrote:
>
>> Gabriel,
>>
>> You should be able to. Here are a few different ways of doing this.
>>
>> 1. Purely MPI. Start singletons (or smaller groups), and connect them via
>> sockets using MPI_Comm_join. You can set up your own DNS-like service, with
>> the goal of having the independent MPI jobs leave a trace there, such that
>> they can find each other and create the initial socket (a sketch follows
>> below, after option 3).
>>
>> 2. You could replace ssh/rsh with a no-op script (that returns success
>> such that the mpirun process thinks it successfully started the processes),
>> and then handcraft the environment as you did for GASNet.
>>
>> 3. We have support for a DVM (Distributed Virtual Machine), which basically
>> creates an independent service that different mpirun instances can connect
>> to in order to retrieve information. Each mpirun runs against this DVM as a
>> singleton, and you fall back to MPI_Comm_connect/accept to recreate an MPI
>> world.
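>>
>> For option 1, here is a minimal sketch of the MPI_Comm_join part. The
>> "DNS-like" lookup that tells the two singletons where to find each other is
>> assumed to happen out of band; the server/client roles, the port argument,
>> and the file name join.c are just illustration, and error handling is
>> omitted:
>>
>> /* join.c - two independently started MPI singletons join over a TCP
>>  * socket and end up in one intra-communicator. */
>> #include <mpi.h>
>> #include <arpa/inet.h>
>> #include <netinet/in.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <sys/socket.h>
>> #include <unistd.h>
>>
>> int main(int argc, char **argv)
>> {
>>     /* usage: ./join server <port>   or   ./join client <host> <port> */
>>     MPI_Init(&argc, &argv);   /* started without mpirun: singleton init */
>>
>>     int fd;
>>     if (strcmp(argv[1], "server") == 0) {
>>         int lfd = socket(AF_INET, SOCK_STREAM, 0);
>>         struct sockaddr_in addr = { 0 };
>>         addr.sin_family = AF_INET;
>>         addr.sin_port = htons(atoi(argv[2]));
>>         addr.sin_addr.s_addr = htonl(INADDR_ANY);
>>         bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
>>         listen(lfd, 1);
>>         fd = accept(lfd, NULL, NULL);   /* wait for the other singleton */
>>         close(lfd);
>>     } else {
>>         fd = socket(AF_INET, SOCK_STREAM, 0);
>>         struct sockaddr_in addr = { 0 };
>>         addr.sin_family = AF_INET;
>>         addr.sin_port = htons(atoi(argv[3]));
>>         inet_pton(AF_INET, argv[2], &addr.sin_addr);
>>         connect(fd, (struct sockaddr *)&addr, sizeof(addr));
>>     }
>>
>>     /* Turn the connected socket into an inter-communicator, then merge
>>      * the two jobs into a single intra-communicator. */
>>     MPI_Comm inter, world;
>>     MPI_Comm_join(fd, &inter);
>>     MPI_Intercomm_merge(inter, strcmp(argv[1], "client") == 0, &world);
>>     close(fd);   /* the socket is not needed once the join completes */
>>
>>     int rank, size;
>>     MPI_Comm_rank(world, &rank);
>>     MPI_Comm_size(world, &size);
>>     printf("joined: rank %d of %d\n", rank, size);
>>
>>     MPI_Comm_free(&world);
>>     MPI_Comm_disconnect(&inter);
>>     MPI_Finalize();
>>     return 0;
>> }
>>
>> With more than two singletons you would typically switch to
>> MPI_Comm_accept/MPI_Comm_connect (as in option 3) to keep growing the
>> communicator.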
>>
>> Good luck,
>>   George.
>>
>>
>> On Fri, Mar 5, 2021 at 2:08 PM Ralph Castain via devel <
>> devel@lists.open-mpi.org> wrote:
>>
>>> I'm afraid that won't work - there is no way for the job to
>>> "self-assemble". One could create a way to do it, but it would take some
>>> significant coding in the guts of OMPI to get there.
>>>
>>>
>>> On Mar 5, 2021, at 9:40 AM, Gabriel Tanase via devel <
>>> devel@lists.open-mpi.org> wrote:
>>>
>>> Hi all,
>>> I decided to use MPI as the messaging layer for a multihost database.
>>> However, within my org I faced very strong opposition to allowing
>>> passwordless ssh or rsh. For security reasons we want to minimize the
>>> opportunities to execute arbitrary code on the db clusters. I don't want
>>> to run other things like Slurm, etc.
>>>
>>> My question would be: Is there a way to start an MPI application by
>>> running certain binaries on each host? E.g., if my executable is "myapp",
>>> can I start a server (orted???) on host zero and then start myapp on each
>>> host with the right env variables set (for specifying the rank, num ranks,
>>> etc.)?
>>>
>>> For example, when using another messaging API (GASNet) I was able to
>>> start a server on host zero and then manually start the application binary
>>> on each host (with some environment variables properly set) and all was
>>> good.
>>>
>>> I tried to reverse engineer a little the env variables used by mpirun
>>> (mpirun -np 2 env) and then copied these env variables into a shell script
>>> that sets them before invoking my hello world executable, but I got an
>>> error message implying a server is not present:
>>>
>>> PMIx_Init failed for the following reason:
>>>
>>>   NOT-SUPPORTED
>>>
>>> Open MPI requires access to a local PMIx server to execute. Please ensure
>>> that either you are operating in a PMIx-enabled environment, or use
>>> "mpirun"
>>> to execute the job.
>>>
>>> Here is the shell script for host0:
>>>
>>> $ cat env1.sh
>>> #!/bin/bash
>>>
>>> export OMPI_COMM_WORLD_RANK=0
>>> export PMIX_NAMESPACE=mpirun-38f9d3525c2c-53291@1
>>> export PRTE_MCA_prte_base_help_aggregate=0
>>> export TERM_PROGRAM=Apple_Terminal
>>> export OMPI_MCA_num_procs=2
>>> export TERM=xterm-256color
>>> export SHELL=/bin/bash
>>> export PMIX_VERSION=4.1.0a1
>>> export OPAL_USER_PARAMS_GIVEN=1
>>> export TMPDIR=/var/folders/_k/c4_xr5vd14j97fw7j8vzmd45_9hjbq/T/
>>> export Apple_PubSub_Socket_Render=/private/tmp/com.apple.launchd.HCXmdRI1WL/Render
>>> export PMIX_SERVER_URI41="mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093"
>>> export TERM_PROGRAM_VERSION=421.2
>>> export PMIX_RANK=0
>>> export TERM_SESSION_ID=18212D82-DEB2-4AE8-A271-FB47AC71337B
>>> export OMPI_COMM_WORLD_LOCAL_RANK=0
>>> export OMPI_ARGV=
>>> export OMPI_MCA_initial_wdir=/Users/igtanase/ompi
>>> export USER=igtanase
>>> export OMPI_UNIVERSE_SIZE=2
>>> export SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.PhcplcX3pC/Listeners
>>> export OMPI_COMMAND=./exe
>>> export __CF_USER_TEXT_ENCODING=0x54984577:0x0:0x0
>>> export OMPI_FILE_LOCATION=/var/folders/_k/c4_xr5vd14j97fw7j8vzmd45_9hjbq/T//prte.38f9d3525c2c.1419265399/dvm.53291/1/0
>>> export PMIX_SERVER_URI21="mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093"
>>> export PATH=/Users/igtanase/ompi/bin/:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
>>> export OMPI_COMM_WORLD_LOCAL_SIZE=2
>>> export PRTE_MCA_pmix_session_server=1
>>> export PWD=/Users/igtanase/ompi
>>> export OMPI_COMM_WORLD_SIZE=2
>>> export OMPI_WORLD_SIZE=2
>>> export LANG=en_US.UTF-8
>>> export XPC_FLAGS=0x0
>>> export PMIX_GDS_MODULE=hash
>>> export XPC_SERVICE_NAME=0
>>> export HOME=/Users/igtanase
>>> export SHLVL=2
>>> export PMIX_SECURITY_MODE=native
>>> export PMIX_HOSTNAME=38f9d3525c2c
>>> export LOGNAME=igtanase
>>> export OMPI_WORLD_LOCAL_SIZE=2
>>> export PMIX_BFROP_BUFFER_TYPE=PMIX_BFROP_BUFFER_NON_DESC
>>> export PRTE_LAUNCHED=1
>>> export PMIX_SERVER_TMPDIR=/var/folders/_k/c4_xr5vd14j97fw7j8vzmd45_9hjbq/T//prte.38f9d3525c2c.1419265399/dvm.53291
>>> export OMPI_COMM_WORLD_NODE_RANK=0
>>> export OMPI_MCA_cpu_type=x86_64
>>> export PMIX_SYSTEM_TMPDIR=/var/folders/_k/c4_xr5vd14j97fw7j8vzmd45_9hjbq/T/
>>> export PMIX_SERVER_URI4="mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093"
>>> export OMPI_NUM_APP_CTX=1
>>> export SECURITYSESSIONID=186a9
>>> export PMIX_SERVER_URI3="mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093"
>>> export PMIX_SERVER_URI2="mpirun-38f9d3525c2c-53291@0.0;tcp4://192.168.0.180:52093"
>>> export _=/usr/bin/env
>>>
>>> ./exe
>>>
>>> Thx for your help,
>>> --Gabriel
>>>
>>>
>>>
