Adam,

A couple of questions. First, is seccomp the reason you think I have the
MPI_THREAD_MULTIPLE error, or is it more for the vader error? If it's the
latter, the environment variable Nathan provided is probably enough. These
are unit tests and should execute in seconds at most (building them takes
10x-100x longer).

But if it can help with the MPI_THREAD_MULTIPLE error, can you help
translate that into "Fortran programmer who really can only do docker
build/run/push/cp" terms for me? I found this page:
https://docs.docker.com/engine/security/seccomp/ that I'm trying to read
through and understand, but I'm mainly learning that I should look into
some Docker training soon!
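
If it really is as simple as adding --cap-add=SYS_PTRACE to my docker run
line, or pointing --security-opt seccomp= at an edited copy of that default
profile, something like this (my guess, and the path is made up by me):

  docker run --cap-add=SYS_PTRACE ...
  docker run --security-opt seccomp=/path/to/edited-default.json ...

then I think even I can manage that, but I'd appreciate confirmation that
that's all there is to it.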

On Mon, Feb 24, 2020 at 8:24 PM Adam Simpson <asimp...@nvidia.com> wrote:

> Calls to process_vm_readv() and process_vm_writev() are disabled in the
> default Docker seccomp profile
> <https://github.com/moby/moby/blob/master/profiles/seccomp/default.json>.
> You can add the docker flag --cap-add=SYS_PTRACE or, better yet, modify the
> seccomp profile so that process_vm_readv and process_vm_writev are
> whitelisted by adding them to the syscalls.names list.
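>
> For reference, in a local copy of default.json that just means adding two
> entries to the big allow-list (a sketch, not the whole file):
>
>   "syscalls": [
>     {
>       "names": [
>         ...,
>         "process_vm_readv",
>         "process_vm_writev"
>       ],
>       "action": "SCMP_ACT_ALLOW"
>     }
>   ]
>
> and then starting the container with "docker run --security-opt
> seccomp=/path/to/your-copy.json ..." (the path is wherever you saved it).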
>
> You can also disable seccomp, and several other confinement and security
> features, if you prefer a heavy-handed approach:
>
> $ docker run --privileged --security-opt label=disable --security-opt
> seccomp=unconfined --security-opt apparmor=unconfined --ipc=host
> --network=host ...
>
> If you're still having trouble after fixing the above, you may need to
> check Yama on the host. You can check with "sysctl
> kernel.yama.ptrace_scope"; if it returns a value other than 0, you may
> need to disable it with "sysctl -w kernel.yama.ptrace_scope=0".
>
> Adam
>
> ------------------------------
> *From:* users <users-boun...@lists.open-mpi.org> on behalf of Matt
> Thompson via users <users@lists.open-mpi.org>
> *Sent:* Monday, February 24, 2020 5:15 PM
> *To:* Open MPI Users <users@lists.open-mpi.org>
> *Cc:* Matt Thompson <fort...@gmail.com>
> *Subject:* Re: [OMPI users] Help with One-Sided Communication: Works in
> Intel MPI, Fails in Open MPI
>
> Nathan,
>
> The reproducer would be that code that's on the Intel website. That is
> what I was running. You could pull my image if you like but...since you are
> the genius:
>
> [root@adac3ce0cf32 ~]# mpirun --mca btl_vader_single_copy_mechanism none
> -np 2 ./a.out
>
> Rank 0 running on adac3ce0cf32
> Rank 1 running on adac3ce0cf32
> Rank 0 sets data in the shared memory: 00 01 02 03
> Rank 1 sets data in the shared memory: 10 11 12 13
> Rank 0 gets data from the shared memory: 10 11 12 13
> Rank 0 has new data in the shared memory: 00 01 02 03
> Rank 1 gets data from the shared memory: 00 01 02 03
> Rank 1 has new data in the shared memory: 10 11 12 13
>
> And knowing this led to: https://github.com/open-mpi/ompi/issues/4948
>
> So, the good news is that setting export
> OMPI_MCA_btl_vader_single_copy_mechanism=none lets a lot of stuff work.
> The bad news is we seem to be using MPI_THREAD_MULTIPLE and it does not
> like it:
>
>     Start 2: pFIO_tests_mpi
>
> 2: Test command: /opt/openmpi-4.0.2/bin/mpiexec "-n" "18" "-oversubscribe"
> "/root/project/MAPL/build/bin/pfio_ctest_io.x" "-nc" "6" "-nsi" "6" "-nso"
> "6" "-ngo" "1" "-ngi" "1" "-v" "T,U" "-s" "mpi"
> 2: Test timeout computed to be: 1500
> 2:
> --------------------------------------------------------------------------
> 2: The OSC pt2pt component does not support MPI_THREAD_MULTIPLE in this
> release.
> 2: Workarounds are to run on a single node, or to use a system with an RDMA
> 2: capable network such as Infiniband.
> 2:
> --------------------------------------------------------------------------
> 2: [adac3ce0cf32:03619] *** An error occurred in MPI_Win_create
> 2: [adac3ce0cf32:03619] *** reported by process [270073857,16]
> 2: [adac3ce0cf32:03619] *** on communicator MPI COMMUNICATOR 4 DUP FROM 3
> 2: [adac3ce0cf32:03619] *** MPI_ERR_WIN: invalid window
> 2: [adac3ce0cf32:03619] *** MPI_ERRORS_ARE_FATAL (processes in this
> communicator will now abort,
> 2: [adac3ce0cf32:03619] ***    and potentially your MPI job)
> 2: [adac3ce0cf32:03587] 17 more processes have sent help message
> help-osc-pt2pt.txt / mpi-thread-multiple-not-supported
> 2: [adac3ce0cf32:03587] Set MCA parameter "orte_base_help_aggregate" to 0
> to see all help / error messages
> 2: [adac3ce0cf32:03587] 17 more processes have sent help message
> help-mpi-errors.txt / mpi_errors_are_fatal
> 2/5 Test #2: pFIO_tests_mpi ...................***Failed    0.18 sec
>
> 40% tests passed, 3 tests failed out of 5
>
> Total Test time (real) =   1.08 sec
>
> The following tests FAILED:
>           2 - pFIO_tests_mpi (Failed)
>           3 - pFIO_tests_simple (Failed)
>           4 - pFIO_tests_hybrid (Failed)
> Errors while running CTest
>
> The weird thing is, I *am* running on one node (it's all I have, I'm not
> fancy enough at AWS to try more yet) and ompi_info does mention
> MPI_THREAD_MULTIPLE:
>
> [root@adac3ce0cf32 build]# ompi_info | grep -i mult
>           Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support:
> yes, OMPI progress: no, ORTE progress: yes, Event lib: yes)
>
> Any ideas on this one?
>
> On Mon, Feb 24, 2020 at 7:24 PM Nathan Hjelm via users <
> users@lists.open-mpi.org> wrote:
>
> The error is from btl/vader. CMA is not functioning as expected. It might
> work if you set btl_vader_single_copy_mechanism=none
>
> Performance will suffer though. It would be worth understanding why
> process_vm_readv is failing.
>
> Can you send a simple reproducer?
>
> -Nathan
>
> On Feb 24, 2020, at 2:59 PM, Gabriel, Edgar via users <
> users@lists.open-mpi.org> wrote:
>
> 
>
> I am not an expert on the one-sided code in Open MPI, but I wanted to
> comment briefly on the potential MPI-IO related item. As far as I can see,
> the error message
>
>
>
> “Read -1, expected 48, errno = 1”
>
> does not stem from MPI I/O, at least not from the ompio library. What file
> system did you use for these tests?
>
>
>
> Thanks
>
> Edgar
>
>
>
> *From:* users <users-boun...@lists.open-mpi.org> *On Behalf Of *Matt
> Thompson via users
> *Sent:* Monday, February 24, 2020 1:20 PM
> *To:* users@lists.open-mpi.org
> *Cc:* Matt Thompson <fort...@gmail.com>
> *Subject:* [OMPI users] Help with One-Sided Communication: Works in Intel
> MPI, Fails in Open MPI
>
>
>
> All,
>
>
>
> My guess is this is an "I built Open MPI incorrectly" sort of issue, but
> I'm not sure how to fix it. Namely, I'm currently trying to get an MPI
> project's CI working on CircleCI using Open MPI to run some unit tests (on
> a single node, so I need some oversubscription). I can build everything
> just fine, but when I try to run, things just...blow up:
>
>
>
> [root@3796b115c961 build]# /opt/openmpi-4.0.2/bin/mpirun -np 18
> -oversubscribe /root/project/MAPL/build/bin/pfio_ctest_io.x -nc 6 -nsi 6
> -nso 6 -ngo 1 -ngi 1 -v T,U -s mpi
>  start app rank:           0
>  start app rank:           1
>  start app rank:           2
>  start app rank:           3
>  start app rank:           4
>  start app rank:           5
> [3796b115c961:03629] Read -1, expected 48, errno = 1
> [3796b115c961:03629] *** An error occurred in MPI_Get
> [3796b115c961:03629] *** reported by process [2144600065,12]
> [3796b115c961:03629] *** on win rdma window 5
> [3796b115c961:03629] *** MPI_ERR_OTHER: known error not in list
> [3796b115c961:03629] *** MPI_ERRORS_ARE_FATAL (processes in this win will
> now abort,
> [3796b115c961:03629] ***    and potentially your MPI job)
>
>
>
> I'm currently more concerned about the MPI_Get error, though I'm not sure
> what that "Read -1, expected 48, errno = 1" bit is about (MPI-IO error?).
> Now this code is fairly fancy MPI code, so I decided to try a simpler one.
> Searched the internet and found an example program here:
>
>
>
> https://software.intel.com/en-us/blogs/2014/08/06/one-sided-communication
>
>
>
> and when I build and run with Intel MPI it works:
>
>
>
> (1027)(master) $ mpirun -V
> Intel(R) MPI Library for Linux* OS, Version 2018 Update 4 Build 20180823
> (id: 18555)
> Copyright 2003-2018 Intel Corporation.
>
> (1028)(master) $ mpiicc rma_test.c
> (1029)(master) $ mpirun -np 2 ./a.out
> srun.slurm: cluster configuration lacks support for cpu binding
> Rank 0 running on borgj001
> Rank 1 running on borgj001
> Rank 0 sets data in the shared memory: 00 01 02 03
> Rank 1 sets data in the shared memory: 10 11 12 13
> Rank 0 gets data from the shared memory: 10 11 12 13
> Rank 1 gets data from the shared memory: 00 01 02 03
> Rank 0 has new data in the shared memory:Rank 1 has new data in the shared
> memory: 10 11 12 13
>  00 01 02 03
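>
> The code on that page is short: roughly, each rank exposes a small buffer
> in an MPI window and then MPI_Gets the other rank's copy. A stripped-down
> sketch of that pattern (my paraphrase, not the exact code from the page)
> looks like this:
>
> #include <mpi.h>
> #include <stdio.h>
>
> #define N 4
>
> int main(int argc, char **argv)
> {
>     int rank, size, i;
>     int mine[N], theirs[N];
>     MPI_Win win;
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>     /* Each rank fills its own buffer: rank 0 -> 0..3, rank 1 -> 10..13 */
>     for (i = 0; i < N; i++)
>         mine[i] = rank * 10 + i;
>
>     /* Expose the buffer to the other ranks through an RMA window */
>     MPI_Win_create(mine, N * sizeof(int), sizeof(int),
>                    MPI_INFO_NULL, MPI_COMM_WORLD, &win);
>
>     /* Read the neighbor's buffer with a one-sided get */
>     int peer = (rank + 1) % size;
>     MPI_Win_lock(MPI_LOCK_SHARED, peer, 0, win);
>     MPI_Get(theirs, N, MPI_INT, peer, 0, N, MPI_INT, win);
>     MPI_Win_unlock(peer, win);
>
>     printf("Rank %d gets data from rank %d:", rank, peer);
>     for (i = 0; i < N; i++)
>         printf(" %02d", theirs[i]);
>     printf("\n");
>
>     MPI_Win_free(&win);
>     MPI_Finalize();
>     return 0;
> }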
>
>
>
> So, I have some confidence it was written correctly. Now on the same
> system I try with Open MPI (building with gcc, not Intel C):
>
>
>
> (1032)(master) $ mpirun -V
> mpirun (Open MPI) 4.0.1
>
> Report bugs to http://www.open-mpi.org/community/help/
>
> (1033)(master) $ mpicc rma_test.c
> (1034)(master) $ mpirun -np 2 ./a.out
> Rank 0 running on borgj001
> Rank 1 running on borgj001
> Rank 0 sets data in the shared memory: 00 01 02 03
> Rank 1 sets data in the shared memory: 10 11 12 13
> [borgj001:22668] *** An error occurred in MPI_Get
> [borgj001:22668] *** reported by process [2514223105,1]
> [borgj001:22668] *** on win rdma window 3
> [borgj001:22668] *** MPI_ERR_RMA_RANGE: invalid RMA address range
> [borgj001:22668] *** MPI_ERRORS_ARE_FATAL (processes in this win will now
> abort,
> [borgj001:22668] ***    and potentially your MPI job)
> [borgj001:22642] 1 more process has sent help message help-mpi-errors.txt
> / mpi_errors_are_fatal
> [borgj001:22642] Set MCA parameter "orte_base_help_aggregate" to 0 to see
> all help / error messages
>
>
>
> This is a similar failure to the one above. Any idea what I might be doing
> wrong here? I don't doubt I'm missing something, but I'm not sure what.
> Open MPI was built pretty boringly:
>
>
>
> Configure command line: '--with-slurm' '--enable-shared'
> '--disable-wrapper-rpath' '--disable-wrapper-runpath'
> '--enable-mca-no-build=btl-usnic' '--prefix=...'
>
>
>
> And I'm not sure we still need those disable-wrapper bits, but long ago we
> needed them, and so they've lived on in "how to build" READMEs until
> something breaks. The btl-usnic exclusion is a bit unknown to me (this was
> built by sysadmins on a cluster), but this is pretty close to how I build
> on my desktop, and that build has the same issue.
>
>
>
> Any ideas from the experts?
>
>
>
> --
>
> Matt Thompson
>
>    “The fact is, this is about us identifying what we do best and
>
>    finding more ways of doing less of it better” -- Director of Better
> Anna Rampton
>
>
>
> --
> Matt Thompson
>    “The fact is, this is about us identifying what we do best and
>    finding more ways of doing less of it better” -- Director of Better
> Anna Rampton
>


-- 
Matt Thompson
   “The fact is, this is about us identifying what we do best and
   finding more ways of doing less of it better” -- Director of Better Anna
Rampton
