Hi Jeff

As we say in French, "dans le mille!" (bull's-eye): you were right.
I'm not the admin of these servers, and I assumed that getting
"mpirun not found" was sufficient proof that no other Open MPI
installation was reachable. It wasn't.
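
A check along the following lines looks at the library search path as well as at mpirun. It is only a sketch: find_in_search_path is a hypothetical helper name, and libmpi.so* is an assumed file name pattern for the Open MPI shared library.

```shell
# Sketch only: find_in_search_path is a hypothetical helper, not an Open MPI tool.
# Prints each directory of a colon-separated search path ($1) that contains
# a file matching the glob in $2 (default, assumed here: libmpi.so*).
find_in_search_path() (
    pattern=${2:-libmpi.so*}
    IFS=:                             # split $1 on colons (local to this subshell)
    for dir in $1; do
        [ -n "$dir" ] || continue     # skip empty entries
        for candidate in "$dir"/$pattern; do
            if [ -e "$candidate" ]; then
                printf '%s\n' "$dir"
                break                 # one hit per directory is enough
            fi
        done
    done
)

# Possible usage:
#   find_in_search_path "$LD_LIBRARY_PATH"     # directories holding a libmpi.so*
#   find_in_search_path "$PATH" mpirun         # directories holding an mpirun
```

No output from either call means nothing was found on that particular search path; it says nothing about libraries in the system default locations.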

Since I had already deployed OpenMPI 4.0.2, I launched a new build after
setting my LD_LIBRARY_PATH so that the installed OpenMPI 4.0.2 libs are
found before all other locations, and all tests were successful.
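
Putting one directory in front of the others can be sketched as below. prepend_path is a hypothetical helper name, and the install prefix in the usage comment is a placeholder, not the path used on those servers.

```shell
# Sketch only: prepend_path is a hypothetical helper.
# Prints a colon-separated search path with directory $1 placed before
# the existing path $2. When $2 is empty, no trailing ":" is emitted,
# because an empty entry in LD_LIBRARY_PATH is treated as the current directory.
prepend_path() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Possible usage (placeholder prefix):
#   LD_LIBRARY_PATH=$(prepend_path "$HOME/openmpi-4.0.2/lib" "$LD_LIBRARY_PATH")
#   export LD_LIBRARY_PATH
```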

I think that this should be modified in the test script, as we usually
run "make check" before "make install". Properly setting LD_LIBRARY_PATH
so that the temporary directory where the libs are built is searched
first, before launching the test, would be enough to avoid this wrong
behavior.
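
As a rough illustration of that idea (not a patch to the actual test script): the sketch below assumes the freshly built libraries sit in libtool ".libs" directories under the build tree, and build_lib_path is a hypothetical helper name. Whether this alone would be enough for opal_path_nfs has not been verified.

```shell
# Sketch only: build_lib_path is a hypothetical helper.
# Collects every ".libs" directory found under the build tree $1 and
# prints them as one colon-separated list, in sorted order.
build_lib_path() {
    find "$1" -type d -name .libs | sort | paste -s -d: -
}

# Possible usage, from the top of an already built tree:
#   libs=$(build_lib_path .)
#   LD_LIBRARY_PATH="$libs${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" make check
```

Setting the variable only for the "make check" command leaves the login environment untouched.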

I did not wait for an hour in front of my keyboard :-D. It was lunch
time, and I suspected some timeout problem, since NFS means... network!

Thanks a lot for providing the solution so quickly.

Patrick

On 22/04/2020 at 20:17, Jeff Squyres (jsquyres) wrote:
> The test should only take a few moments; no need to let it sit for a
> full hour.
>
> I have seen this kind of behavior before if you have an Open MPI
> installation in your PATH / LD_LIBRARY_PATH already, and then you
> invoke "make check".
>
> Because the libraries may have the same name and/or .so version numbers,
> there may be confusion in the test setup scripts about exactly which
> libraries to use (the installed versions or the ones you just built /
> are trying to test).
>
> This is a long way of saying: make sure that you have no other Open
> MPI installation findable in your PATH / LD_LIBRARY_PATH and then try
> running `make check` again.
>
>
>> On Apr 21, 2020, at 2:37 PM, Patrick Bégou via users
>> <users@lists.open-mpi.org> wrote:
>>
>> Hi OpenMPI maintainers,
>>
>>
>> I have temporary access to servers with AMD Epyc processors running
>> RHEL7.
>>
>> I'm trying to deploy OpenMPI with several setups but each time "make
>> check" fails on *opal_path_nfs*. This test freezes forever, consuming
>> no CPU resources.
>>
>> After nearly one hour I killed the process.
>>
>> *_In test-suite.log I have:_*
>>
>> ====================================================================
>>    Open MPI v3.1.x-201810100324-c8e9819: test/util/test-suite.log
>> ====================================================================
>>
>> # TOTAL: 3
>> # PASS:  2
>> # SKIP:  0
>> # XFAIL: 0
>> # FAIL:  1
>> # XPASS: 0
>> # ERROR: 0
>>
>> .. contents:: :depth: 2
>>
>> FAIL: opal_path_nfs
>> ===================
>>
>> FAIL opal_path_nfs (exit status: 137)
>>
>>
>> _*In opal_path_nfs.out I have a list of paths:*_
>>
>> /proc proc
>> /sys sysfs
>> /dev devtmpfs
>> /run tmpfs
>> / xfs
>> /sys/kernel/security securityfs
>> /dev/shm tmpfs
>> /dev/pts devpts
>> /sys/fs/cgroup tmpfs
>> /sys/fs/cgroup/systemd cgroup
>> /sys/fs/pstore pstore
>> /sys/firmware/efi/efivars efivarfs
>> /sys/fs/cgroup/hugetlb cgroup
>> /sys/fs/cgroup/pids cgroup
>> /sys/fs/cgroup/net_cls,net_prio cgroup
>> /sys/fs/cgroup/devices cgroup
>> /sys/fs/cgroup/cpu,cpuacct cgroup
>> /sys/fs/cgroup/freezer cgroup
>> /sys/fs/cgroup/perf_event cgroup
>> /sys/fs/cgroup/cpuset cgroup
>> /sys/fs/cgroup/memory cgroup
>> /sys/fs/cgroup/blkio cgroup
>> /proc/sys/fs/binfmt_misc autofs
>> /sys/kernel/debug debugfs
>> /dev/hugepages hugetlbfs
>> /dev/mqueue mqueue
>> /sys/kernel/config configfs
>> /proc/sys/fs/binfmt_misc binfmt_misc
>> /boot/efi vfat
>> /local xfs
>> /var xfs
>> /tmp xfs
>> /var/lib/nfs/rpc_pipefs rpc_pipefs
>> /home nfs
>> /cm/shared nfs
>> /scratch nfs
>> /run/user/1013 tmpfs
>> /run/user/1010 tmpfs
>> /run/user/1046 tmpfs
>> /run/user/1015 tmpfs
>> /run/user/1121 tmpfs
>> /run/user/1113 tmpfs
>> /run/user/1126 tmpfs
>> /run/user/1002 tmpfs
>> /run/user/1130 tmpfs
>> /run/user/1004 tmpfs
>>
>> _*In opal_path_nfs.log:*_
>>
>> FAIL opal_path_nfs (exit status: 137)
>>
>>
>> The compiler is GCC9.2.
>>
>> I've also tested openmpi-4.0.3 built with gcc 8.2. Same problem.
>>
>> Thanks for your help.
>>
>> Patrick
>>
>>
>
>
> -- 
> Jeff Squyres
> jsquy...@cisco.com
>
