Hi David,
Could you specify which version of OpenMPI you are using?
I also have some parallel I/O trouble with one code but have not
investigated it yet.
Thanks
Patrick
On 13/04/2020 at 17:11, Dong-In Kang via users wrote:
>
> Thank you for your suggestion.
> I am more concerned about the poor
I also have this problem on servers I'm benchmarking at Dell's lab with
OpenMPI-4.0.3. I tried a new build of OpenMPI with "--with-pmi2"; no
change.
Finally, my workaround in the Slurm script was to launch my code with
mpirun. As mpirun was only finding one slot per node, I have used
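For reference, a minimal sketch of such a Slurm script (the exact mpirun
option used is cut off in this excerpt; the node, task, and mapping values
below are assumed, not the actual settings):

  #!/bin/bash
  #SBATCH --nodes=2
  #SBATCH --ntasks-per-node=32

  # Tell mpirun explicitly how many ranks to start and how to map them,
  # instead of relying on the single slot per node it detects.
  mpirun -np 64 --map-by ppr:32:node ./my_code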
Hi OpenMPI maintainers,
I have temporary access to servers with AMD Epyc processors running RHEL7.
I'm trying to deploy OpenMPI with several setups, but each time "make
check" fails on *opal_path_nfs*. This test freezes forever, consuming no
CPU resources.
After nearly one hour I have killed the
of saying: make sure that you have no other Open
> MPI installation findable in your PATH / LD_LIBRARY_PATH and then try
> running `make check` again.
>
>
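A quick way to check that before re-running the test suite (a sketch; adjust
for your own install locations):

  which -a mpirun mpicc      # should list only the build you intend to test, or nothing
  echo "$PATH"
  echo "$LD_LIBRARY_PATH"    # should not point at another Open MPI's lib directory
  make check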
>> On Apr 21, 2020, at 2:37 PM, Patrick Bégou via users
>> <users@lists.open-mpi.org> wrote:
>>
>> Hi Ope
Hi all,
I've built OpenMPI 4.3.0 with GCC 9.3.0, but ucx was not available on the
server when I set --with-ucx. I removed this option and it compiles
fine without ucx. However, I see a strange behavior: when using mpirun
I must explicitly exclude ucx to avoid an error. In my module file I
have to
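The actual module-file line is cut off in this excerpt; a sketch of what such
a run-time exclusion typically looks like (an assumption, not necessarily the
exact setting used here):

  # Exclude the UCX PML at launch time (equivalent to "mpirun --mca pml ^ucx"):
  export OMPI_MCA_pml=^ucx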
On 08/05/2020 at 21:56, Prentice Bisbal via users wrote:
>
> We often get the following errors when more than one job runs on the
> same compute node. We are using Slurm with OpenMPI. The IB cards are
> QLogic using PSM:
>
> 10698ipath_userinit: assign_context command failed: Network is down
>
so the problem was not critical for my future work.
Patrick
>
> On Sun, 26 Apr 2020 at 18:09, Patrick Bégou via users
> <users@lists.open-mpi.org> wrote:
>
> I also have this problem on servers I'm benchmarking at Dell's lab with
> OpenMPI-4.0.3. I've tr
-and-i-o/fabric-products/OFED_Host_Software_UserGuide_G91902_06.pdf#page72>
>
> TS is the same hardware as the old QLogic QDR HCAs so the manual might
> be helpful to you in the future.
>
> Sent from my iPad
>
>> On May 9, 2020, at 9:52 AM, Patrick Bégou via users
>> wro
Hi Ha Chi,
Do you use a batch scheduler with Rocks Cluster, or do you log on to the
nodes with ssh?
If ssh, can you check that you can ssh from one node to the other
without a password?
Ping just says the network is alive, not that you can connect.
Patrick
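A quick check from one of the nodes (a sketch; the node name is hypothetical):

  ssh node2 hostname    # should print the remote hostname without asking for a password

  # If it prompts for a password, set up key-based authentication:
  ssh-keygen -t rsa     # accept the defaults
  ssh-copy-id node2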
On 04/06/2020 at 09:06, Hà Chi Nguyễn Nhật
> the mpirun --version output is identical on all 3 nodes (openmpi 2.1.1), and
> mpirun is in the same place when testing with "whereis mpirun".
> So is there any problem with mpirun causing it not to launch on the other
> nodes?
>
> Regards
> HaChi
>
> On Thu, 4 Jun 2020 at 14:35, Patric
but deeper in the code I think.
Patrick
On 04/12/2020 at 19:20, George Bosilca wrote:
> On Fri, Dec 4, 2020 at 2:33 AM Patrick Bégou via users
> <users@lists.open-mpi.org> wrote:
>
> Hi George and Gilles,
>
> Thanks George for your suggestion. Is it valua
refore be used to convert back into a
>> valid datatype pointer, until OMPI completely releases the datatype.
>> Look into the ompi_datatype_f_to_c_table table to see the datatypes
>> that exist and get their pointers, and then use these pointers as
>> arguments to ompi_datatype
Hi,
I'm trying to solve a memory leak that appeared with my new implementation of
communications based on MPI_Alltoallw and MPI_Type_create_subarray
calls. Arrays of subarray types are created/destroyed at each time step
and used for communications.
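As an illustration of the pattern described above, a minimal Fortran sketch of
the per-time-step create/commit/free cycle (array shapes and names are assumed;
this is not the actual application code):

  use mpi
  integer :: sizes(2), subsizes(2), starts(2)
  integer :: subtype, ierr

  sizes    = (/ 100, 100 /)   ! full local array extent (hypothetical)
  subsizes = (/  50,  50 /)   ! block exchanged with one neighbour
  starts   = (/   0,   0 /)   ! zero-based start indices, whatever the language

  ! Build and commit the subarray type passed to MPI_Alltoallw ...
  call MPI_Type_create_subarray(2, sizes, subsizes, starts, &
                                MPI_ORDER_FORTRAN, MPI_INTEGER, subtype, ierr)
  call MPI_Type_commit(subtype, ierr)

  ! ... exchange data with MPI_Alltoallw here ...

  ! ... and free it at the end of the time step so no handle stays alive.
  call MPI_Type_free(subtype, ierr)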
On my laptop the code runs fine (running for 15000
> Sharing a reproducer will be very much appreciated in order to improve ompio
>
> Cheers,
>
> Gilles
>
> On Thu, Dec 3, 2020 at 6:05 PM Patrick Bégou via users
> wrote:
>> Thanks Gilles,
>>
>> this is the solution.
>> I will set OMPI_MCA_io=^ompio aut
ntil OMPI completely releases the datatype.
>> Look into the ompi_datatype_f_to_c_table table to see the datatypes
>> that exist and get their pointers, and then use these pointers as
>> arguments to ompi_datatype_dump() to see if any of these existing
>> datatypes are the ones you d
efore be used to convert back
>>> into a valid datatype pointer, until OMPI completely releases the
>>> datatype. Look into the ompi_datatype_f_to_c_table table to see the
>>> datatypes that exist and get their pointers, and then use these
>>> pointers as arguments to ompi_da
; are used?
>
>
> mpirun --mca pml_base_verbose 10 --mca mtl_base_verbose 10 --mca
> btl_base_verbose 10 ...
>
> will point you to the component(s) used.
>
> The output is pretty verbose, so feel free to compress and post it if
> you cannot decipher it
>
>
> Cheers,
>
>
Hi,
I'm using an old (but required by the codes) version of hdf5 (1.8.12) in
parallel mode in 2 Fortran applications. It relies on MPI-IO. The
storage is NFS-mounted on the nodes of a small cluster.
With OpenMPI 1.7 it runs fine, but with modern OpenMPI 3.1 or 4.0.5 the
I/Os are 10x to 100x
>
>
> You can force romio with
>
> mpirun --mca io ^ompio ...
>
>
> Cheers,
>
>
> Gilles
>
> On 12/3/2020 4:20 PM, Patrick Bégou via users wrote:
>> Hi,
>>
>> I'm using an old (but required by the codes) version of hdf5 (1.8.12) in
>>
a try?
>
>
> Cheers,
>
>
> Gilles
>
>
>
> On 12/7/2020 6:15 PM, Patrick Bégou via users wrote:
>> Hi,
>>
>> I've written a small piece of code to show the problem. It is based on my
>> application, but 2D and using integer arrays for testing.
>
und.
> At first glance, I did not spot any issue in the current code.
> It turned out that the memory leak disappeared when doing things
> differently
>
> Cheers,
>
> Gilles
>
> On Mon, Dec 14, 2020 at 7:11 PM Patrick Bégou via users
> <users@lists.open-mpi.org>
Issue #8290 reported.
Thanks all for your help and the workaround provided.
Patrick
On 14/12/2020 at 17:40, Jeff Squyres (jsquyres) wrote:
> Yes, opening an issue would be great -- thanks!
>
>
>> On Dec 14, 2020, at 11:32 AM, Patrick Bégou via users
>> <users@lists.open-mpi.org>