Hello whoever reads this,

I am running my code with CUDA-aware OpenMPI (see the attached ompi_info --all output). First I will explain the problem; further down I give additional information about versions, hardware, and debugging.
The problem: My application solves multiple mathematical equations on the GPU via CUDA. Multi-GPU capability is enabled via CUDA-aware OpenMPI: I send many chunks of data from sections of one simulation domain to a simple halo buffer on neighbouring processes, i.e. between the domain partitions mapped to each MPI process. For all kinds of cases there are no issues at all, so the basic installation seems fine, and there appears to be no bug, invalid memory access, or similar problem in the code (which I also verified for the failing case by hand and with debuggers).

The error I get is a SEGFAULT when issuing many MPI_Isend and MPI_Irecv operations (code below) followed directly by an MPI_Waitall. The interesting part is that all accessed data in the device array is allocated and initialized (which I verified). The crash happens at random operations in the loop, but always inside MPI_Isend. I should also note that in the failing case each process handles about 765 requests per halo exchange and Waitall, and the problem only appears starting from 6 processes (and GPUs), i.e. about 4590 requests in total on the node. It fails during the first execution of the routine, at a random point in the loop.

The simplified code looks like this:

=====================================
MPI_Request Requ[req_count];
MPI_Status  Stat[req_count];

LoopAllTasks {
    ... redefine source, destination, tags and request numbers

    // Receive halos backward
    MPI_Irecv(devPtr + recv_config.shift_right, recv_config.halo_right, MPI_FLOAT,
              src, tag, MPI_COMM_WORLD_, &Requ[req1]);
    // Send halo forwards
    MPI_Isend(devPtr + send_config.shift_right, send_config.halo_right, MPI_FLOAT,
              dst, tag, MPI_COMM_WORLD_, &Requ[req2]);

    ... redefine source, destination, tags and request numbers

    // Receive halos forward
    MPI_Irecv(devPtr + recv_config.shift_left, recv_config.halo_left, MPI_FLOAT,
              src, tag, MPI_COMM_WORLD_, &Requ[req3]);
    // Send halo backwards
    MPI_Isend(devPtr + send_config.shift_left, send_config.halo_left, MPI_FLOAT,
              dst, tag, MPI_COMM_WORLD_, &Requ[req4]);
}
MPI_Waitall(req_count, Requ, Stat);
=====================================

What I have found so far: Attached are outputs of valgrind, gdb and ompi_info --all (sorry for the bad quality, I had to optimize the picture to fit into this mail).

It does not seem to have anything to do with the request array or the allocated device memory, as all accesses are in range. In the real code I also use wrappers around MPI_Isend and MPI_Irecv, so any object passed in is either a copy or a valid pointer.

Valgrind (see attached file) reports an "invalid write of size 8", although I only operate on floats, and it only happens in MPI_Isend, never in MPI_Irecv. From gdb and cuda-gdb I could determine that it happens for small messages of 3-7 bytes, which are perfectly in range of the allocated global memory.

I could not find any similar issue reported, nor any documentation on internal limitations and how to change them. Simply halving the number of requests by using blocking MPI_Send instead of MPI_Isend fixes the issue, but I would like to understand the underlying behaviour, so this is not an acceptable solution.

The problem occurs both on a CentOS cluster (single node) with OpenMPI 3.0.0 + CUDA 11.0 and 6-8 dedicated GPUs (all GTX 1080 Ti), and on an Ubuntu 20.04 machine with OpenMPI 4.1.0 and 4.1.2 + CUDA 11.2 and 11.4 with 2 dedicated GPUs (RTX 2080) as well as with oversubscribed GPUs.
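For reference, here is a minimal, self-contained sketch of the exchange pattern in plain C (not my real code): a single ring exchange between neighbouring ranks, with device pointers passed directly to MPI as above. The field size N and halo width HALO are hypothetical placeholders; the real code issues roughly 765 such requests per rank before the Waitall.

=====================================
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

#define N    (1 << 20)   /* floats per rank (hypothetical) */
#define HALO 4           /* halo width in floats (hypothetical) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int right = (rank + 1) % size;
    int left  = (rank - 1 + size) % size;

    /* Device allocation handed directly to MPI (CUDA-aware OpenMPI). */
    float *devPtr = NULL;
    cudaMalloc((void **)&devPtr, N * sizeof(float));
    cudaMemset(devPtr, 0, N * sizeof(float));

    MPI_Request requ[4];
    MPI_Status  stat[4];

    /* Receive halo from the right neighbour, send own boundary to the right. */
    MPI_Irecv(devPtr + N - HALO,     HALO, MPI_FLOAT, right, 0, MPI_COMM_WORLD, &requ[0]);
    MPI_Isend(devPtr + N - 2 * HALO, HALO, MPI_FLOAT, right, 1, MPI_COMM_WORLD, &requ[1]);

    /* Receive halo from the left neighbour, send own boundary to the left. */
    MPI_Irecv(devPtr,        HALO, MPI_FLOAT, left, 1, MPI_COMM_WORLD, &requ[2]);
    MPI_Isend(devPtr + HALO, HALO, MPI_FLOAT, left, 0, MPI_COMM_WORLD, &requ[3]);

    MPI_Waitall(4, requ, stat);

    if (rank == 0) printf("halo exchange completed\n");

    cudaFree(devPtr);
    MPI_Finalize();
    return 0;
}
=====================================

A small case like this runs fine on my machines; as described above, the failure only shows up once the number of outstanding nonblocking requests becomes large.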
I would be glad about any input, thanks a lot!

Alex

DI Alexander Stadik
Software Developer R&D Large Scale Solutions
ESS Engineering Software Steyr GmbH
<<attachment: ompi_info.zip>>
<<attachment: ValgrindError.zip>>
<<attachment: GDB-Output.zip>>