On Sun, 28 Jun 2020, Paolo Lampitella wrote:

> Dear PETSc users,
>
> I’ve been a happy PETSc user since version 3.3, using it both under Ubuntu
> (from 14.04 up to 20.04) and CentOS (from 5 to 8).
>
> I use it as an optional component for a parallel Fortran code (that, BTW,
> also uses metis) and, wherever allowed, I used to install myself MPI (both
> MPICH and OpenMPI) and PETSc on top of it without any trouble ever (besides
> being, myself, as dumb as one can be in this).
>
> I did this on top of gnu compilers and, less extensively, intel compilers,
> both on a range of different systems (from virtual machines, to workstations
> to actual clusters).
>
> So far so good.
>
> Today I find myself in the need of deploying my application to Windows 10
> users, which means giving them a folder with all the executables and
> libraries to make them run in it, including the mpi runtime. Unfortunately, I
> also have to rely on free tools (can’t afford Intel for the moment).
>
> To the best of my knowledge, considering also far from optimal solutions, my
> options would then be: Virtual machines and WSL1, Cygwin, MSYS2-MinGW64,
> Cross compiling with MinGW64 from within Linux, PGI + Visual Studio + Cygwin
> (not sure about this one)
>
> I know this is largely unsupported, but I was wondering if there is,
> nonetheless, some general (and more official) knowledge available on the
> matter. What I tried so far:
>
>
> 1. Virtual machines and WSL1: both work like a charm, just like in the
> native OS, but very far from ideal for the distribution purpose
>
>
> 1. Cygwin with gnu compilers (as opposed to using Intel and Visual
> Studio): I was unable to compile myself MPI as I am used to on Linux, so I
> just tried going all in and let PETSc do everything for me (using static
> linking): download and install MPICH, BLAS, LAPACK, METIS and HYPRE.
> Everything just worked (for now compiling and making trivial tests) and I am
> able to use everything from within a cygwin terminal (even with executables
> and dependencies outside cygwin). Still, even within cygwin, I can’t switch
> to use, say, the cygwin ompi mpirun/mpiexec for an mpi program compiled with
> PETSc mpich (things run but not as expected). Some troubles start when I try
> to use cmd.exe (which I pictured as the more natural way to launch in
> Windows). In particular, using (note that \ is in cmd.exe, / was used in
> cygwin terminal):

I don't understand. Why build with MPICH - but use mpiexec from OpenMPI?

If it is because you can easily redistribute OpenMPI - why not build PETSc
with OpenMPI?

You can't use Intel/MS-MPI from cygwin/gcc/gfortran.

Also - even though --download-mpich works with cygwin/gcc - it's no longer
supported on Windows [by the MPICH group].

>
> .\mpiexec.hydra.exe -np 8 .\my.exe
>
> Nothing happens unless I push Enter a second time. Things seem to work then,
> but if I try to run a serial executable with the command above I get the
> following errors (which, instead, doesn’t happen using the cygwin terminal):
>
> [proxy:0:0@Dell7540-Paolo] HYDU_sock_write (utils/sock/sock.c:286): write
> error (No such process)
> [proxy:0:0@Dell7540-Paolo] HYD_pmcd_pmip_control_cmd_cb
> (pm/pmiserv/pmip_cb.c:935): unable to write to downstream stdin
> [proxy:0:0@Dell7540-Paolo] HYDT_dmxu_poll_wait_for_event
> (tools/demux/demux_poll.c:76): callback returned error status
> [proxy:0:0@Dell7540-Paolo] main (pm/pmiserv/pmip.c:206): demux engine error
> waiting for event
> [mpiexec@Dell7540-Paolo] control_cb (pm/pmiserv/pmiserv_cb.c:200): assert
> (!closed) failed
> [mpiexec@Dell7540-Paolo] HYDT_dmxu_poll_wait_for_event
> (tools/demux/demux_poll.c:76): callback returned error status
> [mpiexec@Dell7540-Paolo] HYD_pmci_wait_for_completion
> (pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
> [mpiexec@Dell7540-Paolo] main (ui/mpich/mpiexec.c:336): process manager error
> waiting for completion
>
> Just for the sake of completeness, I also tried using the Intel and Microsoft
> MPI redistributables, which might be more natural candidates, instead of the
> petsc compiled version of the MPI runtime (and they are MPICH derivatives,
> after all).
> But, running with:
>
> mpiexec -np 1 my.exe
>
> I get the following error with Intel:
>
> [cli_0]: write_line error; fd=440 buf=:cmd=init pmi_version=1 pmi_subversion=1
> :
> system msg for write_line failure : Bad file descriptor
> [cli_0]: Unable to write to PMI_fd
> [cli_0]: write_line error; fd=440 buf=:cmd=get_appnum
> :
> system msg for write_line failure : Bad file descriptor
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(421).......: PMI_Get_appnum returned -1
> [cli_0]: aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(421).......: PMI_Get_appnum returned -1
>
> And the following error with MS-MPI:
>
> [unset]: unable to decode hostport from 44e5747b-d19e-4ea8-ac7a-ec2102cabb21
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(403).......: PMI_Init returned -1
> [unset]: aborting job:
> Fatal error in MPI_Init: Other MPI error, error stack:
> MPIR_Init_thread(467):
> MPID_Init(140).......: channel initialization failed
> MPID_Init(403).......: PMI_Init returned -1
>
> independently from the number of processes, but more processes produce more
> copies of this. However, both Intel and MS-MPI are able to run a serial
> fortran executable built with cygwin. I think I did everything correctly and
> adding -localhost didn’t help (actually, it caused more problems to the
> interpretation of the cmd line arguments for mpiexec)
>
>
> 1. Cygwin with MinGW64 compilers. Never managed to compile MPI, not even
> through PETSc.
>
>
>
> 1. MSYS2+MinGW64 compilers.
> I understood that MinGW is not well supported,
> probably because of how it handles paths, but I wanted to give it a try,
> because it should be more “native” and there seem to be relevant examples
> out there that managed to do it. I first tried with the msys2 mpi
> distribution, produced the .mod file out of the mpi.f90 file in the
> distribution (I tried my best with different hacks from known limitations of
> this file as also present in the official MS-MPI distribution) and tried with
> my code without petsc, but it failed in compiling the code with some strange
> MPI related error (argument mismatch between two unrelated MPI calls in the
> code, which is nonsense to me). In contrast, simple mpi tests (hello world
> like) worked as expected. Then I decided to follow this:
>
>
>
> https://doc.freefem.org/introduction/installation.html#compilation-on-windows
>
>
>
> but the exact same type of error came up (MPI calls in my code were
> different, but the error was the same). Trying again from scratch (i.e.,
> without all the things I did in the beginning to compile my code) the same
> error came up in compiling some of the freefem dependencies (this time not
> even mpi calls).
>
>
>
> As a side note, there seems to be an official effort in porting petsc to
> msys2 (https://github.com/okhlybov/MINGW-packages/tree/whpc/mingw-w64-petsc),
> but it didn’t get into the official packages yet, which I interpret as a
> warning
>
>
>
> 1. Didn’t give a try to cross compiling with MinGW from Linux, as I thought
> it couldn’t be any better than doing it from MSYS2
> 2. Didn’t try PGI as I actually didn’t know if I would then be able to
> make PETSc work.
>
> So, here there are some questions I have with respect to where I stand now
> and the points above:
>
>
> * I haven’t seen the MSYS2-MinGW64 toolchain mentioned at all in
> official documentation/discussions. Should I definitely abandon it (despite
> someone mentioning it as working) because of known issues?

I don't have experience with MSYS2-MinGW64. However, Pierre does - and
perhaps can comment on this. I don't know how things work on the Fortran
side.

> * What about the PGI route? I don’t see it mentioned as well. I guess
> it would require some work on win32fe

Again - no experience here.

> * For my Cygwin-GNU route (basically what is mentioned in PFLOTRAN
> documentation), am I expected to then run from the cygwin terminal or should
> the windows prompt work as well? Is the fact that I require a second Enter
> hit and the mismanagement of serial executables the sign of something wrong
> with the Windows prompt?

I would think the Cygwin-GNU route should work. I'll have to see if I can
reproduce the issues you have.

Satish

> * More generally, is there some known working, albeit non official,
> route given my constraints (free+fortran+windows+mpi+petsc)?
>
> Thanks for your attention and your great work on PETSc
>
> Best regards
>
> Paolo Lampitella
>
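
On the launcher question raised earlier in this reply (building with MPICH but launching with OpenMPI's mpiexec): one quick sanity check is to look at the version banner of whichever mpiexec is first in the PATH. The sketch below is only an illustration. The function name mpi_flavor is hypothetical, and the substrings it matches are assumptions about what typical banners contain, not a documented interface, so adjust them for your installation.

```shell
#!/bin/sh
# Classify an MPI launcher from the text of its version banner.
# ASSUMPTION: the substrings below are guesses at typical banner contents
# (e.g. Hydra-based MPICH printing "HYDRA build details"); they are not
# guaranteed by any of the MPI implementations.
mpi_flavor() {
    case "$1" in
        *"Intel(R) MPI"*)        echo intelmpi ;;
        *"Microsoft MPI"*)       echo msmpi ;;
        *"Open MPI"*|*OpenRTE*)  echo openmpi ;;
        *HYDRA*|*MPICH*)         echo mpich ;;
        *)                       echo unknown ;;
    esac
}
```

A typical use would be `mpi_flavor "$(mpiexec --version 2>&1)"`, run once from the cygwin terminal and once from an environment matching cmd.exe, to see whether both pick up the launcher that belongs to the MPI the executable was built with.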